SpecDecSelect is a lightweight Python CLI for benchmarking speculative decoding draft model candidates against a target Hugging Face decoder-only causal LM.
This is a v1 local benchmarking tool for single-GPU workflows. It uses greedy agreement as a proxy for speculative decoding acceptance and saves each run into a timestamped directory under results/.
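To make the "greedy agreement" proxy concrete, here is a minimal sketch of the idea. It assumes each model can be treated as a function from a token context to its greedy next token. The names `accepted_prefix_length`, `greedy_agreement`, `toy_target`, and `toy_draft` are hypothetical and are not taken from the SpecDecSelect codebase; the tool's actual metric may be computed differently.

```python
from typing import Callable, Sequence

# Hypothetical interface: a "model" maps a token context to its greedy next token id.
NextToken = Callable[[Sequence[int]], int]


def accepted_prefix_length(
    target: NextToken, draft: NextToken, context: Sequence[int], gamma: int
) -> int:
    """Count how many of the draft's gamma greedy tokens the target would also pick."""
    ctx = list(context)
    accepted = 0
    for _ in range(gamma):
        proposal = draft(ctx)
        if target(ctx) != proposal:
            break  # first disagreement ends the accepted prefix
        ctx.append(proposal)
        accepted += 1
    return accepted


def greedy_agreement(
    target: NextToken, draft: NextToken, contexts: Sequence[Sequence[int]], gamma: int
) -> float:
    """Mean fraction of drafted tokens accepted across all contexts."""
    if not contexts or gamma <= 0:
        return 0.0
    total = sum(accepted_prefix_length(target, draft, c, gamma) for c in contexts)
    return total / (gamma * len(contexts))


# Toy stand-ins for real models (assumption, for illustration only).
def toy_target(ctx: Sequence[int]) -> int:
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: Sequence[int]) -> int:
    # Agrees with the target except right after token 3.
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10
```

Under this sketch, a larger gamma gives the draft more chances to diverge, so the agreement rate for a fixed pair of models can only stay flat or fall as gamma grows.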
Minimal setup (editable install):
git clone https://github.com/pathupally/SpecDecSelect
cd SpecDecSelect
python3.13 -m venv .venv
source .venv/bin/activate
pip install -e .
cd ..
Optional extras:
pip install -e ".[dev]"
- Base install: package, CLI, bundled prompt sets, registry data, and runtime/reporting dependencies (torch, transformers, matplotlib, pandas).
- dev: local build/test tooling.
Supported Python versions are 3.10 through 3.13. Python 3.14 is not declared yet because the upstream ML stack is not reliably available there.
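If you want to check an interpreter against this range before installing, a small sketch like the following would do. The function name and constants are hypothetical and are not part of the package.

```python
import sys

# Supported range as stated above: 3.10 through 3.13 inclusive.
MIN_SUPPORTED = (3, 10)
MAX_SUPPORTED = (3, 13)


def python_is_supported(version=None) -> bool:
    """Return True if the (major, minor) version falls in the supported range."""
    info = version if version is not None else sys.version_info
    major_minor = tuple(info[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED
```

Calling `python_is_supported()` with no argument checks the running interpreter.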
specdecselect recommend --target Qwen/Qwen2.5-1.5B-Instruct
specdecselect benchmark --target modelA --draft modelB
specdecselect report results/run-id

specdecselect recommend \
  --target Qwen/Qwen2.5-1.5B-Instruct \
  --device auto \
  --top-k 5

specdecselect recommend \
  --target meta-llama/Llama-3.2-3B-Instruct \
  --include-speculators

specdecselect benchmark \
  --target Qwen/Qwen2.5-1.5B-Instruct \
  --draft Qwen/Qwen2.5-0.5B-Instruct \
  --gamma 1 2 4 6 8

- v1 targets single-GPU local benchmarking.
- Device selection supports auto, cuda, mps, and cpu (auto prefers cuda, then mps, then cpu).
- Benchmarks use greedy agreement as a speculative decoding proxy.
- recommend benchmarks all bundled prompt sets by default: chat, code, and math.
- Default candidate discovery uses dense draft registries in specdecselect/data/candidate_registry.yaml.
- --include-speculators additionally merges specialized draft checkpoints from specdecselect/data/speculator_registry.yaml.
- Specialized speculator checkpoints (EAGLE/MLP style) are cataloged and reported, but are rejected for benchmarking under the current HF causal-LM backend path.
- report regenerates summaries and plots from a saved results/<run_id> directory.
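The device preference order noted above (auto prefers cuda, then mps, then cpu) can be sketched as follows. This is an illustration only: `resolve_device` is a hypothetical name, and the availability flags are passed in explicitly so the logic can be read and tested without a GPU. The tool's own implementation may differ.

```python
VALID_DEVICES = ("auto", "cuda", "mps", "cpu")


def resolve_device(requested: str, cuda_available: bool, mps_available: bool) -> str:
    """Map a --device value to a concrete device name."""
    if requested not in VALID_DEVICES:
        raise ValueError(f"unsupported device: {requested!r}")
    if requested != "auto":
        return requested  # an explicit choice is passed through unchanged
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```

With PyTorch installed, the two flags could plausibly come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.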