SpecDecSelect

SpecDecSelect is a lightweight Python CLI for benchmarking speculative decoding draft model candidates against a target Hugging Face decoder-only causal LM.

Status

This is a v1 local benchmarking tool for single-GPU workflows. It uses greedy agreement as a proxy for speculative decoding acceptance and saves each run into a timestamped directory under results/.
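As a rough illustration of the greedy-agreement proxy, the sketch below counts how many leading draft tokens match the target model's greedy choices for each proposed block. The function names are hypothetical and the prefix-matching definition is an assumption; the metric SpecDecSelect actually computes may differ in detail.

```python
from typing import Sequence


def accepted_prefix_length(draft_tokens: Sequence[int], target_tokens: Sequence[int]) -> int:
    """Count leading draft tokens that equal the target's greedy tokens.

    Counting stops at the first mismatch, mirroring how a speculative
    decoding step discards everything after the first rejected token.
    """
    accepted = 0
    for draft_tok, target_tok in zip(draft_tokens, target_tokens):
        if draft_tok != target_tok:
            break
        accepted += 1
    return accepted


def greedy_agreement_rate(
    draft_blocks: Sequence[Sequence[int]],
    target_blocks: Sequence[Sequence[int]],
) -> float:
    """Fraction of proposed draft tokens that fall inside an agreeing prefix."""
    proposed = sum(len(block) for block in draft_blocks)
    if proposed == 0:
        return 0.0
    accepted = sum(
        accepted_prefix_length(draft, target)
        for draft, target in zip(draft_blocks, target_blocks)
    )
    return accepted / proposed
```

Here each block stands for one round of draft proposals, paired with the tokens the target would have produced greedily at the same positions.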

Installation

Minimal setup (editable install):

git clone https://github.com/pathupally/SpecDecSelect
cd SpecDecSelect
python3.13 -m venv .venv
source .venv/bin/activate
pip install -e .
cd ..

Optional extras (run from the repository root):

pip install -e ".[dev]"

Base install: package, CLI, bundled prompt sets, registry data, and runtime/reporting dependencies (torch, transformers, matplotlib, pandas).
dev: local build/test tooling.

Supported Python versions are 3.10 through 3.13. Python 3.14 is not declared yet because the upstream ML stack is not reliably available there.

Commands

specdecselect recommend --target Qwen/Qwen2.5-1.5B-Instruct
specdecselect benchmark --target modelA --draft modelB
specdecselect report results/run-id

Examples

specdecselect recommend \
  --target Qwen/Qwen2.5-1.5B-Instruct \
  --device auto \
  --top-k 5

specdecselect recommend \
  --target meta-llama/Llama-3.2-3B-Instruct \
  --include-speculators

specdecselect benchmark \
  --target Qwen/Qwen2.5-1.5B-Instruct \
  --draft Qwen/Qwen2.5-0.5B-Instruct \
  --gamma 1 2 4 6 8
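To give a feel for why sweeping several gamma values is useful, the sketch below assumes that --gamma is the number of draft tokens proposed per verification step (the usual meaning in the speculative decoding literature) and that each draft token is accepted independently with probability alpha. Under those assumptions the expected number of tokens produced per target forward pass is (1 - alpha^(gamma + 1)) / (1 - alpha). The function name is hypothetical and is not part of the CLI.

```python
def expected_tokens_per_step(alpha: float, gamma: int) -> float:
    """Expected tokens emitted per target forward pass.

    Assumes i.i.d. per-token acceptance probability `alpha` and `gamma`
    draft tokens per step; this is a textbook idealization, not a value
    reported by SpecDecSelect.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    if alpha == 1.0:
        # Every draft token is accepted, plus one token from the target.
        return float(gamma + 1)
    return (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)


if __name__ == "__main__":
    # Illustrative sweep over the same gamma values as the example command.
    for gamma in (1, 2, 4, 6, 8):
        print(gamma, round(expected_tokens_per_step(0.8, gamma), 3))
```

The returns from a larger gamma flatten out as alpha drops, which is one reason to measure agreement at several draft lengths rather than a single one.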

Notes

v1 targets single-GPU local benchmarking.
Device selection supports auto, cuda, mps, and cpu (auto prefers cuda, then mps, then cpu).
Benchmarks use greedy agreement as a speculative decoding proxy.
recommend benchmarks all bundled prompt sets by default: chat, code, and math.
Default candidate discovery uses dense draft registries in specdecselect/data/candidate_registry.yaml.
--include-speculators additionally merges specialized draft checkpoints from specdecselect/data/speculator_registry.yaml.
Specialized speculator checkpoints (EAGLE/MLP style) are cataloged and reported, but are rejected for benchmarking under the current HF causal-LM backend path.
report regenerates summaries and plots from a saved results/<run_id> directory.
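The device preference order described in the notes above (auto prefers cuda, then mps, then cpu) can be sketched as a small pure function. The function name and the explicit availability flags are assumptions made for illustration; the tool's internal implementation is not shown in this README.

```python
SUPPORTED_DEVICES = ("auto", "cuda", "mps", "cpu")


def resolve_device(requested: str, cuda_available: bool, mps_available: bool) -> str:
    """Resolve a --device value to a concrete device name.

    Explicit choices are returned unchanged; "auto" falls back in the
    order cuda, then mps, then cpu.
    """
    if requested not in SUPPORTED_DEVICES:
        raise ValueError(f"unsupported device: {requested!r}")
    if requested != "auto":
        return requested
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```

With PyTorch installed, the two flags would typically come from torch.cuda.is_available() and torch.backends.mps.is_available().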
