pathupally/SpecDecSelect

SpecDecSelect

SpecDecSelect is a lightweight Python CLI for benchmarking speculative decoding draft model candidates against a target Hugging Face decoder-only causal LM.

Status

This is a v1 local benchmarking tool for single-GPU workflows. It uses greedy agreement as a proxy for speculative decoding acceptance and saves each run into a timestamped directory under results/.
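To make the "greedy agreement" proxy concrete, here is a minimal sketch that assumes agreement means the fraction of positions where the draft model's greedy (argmax) token equals the target model's greedy token for the same prefix. The function name `greedy_agreement` and the token ids are hypothetical and are not taken from the SpecDecSelect code base.

```python
from typing import Sequence


def greedy_agreement(target_tokens: Sequence[int], draft_tokens: Sequence[int]) -> float:
    """Return the fraction of positions where draft and target greedy tokens match.

    Assumption: both sequences hold the argmax token chosen at each position
    when both models are fed the same (target-generated) prefix.
    """
    if len(target_tokens) != len(draft_tokens):
        raise ValueError("token sequences must have the same length")
    if not target_tokens:
        return 0.0
    matches = sum(t == d for t, d in zip(target_tokens, draft_tokens))
    return matches / len(target_tokens)


# Made-up token ids, purely for illustration.
target = [12, 7, 7, 93, 4]
draft = [12, 7, 8, 93, 5]
rate = greedy_agreement(target, draft)  # 3 of 5 positions agree
print(f"agreement: {rate:.2f}")
```

A higher agreement rate suggests that more drafted tokens would be accepted by the target, which is why it can stand in for a full speculative decoding run.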

Installation

Minimal setup (editable install):

git clone https://github.com/pathupally/SpecDecSelect
cd SpecDecSelect
python3.13 -m venv .venv
source .venv/bin/activate
pip install -e .
cd ..

Optional extras:

pip install -e ".[dev]"
  • Base install: package, CLI, bundled prompt sets, registry data, and runtime/reporting dependencies (torch, transformers, matplotlib, pandas).
  • dev: local build/test tooling.

Supported Python versions are 3.10 through 3.13. Python 3.14 is not declared yet because the upstream ML stack is not reliably available there.

Commands

specdecselect recommend --target Qwen/Qwen2.5-1.5B-Instruct
specdecselect benchmark --target modelA --draft modelB
specdecselect report results/<run_id>

Examples

specdecselect recommend \
  --target Qwen/Qwen2.5-1.5B-Instruct \
  --device auto \
  --top-k 5
specdecselect recommend \
  --target meta-llama/Llama-3.2-3B-Instruct \
  --include-speculators
specdecselect benchmark \
  --target Qwen/Qwen2.5-1.5B-Instruct \
  --draft Qwen/Qwen2.5-0.5B-Instruct \
  --gamma 1 2 4 6 8
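The `--gamma` values above set how many tokens the draft proposes per step. As a rough illustration of why a sweep is useful, the sketch below simulates standard greedy speculative decoding accounting over a list of per-position match flags: each step accepts the leading run of matches (up to gamma), then the target contributes one more token. The function `tokens_per_target_step` and the match list are hypothetical; this is not necessarily how SpecDecSelect computes its metrics.

```python
from typing import Sequence


def tokens_per_target_step(matches: Sequence[bool], gamma: int) -> float:
    """Average tokens emitted per target forward pass for a given gamma.

    matches[i] is True when the draft's greedy token at position i equals
    the target's greedy token (an assumed, precomputed agreement trace).
    """
    if gamma < 1:
        raise ValueError("gamma must be at least 1")
    n = len(matches)
    if n == 0:
        return 0.0
    pos = 0
    steps = 0
    while pos < n:
        accepted = 0
        # Accept drafted tokens until the first mismatch or until gamma is reached.
        while accepted < gamma and pos + accepted < n and matches[pos + accepted]:
            accepted += 1
        # The target always supplies one token: a correction or a bonus token.
        pos += min(accepted + 1, n - pos)
        steps += 1
    return n / steps


# Made-up agreement trace, purely for illustration.
trace = [True, True, False, True, True, True, True, False]
for gamma in (1, 2, 4):
    print(gamma, round(tokens_per_target_step(trace, gamma), 2))
```

Larger gamma values only help while the draft keeps agreeing with the target, and the sketch ignores the cost of running the draft model, so a real benchmark is still needed to pick the best setting.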

Notes

  • v1 targets single-GPU local benchmarking.
  • Device selection supports auto, cuda, mps, and cpu (auto prefers cuda, then mps, then cpu).
  • Benchmarks use greedy agreement as a speculative decoding proxy.
  • recommend benchmarks all bundled prompt sets by default: chat, code, and math.
  • Default candidate discovery uses dense draft registries in specdecselect/data/candidate_registry.yaml.
  • --include-speculators additionally merges specialized draft checkpoints from specdecselect/data/speculator_registry.yaml.
  • Specialized speculator checkpoints (EAGLE/MLP style) are cataloged and reported, but cannot be benchmarked with the current HF causal-LM backend.
  • report regenerates summaries and plots from a saved results/<run_id> directory.
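The device preference described in the notes (auto prefers cuda, then mps, then cpu) can be sketched as a small pure function. The name `resolve_device` and the choice to raise an error for an unavailable explicit device are assumptions for illustration, not the tool's actual implementation. In practice the availability flags would typically come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.

```python
DEVICE_PRIORITY = ("cuda", "mps", "cpu")  # order used when the request is "auto"


def resolve_device(requested: str, cuda_available: bool, mps_available: bool) -> str:
    """Map a requested device name to a concrete device string.

    Assumption: the CPU is always available, and an explicit request for an
    unavailable accelerator is treated as an error rather than a silent fallback.
    """
    available = {"cuda": cuda_available, "mps": mps_available, "cpu": True}
    if requested == "auto":
        # Pick the first available device in priority order.
        return next(name for name in DEVICE_PRIORITY if available[name])
    if requested not in available:
        raise ValueError(f"unknown device: {requested!r}")
    if not available[requested]:
        raise RuntimeError(f"requested device {requested!r} is not available")
    return requested


# Example: a machine with Apple MPS but no CUDA.
print(resolve_device("auto", cuda_available=False, mps_available=True))
```

Keeping the availability checks as parameters makes the selection logic easy to test without a GPU.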
