
v1.3.0


@tjohnson31415 released this 09 Dec 16:22
· 46 commits to main since this release
30fcc38

This release adds support for chunked prefill for non-quantized models. Enable it by setting both of the following environment variables:

VLLM_SPYRE_USE_CHUNKED_PREFILL=1 VLLM_SPYRE_USE_CB=1
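
When the server is launched from a Python script rather than a shell, the same two variables can be placed in the child process environment. The following is a minimal sketch, not part of the project: the helper name `chunked_prefill_env` is hypothetical, and the launch command itself is left to the caller because this release note does not specify one.

```python
import os


def chunked_prefill_env(base=None):
    """Return a copy of an environment mapping with both flags set to "1".

    `base` defaults to the current process environment. The two variable
    names are taken from the release note; this helper is illustrative only.
    """
    env = dict(os.environ if base is None else base)
    env["VLLM_SPYRE_USE_CHUNKED_PREFILL"] = "1"
    env["VLLM_SPYRE_USE_CB"] = "1"
    return env


if __name__ == "__main__":
    env = chunked_prefill_env()
    # Pass `env` to subprocess.run(..., env=env) when starting the server.
    for name in ("VLLM_SPYRE_USE_CHUNKED_PREFILL", "VLLM_SPYRE_USE_CB"):
        print(f"{name}={env[name]}")
```

Copying the mapping instead of mutating `os.environ` keeps the flags scoped to the launched process.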

What's Changed

Full Changelog: v1.2.3...v1.3.0