v1.5.0
1.5.0 - Granite 4 dense support
- ✨ Adds support for Granite 4 dense models
- 🚀 Adds support for vLLM 0.13.0
- 🚀 Adds support for fms 1.6.0
- 📝 Adds documentation about prefix caching and corrects confusing log messages
What's Changed
- Fix broken references by @rafvasq in #624
- Set default max_num_batched_tokens to 1k for CP by @maxdebayser in #619
- 🚀 Add vllm 0.13.0 support by @joerunde in #623
- 📝 update docs on batching modes by @joerunde in #629
- Update fms to version 1.6.0 by @maxdebayser in #631
- 🔒 uv lock fms 1.6.0 by @joerunde in #632
- Update pyproject vllm max version to 0.13.0 by @tjohnson31415 in #633
- Update logging around chunked prefill enablement by @tjohnson31415 in #628
- ✨ support granite 4 dense by @joerunde in #635
Full Changelog: v1.4.3...v1.5.0