Releases: vipshop/vllm
v0.9.2rc2
Full Changelog: v0.9.2rc1...v0.9.2rc2
v0.9.2rc1
v0.8.6rc1
Full Changelog: v0.8.4rc4...v0.8.6rc1
v0.8.5rc2
v0.8.5rc1
v0.8.4rc4
Full Changelog: v0.8.4rc3...v0.8.4rc4
v0.8.4rc3
Rebase onto vllm:main for a cleaner codebase
Full Changelog: https://github.com/vipshop/vllm/commits/v0.8.4rc3
v0.8.4rc2
v0.8.4rc1
What's Changed
- [Kernel][VIP] support CUDA merge_attn_states kernel, up to ~3x faster by @DefTruth in #18
- [Kernel][VIP] support cuda merge_attn_states kernel by @DefTruth in #19
- [Kernel][VIP] dispatch merge_attn_states cuda kernel, half&bf16 by @DefTruth in #20
- [Misc][VIP] Revert to original Triton merge_attn_states kernel by @DefTruth in #21
Full Changelog: v0.8.4rc0...v0.8.4rc1
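
For readers unfamiliar with what a `merge_attn_states` kernel computes, here is a minimal pure-Python sketch of the underlying math. It is not the code from the PRs above. It assumes the usual formulation, in which attention is computed separately over two key/value partitions (a prefix and a suffix) and the partial outputs are combined using each partition's log-sum-exp (LSE). The helper `partial_attention` and all argument names are hypothetical, chosen only for illustration, and the sketch handles a single query and head rather than batched tensors.

```python
import math


def partial_attention(scores, values):
    """Softmax-weighted average of `values`, plus the log-sum-exp of `scores`.

    Hypothetical helper: stands in for attention over one key/value partition.
    """
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    out = [
        sum(w * v[d] for w, v in zip(weights, values)) / total
        for d in range(dim)
    ]
    return out, m + math.log(total)


def merge_attn_states(prefix_out, prefix_lse, suffix_out, suffix_lse):
    """Combine two partial attention results into the full-attention result.

    Each partial output is rescaled by exp(lse - max_lse) so that the two
    partitions share a common softmax normalizer.
    """
    m = max(prefix_lse, suffix_lse)
    p_w = math.exp(prefix_lse - m)
    s_w = math.exp(suffix_lse - m)
    denom = p_w + s_w
    merged = [
        (p * p_w + s * s_w) / denom
        for p, s in zip(prefix_out, suffix_out)
    ]
    return merged, m + math.log(denom)
```

Merging the results of `partial_attention` over the first and second halves of a score list should reproduce, up to floating-point error, the result of `partial_attention` over the whole list. A GPU kernel would apply the same rescaling per token and per head.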
v0.8.4rc0
Full Changelog: v0.8.3rc1...v0.8.4rc0