Releases: vipshop/vllm

v0.9.0.2rc2

03 Jun 09:12
989dcee

v0.9.2rc1

11 Jun 10:07
5bc1ad6

v0.8.6rc1

06 May 03:13
98834fe

v0.8.5rc2

17 Apr 07:01
4fc1489

What's Changed

  • [Misc][VIP] ignore marlin_moe_wna16 local gen codes by @DefTruth in #27

Full Changelog: v0.8.4rc4...v0.8.5rc2

v0.8.5rc1

16 Apr 04:03
58f457c

What's Changed

  • [Bugfix][Kernel][VIP] fix potential cuda graph broken for merge_attn_states kernel by @DefTruth in #26

Full Changelog: v0.8.4rc4...v0.8.5rc1

v0.8.4rc4

13 Apr 07:12
f49e5af

v0.8.4rc3

12 Apr 04:44
802329d

Rebased onto vllm:main for a cleaner codebase.

Full Changelog: https://github.com/vipshop/vllm/commits/v0.8.4rc3

v0.8.4rc2

10 Apr 10:51
94680eb

What's Changed

  • [Kernel] optimize merge_attn_states CUDA kernel dispatch by @DefTruth in #22
  • [Update][VIP] Update from vllm:main and fix conflicts by @DefTruth in #23
  • [Kernel][VIP] opt cuda merge_attn_states kernel thread block dispatch by @DefTruth in #24

Full Changelog: v0.8.4rc1...v0.8.4rc2

v0.8.4rc1

08 Apr 12:53
88fef9d

What's Changed

  • [Kernel][VIP] support cuda merge_attn_states kernel, max ~3x improved by @DefTruth in #18
  • [Kernel][VIP] support cuda merge_attn_states kernel by @DefTruth in #19
  • [Kernel][VIP] dispatch merge_attn_states cuda kernel, half&bf16 by @DefTruth in #20
  • [Misc][VIP] Revert to original Triton merge_attn_states kernel by @DefTruth in #21

Full Changelog: v0.8.4rc0...v0.8.4rc1

v0.8.4rc0

07 Apr 04:15
273cd8b
