Target release date: April 24th

major feature: torch 2.11 upgrade

- [x] torch 2.11 https://github.com/vllm-project/vllm-xpu-kernels/pull/155
- [x] ~block_size 16/32 support https://github.com/vllm-project/vllm-xpu-kernels/pull/171~ https://github.com/vllm-project/vllm-xpu-kernels/pull/308
- [x] ~softmax lse support #281~
- [x] stride check https://github.com/vllm-project/vllm-xpu-kernels/pull/293
- [x] topk topp inf handle https://github.com/vllm-project/vllm-xpu-kernels/pull/287

cc @rogerxfeng8 @wendyliu235 @yma11 @xinyu-intel @wuxun-zhang