If I change head_dim from 128 to 256 in test_trtllm_gen_attention.py
and run
pytest flashinfer/tests/attention/test_trtllm_gen_attention.py::test_trtllm_batch_decode
756 tests fail.
FlashInfer version
uv pip show flashinfer-python
Name: flashinfer-python
Version: 0.4.1
Context: head_dim = 256 is the value used by the Qwen3-Next model.
Code reference: flashinfer/tests/attention/test_trtllm_gen_attention.py, line 645 in ef687e9
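To help narrow down the failures, here is a minimal sketch of a triage helper. It only builds a pytest command line and reads the outcome counts from pytest's final summary line. The function names (`build_pytest_command`, `parse_summary_counts`, `run_and_summarize`) are hypothetical and not part of FlashInfer; `-q`, `--tb=short`, and `-x` are standard pytest options. The sketch assumes pytest prints its usual "N failed, M passed in Xs" summary as the last non-empty line of stdout.

```python
import re
import subprocess

# Test node ID taken from the report above.
TEST_ID = (
    "flashinfer/tests/attention/test_trtllm_gen_attention.py"
    "::test_trtllm_batch_decode"
)


def build_pytest_command(test_id, stop_on_first_failure=True):
    """Return a pytest argv list with quiet output and short tracebacks."""
    cmd = ["pytest", test_id, "-q", "--tb=short"]
    if stop_on_first_failure:
        cmd.append("-x")  # stop at the first failing parameter combination
    return cmd


def parse_summary_counts(output):
    """Extract outcome counts from the last non-empty line of pytest output."""
    lines = [ln for ln in output.splitlines() if ln.strip()]
    if not lines:
        return {}
    pairs = re.findall(
        r"(\d+) (failed|passed|skipped|errors?|xfailed|xpassed)", lines[-1]
    )
    return {name: int(count) for count, name in pairs}


def run_and_summarize(test_id):
    """Run the full test (no -x) and return the outcome counts (hypothetical helper)."""
    proc = subprocess.run(
        build_pytest_command(test_id, stop_on_first_failure=False),
        capture_output=True,
        text=True,
    )
    return parse_summary_counts(proc.stdout)

# Usage (requires a FlashInfer checkout and a supported GPU):
#   counts = run_and_summarize(TEST_ID)
```

Running with `-x` first shows a single traceback for the first failing parameter combination, which is usually enough to tell whether head_dim = 256 is rejected up front or fails a numerical check.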