Commit ffa04bb

PerkzZheng and claude committed
Update trtllm-gen FMHA cubins to fix context SWA page-skip
Update the TRTLLM_GEN_FMHA artifact path and checksum to pick up cubins built from trtllm-gen with the context SWA page-skip fix. The new cubins skip loading out-of-window KV pages in context (prefill) kernels, preventing NaN corruption from null blocks in the KV cache.

Fixes: https://nvbugspro.nvidia.com/bug/5922676

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
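The kernels ship as prebuilt cubins, so the page-skip logic itself is not visible in this commit. As a rough illustration of what "skipping out-of-window KV pages" means, the sketch below computes which pages of a paged KV cache a tile of queries needs. It assumes a causal sliding window in which a query at position `q` attends to keys in `[q - window + 1, q]`, and a fixed `page_size`; the function names and the exact window convention are hypothetical and are not taken from trtllm-gen.

```python
def first_in_window_page(q_start: int, window: int, page_size: int) -> int:
    """Index of the first KV page that a query at q_start can still attend to.

    Assumed convention (hypothetical): query q sees keys in [q - window + 1, q].
    """
    first_key = max(0, q_start - window + 1)
    return first_key // page_size


def pages_to_load(q_start: int, q_end: int, window: int, page_size: int) -> range:
    """Pages needed by the query tile [q_start, q_end] (inclusive).

    Pages before the returned range lie entirely outside the window for
    every query in the tile, so a kernel can skip loading them instead of
    loading and then masking them.
    """
    first = first_in_window_page(q_start, window, page_size)
    last = q_end // page_size  # causal: no key beyond the last query position
    return range(first, last + 1)


if __name__ == "__main__":
    # Queries 1000..1063, window of 128 tokens, 64-token pages.
    print(list(pages_to_load(1000, 1063, window=128, page_size=64)))
```

Under these assumptions, pages 0 through 12 are never touched for that tile, which is the property that matters when out-of-window pages may be null blocks.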
1 parent a99ee72 commit ffa04bb

1 file changed

Lines changed: 2 additions & 2 deletions

flashinfer/artifacts.py

@@ -135,7 +135,7 @@ class ArtifactPath:
     When compiling new cubins for backend directories, update the corresponding path.
     """
 
-    TRTLLM_GEN_FMHA: str = "55bba55929d4093682e32d817bd11ffb0441c749/fmha/trtllm-gen/"
+    TRTLLM_GEN_FMHA: str = "84ee55b93ba92b1f64bbf727fd1cf38d9d3058a7/fmha/trtllm-gen/"
     TRTLLM_GEN_BMM: str = (
         "39a9d28268f43475a757d5700af135e1e58c9849/batched_gemm-5ee61af-2b9855b/"
     )
@@ -155,7 +155,7 @@ class CheckSumHash:
     """
 
     TRTLLM_GEN_FMHA: str = (
-        "f2c0aad1e74391c4267a2f9a20ec819358b59e04588385cffb452ed341500b99"
+        "99add183a75a55a0aa77c6f61fa58cd6a2b40709effbdf8527e0a21588cdc7c9"
     )
     TRTLLM_GEN_BMM: str = (
         "db06db7f36a2a9395a2041ff6ac016fe664874074413a2ed90797f91ef17e0f6"

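The path and the checksum change together because the checksum pins the contents published under that path. The diff does not show which file the 64-hex-digit value covers or how flashinfer checks it, so the following is only a minimal sketch of the general pattern: hash a downloaded file with SHA-256 and compare it with a pinned value. `sha256_of_file` and `verify_artifact` are hypothetical helpers, not flashinfer APIs.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_hex: str) -> bool:
    """True if the file's digest matches the pinned value (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.lower()


if __name__ == "__main__":
    import tempfile

    # Stand-in for a downloaded artifact; real cubins are not used here.
    with tempfile.TemporaryDirectory() as tmp:
        artifact = Path(tmp) / "artifact.bin"
        artifact.write_bytes(b"example payload")
        pinned = hashlib.sha256(b"example payload").hexdigest()
        print(verify_artifact(artifact, pinned))
```

With a check of this kind, updating only the path while leaving the old checksum in place would cause verification to fail, which is why both constants are edited in the same commit.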