Commit 6c56ac8
Sync trtllm FMHA: mSparseMla -> mSparseAttn for new cubin struct
Minimal header changes to match the new trtllm-gen FMHA cubin MetaInfo
struct layout:
- TllmGenFmhaKernelMetaInfo: renamed mSparseMla (bool) -> mSparseAttn
(int). Callers convert to bool via `!= 0`.
- KernelParams (GPU-side struct): renamed mSparseMlaTopK -> mSparseAttnTopK
and moved immediately after mSkipSoftmaxThresholdScaleFactor to match
the layout expected by the new kernels.
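
The two changes listed above can be sketched as follows. This is a minimal illustration, not the actual headers: the struct names `MetaInfoSketch` and `KernelParamsSketch`, the helper `isSparseAttn`, the field types (`float`, `std::int32_t`), and the trailing `mOtherField` are all assumptions made for the example. Only the member names `mSparseAttn`, `mSparseAttnTopK`, and `mSkipSoftmaxThresholdScaleFactor` come from the commit message.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical one-field subset of the host-side metadata struct.
// The flag used to be `bool mSparseMla`; it is now an int.
struct MetaInfoSketch {
    int mSparseAttn;
};

// Callers that still want a bool convert with `!= 0`,
// so any non-zero value counts as "sparse attention enabled".
constexpr bool isSparseAttn(MetaInfoSketch const& meta) {
    return meta.mSparseAttn != 0;
}

// Hypothetical subset of the GPU-side parameter struct.
// Field types are assumed; only the relative order matters here.
struct KernelParamsSketch {
    float mSkipSoftmaxThresholdScaleFactor;
    std::int32_t mSparseAttnTopK;  // moved to sit right after the scale factor
    std::int32_t mOtherField;      // placeholder for whatever follows (assumption)
};

// Compile-time check that the moved field directly follows the scale factor,
// which is the property a precompiled kernel would rely on.
static_assert(offsetof(KernelParamsSketch, mSparseAttnTopK)
                  == offsetof(KernelParamsSketch, mSkipSoftmaxThresholdScaleFactor)
                      + sizeof(float),
              "mSparseAttnTopK must immediately follow mSkipSoftmaxThresholdScaleFactor");
```

A `static_assert` on `offsetof` is one way to catch a host/kernel layout mismatch at compile time instead of at launch; whether the real headers carry such a check is not stated in this commit.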
The K/V dtype split (mDataTypeKv -> mDataTypeK/V) and SageAttention block
size fields present in the new struct are layout-compatible but not used,
so no code changes are needed for those -- existing references to
mDataTypeKv still compile since the cubin-supplied struct keeps that
field alongside the new mDataTypeK/V.
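
As a rough illustration of why the unused fields need no caller changes, the sketch below keeps a legacy combined field next to the split ones. The enum `DataTypeSketch`, the struct `MetaInfoDtypeSketch`, the helper `kvIsE4m3`, and the member `mUnusedBlockSize` are hypothetical names invented for the example; only `mDataTypeKv`, `mDataTypeK`, and `mDataTypeV` appear in the commit message, and their real types are not shown there.

```cpp
#include <cstdint>

// Hypothetical stand-in for the data type enum (values are assumptions).
enum class DataTypeSketch : std::int32_t { E4m3 = 0, Bf16 = 1 };

// Hypothetical subset of the metadata struct: the combined field is kept
// alongside the new split fields, plus one extra field nobody reads yet.
struct MetaInfoDtypeSketch {
    DataTypeSketch mDataTypeKv;  // legacy combined field, still present
    DataTypeSketch mDataTypeK;   // new split field, unused by callers
    DataTypeSketch mDataTypeV;   // new split field, unused by callers
    int mUnusedBlockSize;        // placeholder for a block-size field (assumption)
};

// An existing-style caller that only reads the legacy field keeps compiling,
// because the struct still declares it.
constexpr bool kvIsE4m3(MetaInfoDtypeSketch const& meta) {
    return meta.mDataTypeKv == DataTypeSketch::E4m3;
}

// A table row is filled positionally, so the header must declare every field
// in the same order as the generated data even if callers ignore some of them.
constexpr MetaInfoDtypeSketch kRow{DataTypeSketch::E4m3, DataTypeSketch::E4m3,
                                   DataTypeSketch::E4m3, 0};

static_assert(kvIsE4m3(kRow), "legacy field is still readable");
```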
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

1 parent ffa04bb commit 6c56ac8
3 files changed
Lines changed: 7 additions & 6 deletions
Changed hunks (file names and diff content were not preserved; only line positions survive):

File 1 of 3
- line 84: 1 line replaced
- line 364: 1 line replaced

File 2 of 3
- line 194: 1 line replaced

File 3 of 3
- new lines 197-199: 3 lines added
- original lines 202-203: 2 lines removed
- original line 882 (new line 883): 1 line replaced