Commit 6684293
Fix maskSpanAffineOffset bitmask in ldmatrix/stmatrix subslice check (#10066)
The subslice safety check in lowerLdStMatrix uses a bitmask to verify
that affine offsets don't touch the contiguous part of the tile's offset
dimension. It was using getOutDimSizeLog2 (which returns log2 of the
size) instead of getOutDimSize (the actual size) to construct this mask.
For outDimSize=8: log2(8)-1 = 2 (0b010) only checks bit 1, whereas 8-1 =
7 (0b111) correctly checks all bits within the tile span.
The bug makes the check too permissive: it could allow subslices that
overlap the contiguous tile region. It stayed latent because the bit
patterns that occur in maskSpanAffineOffset rarely expose the difference.
Fix both the NVIDIA (Utility.cpp) and AMD (MemoryOpToLLVM.cpp) backends.
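
The difference between the two masks can be reproduced in isolation. The following is a minimal sketch, not the code from lowerLdStMatrix: the helpers `log2OfSize`, `buggyTileMask`, `fixedTileMask`, and `touchesContiguousPart` are hypothetical names introduced for illustration, and the offset mask `0x5` is an assumed example value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a "size log2" query: log2 of a power-of-two size.
constexpr uint32_t log2OfSize(uint32_t size) {
  uint32_t result = 0;
  while (size >>= 1)
    ++result;
  return result;
}

// Mask as built before the fix: log2(size) - 1.
constexpr uint32_t buggyTileMask(uint32_t outDimSize) {
  return log2OfSize(outDimSize) - 1;
}

// Mask as built after the fix: size - 1, covering every bit inside the tile span.
constexpr uint32_t fixedTileMask(uint32_t outDimSize) {
  return outDimSize - 1;
}

// True when the affine offset bits overlap the bits selected by the tile mask.
constexpr bool touchesContiguousPart(uint32_t maskSpanAffineOffset,
                                     uint32_t tileMask) {
  return (maskSpanAffineOffset & tileMask) != 0;
}

// For outDimSize = 8 the two masks are 2 (0b010) and 7 (0b111).
static_assert(buggyTileMask(8) == 2, "log2(8) - 1 == 2");
static_assert(fixedTileMask(8) == 7, "8 - 1 == 7");

// An assumed offset mask of 0x5 (0b101) sets bits 0 and 2, both inside the
// tile span. The buggy mask misses it; the fixed mask catches it.
static_assert(!touchesContiguousPart(0x5, buggyTileMask(8)), "missed overlap");
static_assert(touchesContiguousPart(0x5, fixedTileMask(8)), "detected overlap");
```

Any offset mask that sets bit 0 or bit 2 but not bit 1 slips past the old mask in this sketch, which is consistent with the check having been too permissive rather than too strict.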
# New contributor declaration
- [x] I am not making a trivial change, such as fixing a typo in a
comment.
- [x] I have written a PR description following these
[rules](https://cbea.ms/git-commit/#why-not-how).
- [x] I have run `pre-commit run --from-ref origin/main --to-ref HEAD`.
- [x] This PR does not need a test because existing tests pass, and the
fix only makes the safety check stricter.
- [x] I have not added any `lit` tests.
1 parent cdac714 commit 6684293
2 files changed
Lines changed: 2 additions & 2 deletions
File tree
- third_party
- amd/lib/TritonAMDGPUToLLVM
- nvidia/lib/TritonNVIDIAGPUToLLVM
Diff 1 (content not preserved): line 190 replaced; lines 187-189 and 191-193 shown as unchanged context.
Diff 2 (content not preserved): line 335 replaced; lines 332-334 and 336-338 shown as unchanged context.