
Skip unnecessary boundary-check masks in RewriteTensorDescriptorToPointerPass#10034

Open
gkmhub wants to merge 1 commit into triton-lang:main from gkmhub:skip-boundary-check-descriptor-rewrite

Conversation


@gkmhub gkmhub commented Apr 14, 2026

Summary

RewriteTensorDescriptorToPointerPass unconditionally generates masked loads/stores for all tensor descriptor accesses. In contrast, the block pointer path (RewriteTensorPointerPass) respects boundary_check=[] and produces unmasked loads/stores when no boundary checking is needed.

This causes backends that lower tensor descriptors to pointer arithmetic (i.e., backends without TMA hardware) to always pay the cost of masked memory operations, even for provably in-bounds accesses (~20% perf regression in our benchmarks).
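For intuition, a masked block load behaves roughly like the following pure-Python sketch (illustrative only, not the actual Triton/MLIR lowering): out-of-bounds lanes read a padding value, and every lane pays for the bounds predicate — the overhead this PR avoids when the access is provably in-bounds.

```python
def masked_load(tensor, offsets, block_shape, padding=0):
    """Emulate a 2D masked block load: lanes that fall outside the
    tensor's extent read `padding` instead of the underlying memory."""
    rows, cols = len(tensor), len(tensor[0])
    r0, c0 = offsets
    br, bc = block_shape
    # Each lane evaluates an in-bounds predicate before reading.
    return [[tensor[r][c] if r < rows and c < cols else padding
             for c in range(c0, c0 + bc)]
            for r in range(r0, r0 + br)]
```

An unmasked load is just the plain block read with no per-lane predicate; eliding that predicate (and the padding select) is where the savings come from.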

This PR adds two complementary mechanisms to skip unnecessary mask generation:

  1. Static in-bounds analysis (isStaticallyInBounds): compile-time check that verifies offset[i] + blockShape[i] <= shape[i] for all dimensions. When provably in-bounds, the pass emits unmasked tt.load/tt.store.

  2. skip_boundary_check attribute: a new UnitAttr on tt.descriptor_load and tt.descriptor_store that frontends (e.g., Inductor) can set when they guarantee the access is in-bounds. On backends with TMA hardware, this attribute is simply ignored.
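A minimal sketch of the static check described in point 1, assuming constant offsets/shapes are modeled as Python ints and dynamic values as `None` (the real pass inspects MLIR constant ops; the helper name mirrors the PR's `isStaticallyInBounds` but this code is illustrative):

```python
def is_statically_in_bounds(offsets, block_shape, shape):
    """Return True only when offset[i] + blockShape[i] <= shape[i]
    is provable at compile time for every dimension."""
    for off, blk, dim in zip(offsets, block_shape, shape):
        if off is None or dim is None:
            return False  # dynamic value: conservatively keep the mask
        if off + blk > dim:
            return False  # access may run past the tensor boundary
    return True
```

Note the check is conservative: any dynamic offset or shape keeps the mask, which is exactly the case the `skip_boundary_check` attribute in point 2 is meant to cover.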

Fixes #10033

Files Changed

  • include/triton/Dialect/Triton/IR/TritonOps.td — add UnitAttr:$skip_boundary_check to DescriptorLoadOp and DescriptorStoreOp
  • lib/Dialect/Triton/Transforms/RewriteTensorDescriptorToPointer.cpp — add isStaticallyInBounds() and conditionally skip mask/other generation
  • python/src/ir.cc — thread skipBoundaryCheck through create_descriptor_load/create_descriptor_store
  • python/triton/language/core.py — add skip_boundary_check parameter to tensor_descriptor_base.load() and .store()
  • python/triton/language/semantic.py — thread parameter through descriptor_load/descriptor_store
  • python/triton/runtime/interpreter.py — accept (and ignore) the new parameter
  • test/Triton/tensor-descriptors-in-bounds.mlir — 10 LIT test cases covering in-bounds, out-of-bounds, and skip_boundary_check scenarios
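For reference, the per-element mask that the rewrite conditionally skips is essentially this predicate (pure-Python sketch of the 2D case, not the C++ implementation): when static analysis proves every entry true, the mask and the `other`/padding operand can both be dropped.

```python
def boundary_mask(offsets, block_shape, shape):
    """Per-element in-bounds predicate for a 2D block access:
    True where (offset + index) stays inside the tensor shape."""
    r0, c0 = offsets
    br, bc = block_shape
    rows, cols = shape
    return [[(r0 + i) < rows and (c0 + j) < cols
             for j in range(bc)]
            for i in range(br)]
```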

Test Plan

  • New MLIR LIT test (tensor-descriptors-in-bounds.mlir) with 10 cases:
    • In-bounds 2D load/store with zero offset
    • In-bounds 2D load with nonzero offset
    • Out-of-bounds load with dynamic shape/offset
    • Out-of-bounds load where offset+block exceeds shape
    • In-bounds 1D load
    • Out-of-bounds store with dynamic shape
    • skip_boundary_check attribute on load/store with dynamic shapes
  • Backward compatible: existing code without skip_boundary_check behaves identically
  • TMA backends unaffected (attribute is ignored)

…ptorToPointerPass

Summary:
CONTEXT: RewriteTensorDescriptorToPointerPass unconditionally generates
masked loads/stores for all tensor descriptor accesses, even when the
access is provably in-bounds. This causes ~20% perf regression vs the
block pointer path for backends without TMA hardware.

WHAT: Add two mechanisms to skip unnecessary mask generation:
1. Static in-bounds analysis (isStaticallyInBounds) that checks
   offset[i] + blockShape[i] <= shape[i] at compile time.
2. skip_boundary_check UnitAttr on descriptor_load/descriptor_store
   that frontends can set when they guarantee in-bounds access.

Fixes triton-lang#10033
@gkmhub gkmhub requested a review from ptillet as a code owner April 14, 2026 23:51
@ThomasRaoux
Collaborator

If the semantics of descriptor ops (which are meant to map to TMA-style hardware) are a problem, one alternative is to mimic what the front end currently does for block_ptr and implement array-indexing support purely in software: https://github.com/triton-lang/triton/blob/main/python/triton/language/core.py#L1743. We transitioned some kernels using this approach and it was fairly simple.

Development

Successfully merging this pull request may close these issues.

RewriteTensorDescriptorToPointerPass generates unnecessary boundary-check masks for in-bounds accesses
