Validate incompatible cache_modifier/eviction_policy combinations in NVIDIA backend #10003

Open
swjng wants to merge 1 commit into triton-lang:main from swjng:fix/cache-modifier-validation

@swjng swjng commented Apr 11, 2026

New contributor declaration

  • I am not making a trivial change, such as fixing a typo in a comment.

  • I have written a PR description following these rules.

  • I have run pre-commit run --from-ref origin/main --to-ref HEAD.

  • Select one of the following.

    • I have added tests.
      • /test for lit tests
  • Select one of the following.

    • The lit tests I have added follow these best practices, including the "tests should be minimal" section.

Why

When tl.load/tl.store is called with a PTX-illegal combination of cache_modifier and eviction_policy, Triton emits PTX that ptxas then rejects with an opaque assembler error:

ptxas error: Modifier '.evict_first' cannot be combined with modifier '.cs'

Users see a low-level message with no indication of which Python arguments caused it. The PTX ISA forbids combining certain cache modifiers with L1 eviction hints:

  • .cs (cache streaming) bypasses L1; L1 eviction hints are undefined.
  • .ca (cache all, load) overrides L1 eviction policy; hints conflict.
  • .cg (cache global) bypasses L1; evict_first is invalid.

What the fix does

Validate the combinations at PTX lowering time in third_party/nvidia/lib/TritonNVIDIAGPUToLLVM/LoadStoreOpToLLVM.cpp (LoadOpConversion::matchAndRewrite and StoreOpConversion::matchAndRewrite). On an illegal combination, emit a descriptive error via op.emitOpError and fail the conversion before any PTX is generated.

The check lives in the NVIDIA backend, not the backend-agnostic semantic.py, so the Triton frontend remains neutral to PTX ISA constraints.

Combinations covered:

| op    | cache_modifier | eviction_policy         |
|-------|----------------|-------------------------|
| store | .cs            | evict_first, evict_last |
| store | .cg            | evict_first             |
| load  | .ca            | evict_first, evict_last |
| load  | .cg            | evict_first             |

Example error after this fix:

error: cache_modifier '.cs' is incompatible with eviction_policy
'evict_first'/'evict_last': .cs bypasses L1 cache
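
For readers who want the rule set at a glance, here is a minimal Python sketch of the covered combinations as a lookup table. It is an illustration only: the actual check in this PR is written in C++ inside LoadStoreOpToLLVM.cpp, and the names `ILLEGAL_COMBINATIONS` and `check_cache_hints`, as well as the exact message wording, are hypothetical.

```python
from typing import Optional

# Hypothetical mirror of the table above:
# (op, cache_modifier) -> eviction policies the PR rejects for that pair.
ILLEGAL_COMBINATIONS = {
    ("store", ".cs"): frozenset({"evict_first", "evict_last"}),
    ("store", ".cg"): frozenset({"evict_first"}),
    ("load", ".ca"): frozenset({"evict_first", "evict_last"}),
    ("load", ".cg"): frozenset({"evict_first"}),
}


def check_cache_hints(op: str, cache_modifier: str,
                      eviction_policy: str) -> Optional[str]:
    """Return an error message if the pair is in the table, else None."""
    rejected = ILLEGAL_COMBINATIONS.get((op, cache_modifier), frozenset())
    if eviction_policy in rejected:
        # Illustrative wording; not the exact diagnostic emitted by the PR.
        return (f"cache_modifier '{cache_modifier}' is incompatible with "
                f"eviction_policy '{eviction_policy}' on {op}")
    return None
```

Pairs that are absent from the table, such as a store with `.cg` and `evict_last`, return `None` in this sketch, which matches the PR's scope: only the four listed rows are rejected.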

Test plan

Added lit test cases to test/Conversion/tritongpu_to_llvm.mlir alongside the existing store_with_cache_attr / load_with_l2_cache_hint / store_with_l2_cache_hint tests. --verify-diagnostics was added to the existing RUN line so each illegal combination is checked via expected-error. Existing valid-path tests continue to pass unchanged.

@swjng swjng requested a review from ptillet as a code owner April 11, 2026 09:56
@swjng swjng changed the title Raise error for incompatible cache_modifier and eviction_policy Raise error for incompatible cache_modifier and eviction_policy Apr 11, 2026
@swjng swjng force-pushed the fix/cache-modifier-validation branch from 264e82f to a13cfd0 Compare April 11, 2026 10:11
Comment thread python/triton/language/semantic.py Outdated

def load(self, ptr: TensorTy, mask: Optional[TensorTy], other: Optional[TensorTy], boundary_check: Tuple,
         padding_option: str, cache_modifier: str, eviction_policy: str, is_volatile: bool) -> TensorTy:
    # Validate PTX-incompatible combinations before emitting IR.
Contributor

load is generic across backends and is not always lowered to PTX.

Author

Thanks for the feedback. Moved the validation out of semantic.py into LoadStoreOpToLLVM.cpp in the NVIDIA PTX codegen pass, right before the PTX instruction builder runs. The check now lives at the correct layer — it only fires for the NVIDIA backend and is expressed as op.emitOpError() returning failure(). Updated the tests to expect triton.CompilationError instead of ValueError.

Contributor

Jokeren commented Apr 12, 2026

  1. In this case it should be an MLIR test
  2. Please update your PR description as well

Collaborator

@ThomasRaoux ThomasRaoux left a comment

In Triton, the compiler should not raise an error for a valid kernel.

@swjng swjng force-pushed the fix/cache-modifier-validation branch 3 times, most recently from 13ae582 to 8b4f25c Compare April 13, 2026 03:58
…NVIDIA backend

When tl.load/tl.store is called with a PTX-illegal combination of
cache_modifier and eviction_policy, Triton previously emitted PTX
containing both modifiers and let ptxas fail with an opaque assembler
error:

    ptxas error: Modifier '.evict_first' cannot be combined with modifier '.cs'

Users saw a low-level message with no indication of which Python
arguments caused it.

Add validation in LoadStoreOpToLLVM.cpp (NVIDIA-specific PTX lowering)
that emits a clear compilation error before any PTX is generated.
Placing the check in the NVIDIA backend, not in backend-agnostic
semantic.py, keeps the frontend neutral to PTX ISA constraints.

PTX-illegal combinations covered:

| op    | cache_modifier | eviction_policy              |
|-------|----------------|------------------------------|
| store | .cs            | evict_first, evict_last      |
| store | .cg            | evict_first                  |
| load  | .ca            | evict_first, evict_last      |
| load  | .cg            | evict_first                  |
@swjng swjng force-pushed the fix/cache-modifier-validation branch from 8b4f25c to 174ce21 Compare April 13, 2026 04:01
@swjng swjng changed the title Raise error for incompatible cache_modifier and eviction_policy Validate incompatible cache_modifier/eviction_policy combinations in NVIDIA backend Apr 13, 2026
Author

swjng commented Apr 13, 2026

@Jokeren — moved the test to test/Conversion/tritongpu_to_llvm.mlir as a lit test with expected-error, and updated the title/description.

@ThomasRaoux — on reflection the kernel is valid at the Triton level (both .cs and evict_first are documented tl.store arguments); the conflict is purely a PTX ISA restriction, which isn't something Triton can or should fix.

One alternative before closing: the NVIDIA backend could silently drop one of the two hints (e.g. keep .cs, drop the eviction hint) and log a warning, so the kernel still compiles. Do you see value in that, or should I just close this and let ptxas surface the error?
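
To make the proposed alternative concrete, here is a rough Python sketch of the "keep the cache modifier, drop the eviction hint, and warn" behaviour. It is not part of this PR and is not how the backend is implemented; `resolve_cache_hints` and `_CONFLICTS` are hypothetical names, and using an empty string to mean "no eviction policy" is an assumption of this sketch.

```python
import warnings
from typing import Tuple

# Hypothetical copy of the PR's table of conflicting pairs.
_CONFLICTS = {
    ("store", ".cs"): frozenset({"evict_first", "evict_last"}),
    ("store", ".cg"): frozenset({"evict_first"}),
    ("load", ".ca"): frozenset({"evict_first", "evict_last"}),
    ("load", ".cg"): frozenset({"evict_first"}),
}


def resolve_cache_hints(op: str, cache_modifier: str,
                        eviction_policy: str) -> Tuple[str, str]:
    """Return the (cache_modifier, eviction_policy) pair to lower.

    On a conflict, keep the cache modifier, drop the eviction hint,
    and warn so the kernel still compiles.
    """
    if eviction_policy in _CONFLICTS.get((op, cache_modifier), frozenset()):
        warnings.warn(
            f"{op}: dropping eviction_policy '{eviction_policy}' because it "
            f"conflicts with cache_modifier '{cache_modifier}'",
            stacklevel=2,
        )
        # Assumption: "" stands for "no eviction policy requested".
        return cache_modifier, ""
    return cache_modifier, eviction_policy
```

The trade-off is that the kernel compiles but no longer does exactly what the user asked, so the warning is the only signal that a hint was ignored.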

@swjng swjng requested review from Jokeren and ThomasRaoux April 13, 2026 04:16