
Fix: Remove tensor pointers from autotune keys in 08-grouped-gemm tutorial #9430

Merged
peterbell10 merged 1 commit into triton-lang:main from
niyunsheng:niyunsheng-patch-1
Feb 11, 2026

Conversation

@niyunsheng
Contributor

New contributor declaration

  • I am not making a trivial change, such as fixing a typo in a comment.

  • I have written a PR description following these rules.

  • I have run pre-commit run --from-ref origin/main --to-ref HEAD.

  • Select one of the following.

    • I have added tests.
      • /test for lit tests
      • /unittest for C++ tests
      • /python/test for end-to-end tests
    • This PR does not need a test because FILL THIS IN.
  • Select one of the following.

    • I have not added any lit tests.
    • The lit tests I have added follow these best practices,
      including the "tests should be minimal" section. (Usually running Python code
      and using the instructions it generates is not minimal.)

Description: This PR removes tensor objects (group_a_ptrs, group_b_ptrs, group_c_ptrs) from the @triton.autotune keys in the Grouped GEMM tutorial.

Why this matters:

  1. Prevents Misleading Practices: Tutorials serve as reference templates. Using tensors as autotune keys is a known anti-pattern. Fixing this keeps developers from copying the flawed pattern into production code.
  2. Fixes Overhead & Memory Leaks: A tensor used as a key is hashed by its Python object ID. Because these IDs change across dispatches, every lookup misses the cache (100% cache miss), which triggers continuous re-benchmarking. It also leaks VRAM, because the autotuner cache retains strong references to the temporary tensors.

Only the scalar group_size is retained as a valid cache key.
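
To make the cache behavior described above concrete, here is a minimal pure-Python sketch. It is a toy model, not Triton's actual autotuner: `FakeTensor`, `ToyAutotuner`, and `dispatch` are hypothetical names introduced only for illustration, and the sketch assumes, as the description states, that a tensor in the key is hashed by object identity and that new pointer tensors are built for each dispatch.

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor: hashed and compared by identity."""

    def __init__(self, nbytes):
        self.nbytes = nbytes


class ToyAutotuner:
    """Hypothetical, simplified cache mapping a key tuple to a 'best config'."""

    def __init__(self, key_names):
        self.key_names = key_names
        self.cache = {}
        self.benchmarks = 0

    def run(self, **kwargs):
        # Build the cache key from the arguments named in key_names.
        key = tuple(kwargs[name] for name in self.key_names)
        if key not in self.cache:
            self.benchmarks += 1  # stands in for an expensive re-benchmark
            self.cache[key] = "best-config"  # the key tuple keeps its objects alive
        return self.cache[key]


def dispatch(tuner, n_calls, group_size=4):
    for _ in range(n_calls):
        # Assumption: each dispatch allocates fresh pointer tensors.
        tuner.run(
            group_a_ptrs=FakeTensor(64),
            group_b_ptrs=FakeTensor(64),
            group_c_ptrs=FakeTensor(64),
            group_size=group_size,
        )
    return tuner


# Keys that include tensors: every call misses and the cache keeps growing.
bad = dispatch(
    ToyAutotuner(["group_a_ptrs", "group_b_ptrs", "group_c_ptrs", "group_size"]), 5
)
# Key with only the scalar group_size: one benchmark, one cache entry.
good = dispatch(ToyAutotuner(["group_size"]), 5)

print(bad.benchmarks, len(bad.cache))    # 5 5
print(good.benchmarks, len(good.cache))  # 1 1
```

In this toy model the tensor-keyed cache benchmarks on every call and holds on to all 15 temporary objects, while the `group_size`-only key benchmarks once and stores a single entry.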

@niyunsheng niyunsheng requested a review from ptillet as a code owner February 11, 2026 09:31
@peterbell10 peterbell10 enabled auto-merge (squash) February 11, 2026 15:29
@peterbell10 peterbell10 merged commit 311b3e4 into triton-lang:main Feb 11, 2026
9 checks passed
@niyunsheng niyunsheng deleted the niyunsheng-patch-1 branch March 16, 2026 08:11

2 participants