feat: update Dist v3.5.1 for triton v3.5.1 #15

Merged
KnowingNothing merged 6 commits into ByteDance-Seed:dist-v3.5 from KnowingNothing:dist-v3.5.1
Mar 3, 2026

@KnowingNothing
Collaborator

New contributor declaration

  • I am not making a trivial change, such as fixing a typo in a comment.

  • I have written a PR description following these rules.

  • I have run `pre-commit run --from-ref origin/main --to-ref HEAD`.

  • Select one of the following.

    • I have added tests.
      • `/test` for lit tests
      • `/unittest` for C++ tests
      • `/python/test` for end-to-end tests
    • This PR does not need a test because FILL THIS IN.
  • Select one of the following.

    • I have not added any lit tests.
    • The lit tests I have added follow these best practices,
      including the "tests should be minimal" section. (Usually running Python code
      and using the instructions it generates is not minimal.)

atalman and others added 6 commits October 15, 2025 13:37
…pypi until we needed (triton-lang#8636)

- Bump version to 3.5.1.
- Disable the publish-to-PyPI step so that we don't release until we are ready.

Same as triton-lang#6807
…ps to unbreak sm103 support PR8045 (triton-lang#8485)

See @masahi's triton-lang#8473.

---------

Co-authored-by: Masahiro Masuda <masahi129@gmail.com>
Fix double quote typo

`__version__ = "3.5.1"` should be `__version__ = '3.5.1'`.

After: triton-lang#8636

Patches for Triton-distributed support on upstream Triton v3.5.1:

- code_generator.py: Replace ir.builder with distributed.ir.DistributedOpBuilder;
  add visit_With handler for simt_exec_region; add extract/insert logic for
  tensor subscript operations (with tensor-only guard).
- compiler.py: Load distributed MLIR dialects into the compilation context.
- nvidia/backend/compiler.py: Replace upstream TTGPU conversion pass with
  distributed version; add distributed-to-LLVM lowering pass; load distributed
  dialects in NVIDIA backend.
- cache.py: Add null safety for module_finder.find_spec() and spec.origin
  during cache key generation.
- GetEnv.hpp: Register TRITON_DIST_SHMEM_WRAPPER in CACHE_INVALIDATING_ENV_VARS
  to avoid assertion failure.

KnowingNothing changed the title from "Dist v3.5.1" to "feat: update Dist v3.5.1 for triton v3.5.1" on Mar 3, 2026
KnowingNothing merged commit 1610709 into ByteDance-Seed:dist-v3.5 on Mar 3, 2026