Bump mlir-air to 9377b0e and triton_shared to e0c5133 #67
Merged
Picks up Xilinx/mlir-air#1645 ("reset memtile counter per bucket-shape group"), which restores i8 matmul compilation speed for the Triton-XDNA-generated broadcast pattern. matmul_i8_m128_n64_k64 cold compile drops from ~145s back to ~9s, full sweep from a 1200s timeout back to ~150s. Also pulls in the Path B follow-up (#1609) and Stage C RFC #1567 changes.

The Path B work in mlir-air migrated transform ops from !pdl.operation to !transform.any_op, requiring explicit result type annotations on transform.air.linalg_promote and transform.air.fuse_into_containing_op in the fallback transform script used by kernels that do not supply AIR_TRANSFORM_TILING_SCRIPT.

triton_shared bumps from c043a85 to e0c5133 (upstream main, "Fix compiler warnings"); the patch is regenerated with no semantic change (line-number drift only in PtrAnalysis.cpp).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
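As a rough illustration of why the fallback transform script matters, here is a minimal sketch of how a driver might choose between a kernel-supplied script and a built-in default. It assumes `AIR_TRANSFORM_TILING_SCRIPT` is an environment variable holding a file path; `select_transform_ir` and `DEFAULT_TRANSFORM_IR` are hypothetical names, and the placeholder string stands in for the real IR in `driver.py`.

```python
import os
from pathlib import Path

# Hypothetical placeholder for the built-in fallback script; the real
# transform IR lives in driver.py and is not reproduced here.
DEFAULT_TRANSFORM_IR = "// fallback transform script (placeholder)"


def select_transform_ir(env=None):
    """Return the kernel-supplied script if one is configured, else the fallback.

    Assumes AIR_TRANSFORM_TILING_SCRIPT, when set, is a path to a file
    containing transform IR.
    """
    env = os.environ if env is None else env
    script_path = env.get("AIR_TRANSFORM_TILING_SCRIPT")
    if script_path:
        return Path(script_path).read_text()
    return DEFAULT_TRANSFORM_IR
```

Under this reading, any kernel that leaves the variable unset goes through the default string, which is why a parse error in the default affects every such kernel.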
Pull request overview
This PR updates pinned upstream dependencies (mlir-air and triton_shared) and adjusts Triton-XDNA’s fallback AIR transform script to remain compatible with recent mlir-air Transform dialect type changes.
Changes:
- Bump `mlir-air` pin to commit `9377b0e` (Jun 3) to pick up compilation-speed fixes and Path B / Stage C changes.
- Bump `third_party/triton_shared` submodule and regenerate `third_party/triton_shared.patch` with line-number drift.
- Update the default transform IR string in `amd_triton_npu/backend/driver.py` by adding explicit type annotations to certain transform ops.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `utils/mlir-air-hash.txt` | Updates the pinned mlir-air commit/timestamp used by setup/install scripts. |
| `third_party/triton_shared.patch` | Refreshes patch metadata to match the updated triton_shared submodule context. |
| `amd_triton_npu/backend/driver.py` | Updates fallback Transform dialect IR to align with mlir-air’s handle type migration. |
The mlir-air 9377b0e wheel pins mlir_aie_no_rtti (the renamed package published to the latest-wheels-no-rtti-2 release channel). The old latest-wheels-no-rtti channel only carries the legacy mlir_aie name and no longer receives new builds, so without the -2 link the install fails with "No matching distribution".

Updates env_setup.sh, env_setup.ps1, build.yml, nightly-wheels.yml, README.md, and pyproject.toml to point at latest-wheels-no-rtti-2. build.yml's pip show / install-dir extraction also switches to the new mlir_aie_no_rtti distribution name.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
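The install-dir lookup that follows a distribution rename can be sketched in Python with `importlib.metadata`. This is only an illustration, not what build.yml does (the workflow uses `pip show`). Trying the legacy `mlir_aie` name as a second candidate is an assumption added here for older environments, and `find_install_dir` is a hypothetical helper.

```python
from importlib import metadata

# Distribution names from the description above: the renamed package
# first, then the legacy name (the legacy fallback is an assumption).
CANDIDATE_NAMES = ("mlir_aie_no_rtti", "mlir_aie")


def find_install_dir(names=CANDIDATE_NAMES):
    """Return (name, base directory) for the first installed candidate, or None.

    The base directory is the location the distribution's files are
    installed relative to (typically site-packages).
    """
    for name in names:
        try:
            dist = metadata.distribution(name)
        except metadata.PackageNotFoundError:
            continue
        return name, str(dist.locate_file(""))
    return None
```

Returning `None` rather than raising lets a caller report which names were tried before failing.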
The bumped triton_shared (e0c5133) has a switch in UseInfo::meetUseType that covers all UseType enum values and returns from each case. MSVC's flow analysis doesn't recognize this as exhaustive and emits C4715 "not all control paths return a value", which CMake's /WX (warnings as errors) on the Windows wheel build promotes to a fatal error. GCC doesn't flag this, so Linux wheels build fine.

Append llvm_unreachable after the switch. Folded into third_party/triton_shared.patch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- Bump `utils/mlir-air-hash.txt` from `dfa6d08` (May 8) to `9377b0e` (Jun 3). Picks up Xilinx/mlir-air#1645 ("reset memtile counter per bucket-shape group"), which restores i8 matmul compilation speed for the Triton-XDNA-generated broadcast pattern. Also pulls in Path B (#1609) and Stage C RFC #1567 changes.
- Bump `third_party/triton_shared` submodule from `c043a85` to `e0c5133` (upstream main, "Fix compiler warnings"). `triton_shared.patch` regenerated with line-number drift only in `PtrAnalysis.cpp`.
- Add explicit result type annotations to the `transform.air.linalg_promote` and `transform.air.fuse_into_containing_op` invocations in `_get_transform_ir_string()` (the fallback transform script in `driver.py`). Required by Path B's migration of transform ops from `!pdl.operation` to `!transform.any_op` (Xilinx/mlir-air@719bc653). Without this, every kernel that doesn't supply its own `AIR_TRANSFORM_TILING_SCRIPT` fails to parse.

Verified
Single 2048×1024×1024 cold compile of `matmul_i8_m128_n64_k64`, compared across commits `dfa6d08`, `21dd121`, and `9377b0e`.

Test plan
- `python scripts/run_tests.py --device aie2p --timeout 600` → 17 pass, 0 timeouts, 1 skip, 1 fail (`matvec`, an unrelated WIP example)
- `matmul_i8_m128_n64_k64` full sweep completes in ~150 s (was timing out)

🤖 Generated with Claude Code