Bump mlir-air to 9377b0e and triton_shared to e0c5133 #67

Merged
erwei-xilinx merged 3 commits into amd:main from erwei-xilinx:bump-mlir-air-9377b0e
Jun 3, 2026

@erwei-xilinx

Collaborator

Summary

  • Bump utils/mlir-air-hash.txt from dfa6d08 (May 8) to 9377b0e (Jun 3). Picks up Xilinx/mlir-air#1645 ("reset memtile counter per bucket-shape group"), which restores i8 matmul compilation speed for the Triton-XDNA-generated broadcast pattern. Also pulls in Path B (#1609) and Stage C RFC #1567 changes.
  • Bump third_party/triton_shared submodule from c043a85 to e0c5133 (upstream main, "Fix compiler warnings"). triton_shared.patch regenerated with line-number drift only in PtrAnalysis.cpp.
  • Add explicit result type annotations to transform.air.linalg_promote and transform.air.fuse_into_containing_op invocations in _get_transform_ir_string() (the fallback transform script in driver.py). Required by Path B's migration of transform ops from !pdl.operation to !transform.any_op (Xilinx/mlir-air@719bc653). Without this, every kernel that doesn't supply its own AIR_TRANSFORM_TILING_SCRIPT fails to parse.
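
The fallback behaviour described in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the actual driver.py code: only the names AIR_TRANSFORM_TILING_SCRIPT and _get_transform_ir_string() come from this PR. The assumption that the variable holds a file path, the helper name `select_transform_script`, and the placeholder string are all hypothetical.

```python
import os

# Placeholder standing in for the string returned by _get_transform_ir_string().
# The real fallback transform IR lives in amd_triton_npu/backend/driver.py.
FALLBACK_TRANSFORM_IR = "// fallback transform script (placeholder)"


def select_transform_script(env=None):
    """Return (source, text) for the transform script a kernel would use.

    Assumption: AIR_TRANSFORM_TILING_SCRIPT, when set, is a path to a file.
    """
    env = os.environ if env is None else env
    path = env.get("AIR_TRANSFORM_TILING_SCRIPT")
    if path:
        # Kernel supplies its own script, so the fallback is never parsed.
        with open(path, encoding="utf-8") as f:
            return "user", f.read()
    # No user script: the fallback is parsed, so it must carry the
    # explicit result type annotations described above.
    return "fallback", FALLBACK_TRANSFORM_IR
```

Under this reading, only the second branch is affected by the type migration, which is why kernels shipping their own script were not broken by the bump.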

Verified

Single 2048×1024×1024 cold compile of matmul_i8_m128_n64_k64:

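|                     | Before (dfa6d08) | Bump w/o #1645 (21dd121) | This PR (9377b0e) |
|---------------------|------------------|--------------------------|-------------------|
| cold compile        | 9.5 s            | 145 s                    | 8.9 s             |
| 36-invocation sweep | passes           | timeout @ 1200 s         | 152 s             |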
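
For a sense of scale, the figures above work out to the ratios below. This is plain arithmetic on the reported numbers. The variable names are illustrative, and the per-invocation figure is only an average over the sweep, not a measured value.

```python
# Times in seconds, copied from the measurements reported above.
before_s = 9.5       # cold compile at dfa6d08
regressed_s = 145.0  # cold compile at 21dd121 (bump without #1645)
this_pr_s = 8.9      # cold compile at 9377b0e
sweep_s = 152.0      # full 36-invocation sweep at 9377b0e
invocations = 36

slowdown = regressed_s / before_s       # cost of bumping without #1645
recovery = regressed_s / this_pr_s      # gain from picking up #1645
per_invocation = sweep_s / invocations  # average time per sweep invocation

print(f"regression: {slowdown:.1f}x slower")           # regression: 15.3x slower
print(f"fix: {recovery:.1f}x faster")                  # fix: 16.3x faster
print(f"sweep average: {per_invocation:.1f} s/invocation")  # sweep average: 4.2 s/invocation
```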

Test plan

  • python scripts/run_tests.py --device aie2p --timeout 600 → 17 pass, 0 timeouts, 1 skip, 1 fail (matvec — unrelated WIP example)
  • matmul_i8_m128_n64_k64 full sweep completes in ~150 s (was timing out)
  • CI on hardware runner
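
The pass / fail / timeout classification used in the test plan can be approximated with a small wrapper like the one below. This is a sketch under assumptions, not the logic of scripts/run_tests.py: the helper name `run_with_timeout` and the three status strings are hypothetical.

```python
import subprocess
import sys
import time


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (status, elapsed_seconds).

    status is "pass" (exit code 0), "fail" (non-zero exit code),
    or "timeout" (the process exceeded timeout_s and was killed).
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "timeout", time.monotonic() - start
    status = "pass" if proc.returncode == 0 else "fail"
    return status, time.monotonic() - start


if __name__ == "__main__":
    # Trivial command used here so the sketch runs anywhere.
    status, elapsed = run_with_timeout([sys.executable, "-c", "pass"], timeout_s=600)
    print(status, f"{elapsed:.2f} s")
```

Recording elapsed time alongside the status makes a regression like the 9.5 s to 145 s jump visible before it turns into an outright timeout.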

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings June 3, 2026 04:17
Picks up Xilinx/mlir-air#1645 ("reset memtile counter per bucket-shape
group"), which restores i8 matmul compilation speed for the
Triton-XDNA-generated broadcast pattern. matmul_i8_m128_n64_k64 cold
compile drops from ~145s back to ~9s, full sweep from a 1200s timeout
back to ~150s. Also pulls in the Path B follow-up (#1609) and Stage C
RFC #1567 changes.

The Path B work in mlir-air migrated transform ops from !pdl.operation
to !transform.any_op, requiring explicit result type annotations on
transform.air.linalg_promote and transform.air.fuse_into_containing_op
in the fallback transform script used by kernels that do not supply
AIR_TRANSFORM_TILING_SCRIPT.

triton_shared bumps from c043a85 to e0c5133 (upstream main, "Fix
compiler warnings"); the patch is regenerated with no semantic change
(line-number drift only in PtrAnalysis.cpp).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Contributor

Pull request overview

This PR updates pinned upstream dependencies (mlir-air and triton_shared) and adjusts Triton-XDNA’s fallback AIR transform script to remain compatible with recent mlir-air Transform dialect type changes.

Changes:

  • Bump mlir-air pin to commit 9377b0e (Jun 3) to pick up compilation-speed fixes and Path B / Stage C changes.
  • Bump third_party/triton_shared submodule and regenerate third_party/triton_shared.patch with line-number drift.
  • Update the default transform IR string in amd_triton_npu/backend/driver.py by adding explicit type annotations to certain transform ops.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
utils/mlir-air-hash.txt Updates the pinned mlir-air commit/timestamp used by setup/install scripts.
third_party/triton_shared.patch Refreshes patch metadata to match the updated triton_shared submodule context.
amd_triton_npu/backend/driver.py Updates fallback Transform dialect IR to align with mlir-air’s handle type migration.

Comment thread amd_triton_npu/backend/driver.py
Comment thread amd_triton_npu/backend/driver.py
Comment thread amd_triton_npu/backend/driver.py
erwei-xilinx and others added 2 commits June 2, 2026 21:31
The mlir-air 9377b0e wheel pins mlir_aie_no_rtti (the renamed package
published to the latest-wheels-no-rtti-2 release channel). The old
latest-wheels-no-rtti channel only carries the legacy mlir_aie name and
no longer receives new builds, so installation fails with "No matching
distribution" unless the -2 link is used.

Updates env_setup.sh, env_setup.ps1, build.yml, nightly-wheels.yml,
README.md, and pyproject.toml to point at latest-wheels-no-rtti-2.
build.yml's pip show / install-dir extraction also switches to the new
mlir_aie_no_rtti distribution name.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
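
To check which of the two distribution names is present in an environment after the rename, something like the following may help. It is a minimal sketch using the standard-library importlib.metadata. The helper name `installed_mlir_aie_distribution` is hypothetical, and the candidate names are taken from the commit message above.

```python
from importlib import metadata


def installed_mlir_aie_distribution(candidates=("mlir_aie_no_rtti", "mlir_aie")):
    """Return (name, version) of the first installed candidate, or None.

    The new distribution name is tried first, then the legacy one.
    """
    for name in candidates:
        try:
            return name, metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None


if __name__ == "__main__":
    found = installed_mlir_aie_distribution()
    if found is None:
        print("neither mlir_aie_no_rtti nor mlir_aie is installed")
    else:
        print(f"{found[0]} {found[1]}")
```
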
The bumped triton_shared (e0c5133) has a switch in UseInfo::meetUseType
that covers all UseType enum values and returns from each case. MSVC's
flow analysis doesn't recognize this as exhaustive and emits C4715
"not all control paths return a value", which CMake's /WX (warnings as
errors) on the Windows wheel build promotes to a fatal error. GCC
accepts the code as written, so Linux wheels build fine.

Append llvm_unreachable after the switch. Folded into
third_party/triton_shared.patch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@erwei-xilinx erwei-xilinx merged commit da45abd into amd:main Jun 3, 2026
12 of 13 checks passed
@erwei-xilinx erwei-xilinx deleted the bump-mlir-air-9377b0e branch June 3, 2026 16:06