Commit cbda1ab

erwei-xilinx and claude committed
[Path B] Restore baseline 1-shim-per-compute-col placement
CI Triton 8x4 routing failure root cause: Path B's centroid-driven shim placement put 6 shim tiles clustered at cols 0-5, leaving compute cols 6-7 with no nearby shim. mlir-aie's pathfinder then can't find a legal route through the network. Baseline (pre-Xilinx#1605, pre-Path-B) deterministically produced 8 shim cols (one per active compute col) via the same-column heuristic, which routed cleanly.

Fix has three pieces; this commit lands the AIR side and bumps the mlir-aie pin to pull in the third:

1. AIR (this commit): emit shim LTOs as `aie.logical_tile<ShimNOCTile>(compute_col, ?)` whenever the device has a ShimNOC tile at that col. On AIE1 (sparse ShimNOC at cols 2/6/10) the hint stays unset and the placer falls back to centroid placement, preserving existing behavior.

2. AIR (this commit): scope LTO grouping to same-col candidates. Without this, the first shim allocation creates an LTO and all subsequent allocations reuse it regardless of compute col, so the per-col hint is never honored. Now allocations only group onto an LTO whose col hint matches their compute col.

3. mlir-aie #3064 (already merged at 45915e4): extend `findTileWithCapacity` from sweep-right-only to bidirectional sweep. Bumps utils/clone-mlir-aie.sh from b37dc33 to 45915e4 to pick this up.

Verified locally: Path B now produces bit-identical placement to baseline trunk for the failing Triton 8x4 workload: 48 unique tiles, 8 shim cols at 0-7, 8 memtile cols at 0-7, 32 compute cores at rows 2-5. Lit suite: 370/372 pass (only 2 pre-existing AIRToROCDL failures unrelated to Path B).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 899d66b commit cbda1ab

2 files changed

Lines changed: 24 additions & 5 deletions

mlir/lib/Conversion/AIRToAIESchedulingUtils.cpp

Lines changed: 21 additions & 2 deletions
@@ -1065,13 +1065,27 @@ air::ShimDMAAllocator::allocNewDmaChannel(air::MemcpyInterface &memcpyOp,
           return c;
       return -1;
     };
+    // Only reuse an existing LTO if its col hint matches `col` (the
+    // compute-side column). This preserves baseline's "1 shim per active
+    // compute col" placement under the LTO model: each compute col gets
+    // its own shim LTO (with `(col, ?)` hint), so the placer + bidirectional
+    // sweep (mlir-aie #3064) can spread shims under each compute col rather
+    // than clustering near the centroid.
     for (auto *side : {&mm2s_allocs, &s2mm_allocs}) {
       for (auto &t : *side) {
         auto cand = dyn_cast<AIE::LogicalTileOp>(t.dma_tile.getOperation());
         if (!cand)
           continue;
         if (cand.getTileType() != AIE::AIETileType::ShimNOCTile)
           continue;
+        auto candCol = cand.getCol();
+        if (col >= 0) {
+          if (!candCol || (int)*candCol != col)
+            continue;
+        } else {
+          if (candCol)
+            continue;
+        }
         int c = pickChannelForLTO(cand);
         if (c < 0)
           continue;
@@ -1091,9 +1105,14 @@ air::ShimDMAAllocator::allocNewDmaChannel(air::MemcpyInterface &memcpyOp,
         else
           break;
       }
+      auto *ctx = b.getContext();
+      const auto &tm = device.getTargetModel();
+      IntegerAttr colAttr =
+          (col >= 0 && col < tm.columns() && tm.isShimNOCTile(col, 0))
+              ? IntegerAttr::get(IntegerType::get(ctx, 32), col)
+              : IntegerAttr();
       tileLT = AIE::LogicalTileOp::create(b, device.getLoc(),
-                                          AIE::AIETileType::ShimNOCTile,
-                                          /*col=*/IntegerAttr(),
+                                          AIE::AIETileType::ShimNOCTile, colAttr,
                                           /*row=*/IntegerAttr(),
                                           /*allocation_scheme=*/StringAttr());
       dma_channel = 0;

utils/clone-mlir-aie.sh

Lines changed: 3 additions & 3 deletions
@@ -14,16 +14,16 @@
 #
 ##===----------------------------------------------------------------------===##

-export HASH=886d9325f1b087d2c1180aece51d53384b698a46
-DATETIME=2026052005
+export HASH=45915e410804c1859f7fffa3a3369485970577e8
+DATETIME=2026051117
 WHEEL_VERSION=0.0.1.$DATETIME+${HASH:0:7}

 if [ x"$1" == x--get-wheel-version ]; then
   echo $WHEEL_VERSION
   exit 0
 fi

-MLIR_PYTHON_EXTRAS_SHORTHASH=a736a7d
+MLIR_PYTHON_EXTRAS_SHORTHASH=a6ab724

 if [ x"$1" == x--get-mlir-python-extras-version ]; then
   echo $MLIR_PYTHON_EXTRAS_SHORTHASH