Commit 3062d6f

erwei-xilinx and claude committed
[Path B] Restore baseline 1-shim-per-compute-col placement
CI Triton 8x4 routing failure root cause: Path B's centroid-driven shim placement put 6 shim tiles clustered at cols 0-5, leaving compute cols 6-7 with no nearby shim. mlir-aie's pathfinder then can't find a legal route through the network. Baseline (pre-Xilinx#1605, pre-Path-B) deterministically produced 8 shim cols (one per active compute col) via the same-column heuristic, which routed cleanly.

Fix has three pieces; this commit lands the AIR side and bumps the mlir-aie pin to pull in the third:

1. AIR (this commit): emit shim LTOs as `aie.logical_tile<ShimNOCTile>(compute_col, ?)` whenever the device has a ShimNOC tile at that col. On AIE1 (sparse ShimNOC at cols 2/6/10) the hint stays unset and the placer falls back to centroid placement, preserving existing behavior.

2. AIR (this commit): scope LTO grouping to same-col candidates. Without this, the first shim allocation creates an LTO and all subsequent allocations reuse it regardless of compute col, so the per-col hint is never honored. Now allocations only group onto an LTO whose col hint matches their compute col.

3. mlir-aie #3064 (already merged at 45915e4): extend `findTileWithCapacity` from sweep-right-only to bidirectional sweep. Bumps utils/clone-mlir-aie.sh from b37dc33 to 45915e4 to pick this up.

Verified locally: Path B now produces bit-identical placement to baseline trunk for the failing Triton 8x4 workload: 48 unique tiles, 8 shim cols at 0-7, 8 memtile cols at 0-7, 32 compute cores at rows 2-5. Lit suite: 370/372 pass (only 2 pre-existing AIRToROCDL failures unrelated to Path B).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
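The difference between a sweep-right-only search and a bidirectional one can be sketched as below. This is an illustration only, not the mlir-aie implementation of `findTileWithCapacity`: the function name `findNearestColWithCapacity`, the `freeChannels` vector, and the tie-break (right before left at equal distance) are all assumptions made for the example.

```cpp
#include <optional>
#include <vector>

// Hypothetical stand-in for the placer's search; NOT mlir-aie's actual code.
// freeChannels[c] is the number of unused shim DMA channels in column c.
// Returns the column closest to startCol that still has capacity, looking
// both right and left. At equal distance the right-hand column wins (an
// assumption of this sketch).
std::optional<int>
findNearestColWithCapacity(const std::vector<int> &freeChannels,
                           int startCol) {
  const int n = static_cast<int>(freeChannels.size());
  if (startCol < 0 || startCol >= n)
    return std::nullopt;
  for (int dist = 0; dist < n; ++dist) {
    const int right = startCol + dist;
    if (right < n && freeChannels[right] > 0)
      return right;
    const int left = startCol - dist;
    if (dist > 0 && left >= 0 && freeChannels[left] > 0)
      return left;
  }
  return std::nullopt; // no column has capacity
}
```

With capacity only in cols 0-5 and a request starting at col 7, a right-only sweep finds nothing, while this search returns col 5.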
1 parent cfe6e63 commit 3062d6f

2 files changed

Lines changed: 23 additions & 4 deletions

mlir/lib/Conversion/AIRToAIESchedulingUtils.cpp

Lines changed: 21 additions & 2 deletions
@@ -1068,13 +1068,27 @@ air::ShimDMAAllocator::allocNewDmaChannel(air::MemcpyInterface &memcpyOp,
         return c;
     return -1;
   };
+  // Only reuse an existing LTO if its col hint matches `col` (the
+  // compute-side column). This preserves baseline's "1 shim per active
+  // compute col" placement under the LTO model: each compute col gets
+  // its own shim LTO (with `(col, ?)` hint), so the placer + bidirectional
+  // sweep (mlir-aie #3064) can spread shims under each compute col rather
+  // than clustering near the centroid.
   for (auto *side : {&mm2s_allocs, &s2mm_allocs}) {
     for (auto &t : *side) {
       auto cand = dyn_cast<AIE::LogicalTileOp>(t.dma_tile.getOperation());
       if (!cand)
         continue;
       if (cand.getTileType() != AIE::AIETileType::ShimNOCTile)
         continue;
+      auto candCol = cand.getCol();
+      if (col >= 0) {
+        if (!candCol || (int)*candCol != col)
+          continue;
+      } else {
+        if (candCol)
+          continue;
+      }
       int c = pickChannelForLTO(cand);
       if (c < 0)
         continue;
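The reuse rule added in this hunk reduces to a small predicate, sketched here in isolation. The helper name `colHintMatches` and the use of `std::optional<int>` for the hint are assumptions for illustration; the commit itself inlines the check on `cand.getCol()`.

```cpp
#include <optional>

// Hypothetical helper mirroring the added check (not a function in AIR).
// candCol: column hint on an existing shim LTO, empty when the hint is unset.
// col:     compute-side column of the new allocation, negative when unknown.
// An existing LTO is a reuse candidate only if both sides agree: either the
// hint equals the compute column, or neither side carries a column.
bool colHintMatches(std::optional<int> candCol, int col) {
  if (col >= 0)
    return candCol.has_value() && *candCol == col;
  return !candCol.has_value();
}
```

Under this rule an allocation for compute col 3 skips an LTO hinted at col 2 and also skips an unhinted LTO, so it either lands on the col-3 LTO or triggers creation of a new one.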
@@ -1094,9 +1108,14 @@ air::ShimDMAAllocator::allocNewDmaChannel(air::MemcpyInterface &memcpyOp,
       else
         break;
     }
+    auto *ctx = b.getContext();
+    const auto &tm = device.getTargetModel();
+    IntegerAttr colAttr =
+        (col >= 0 && col < tm.columns() && tm.isShimNOCTile(col, 0))
+            ? IntegerAttr::get(IntegerType::get(ctx, 32), col)
+            : IntegerAttr();
     tileLT = AIE::LogicalTileOp::create(b, device.getLoc(),
-                                        AIE::AIETileType::ShimNOCTile,
-                                        /*col=*/IntegerAttr(),
+                                        AIE::AIETileType::ShimNOCTile, colAttr,
                                         /*row=*/IntegerAttr(),
                                         /*allocation_scheme=*/StringAttr());
     dma_channel = 0;
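The hint selection in this hunk can be exercised without MLIR using a toy target model. `ToyTargetModel` and `shimColHint` below are hypothetical names for this sketch; only the guard `col >= 0 && col < columns() && isShimNOCTile(col, 0)` is taken from the diff, and the 12-column sparse device in the usage note is an assumed layout built around the cols 2/6/10 mentioned in the commit message.

```cpp
#include <optional>
#include <set>

// Toy target model, an assumption for illustration only. It is NOT
// AIE::AIETargetModel; it just exposes the two queries the diff relies on.
struct ToyTargetModel {
  int numColumns;
  std::set<int> shimNocCols; // columns that have a ShimNOC tile in row 0
  int columns() const { return numColumns; }
  bool isShimNOCTile(int col, int row) const {
    return row == 0 && shimNocCols.count(col) > 0;
  }
};

// Hypothetical helper: returns the column hint for a new shim LTO, or an
// empty optional when the hint should stay unset (column unknown, out of
// range, or without a ShimNOC tile).
std::optional<int> shimColHint(const ToyTargetModel &tm, int col) {
  if (col >= 0 && col < tm.columns() && tm.isShimNOCTile(col, 0))
    return col;
  return std::nullopt;
}
```

On a dense device such as `ToyTargetModel{8, {0, 1, 2, 3, 4, 5, 6, 7}}` every compute column gets its own hint. On a sparse device such as `ToyTargetModel{12, {2, 6, 10}}`, `shimColHint(tm, 3)` is empty and placement is left to the placer.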

utils/clone-mlir-aie.sh

Lines changed: 2 additions & 2 deletions
@@ -14,8 +14,8 @@
 #
 ##===----------------------------------------------------------------------===##
 
-export HASH=b37dc33d41511684fd4eef1b8ac2e3f74fd5f169
-DATETIME=2026050821
+export HASH=45915e410804c1859f7fffa3a3369485970577e8
+DATETIME=2026051117
 WHEEL_VERSION=0.0.1.$DATETIME+${HASH:0:7}
 
 if [ x"$1" == x--get-wheel-version ]; then
