Open
39 commits
44bd2ad
[Path B 1/7] Switch allocation_info_t + DMAAllocator base API to Tile…
erwei-xilinx May 9, 2026
7d4ef93
[Path B 2/7] Make allocateLockOp + ShimDMAAllocator::getBuffer TileLi…
erwei-xilinx May 9, 2026
5dc1c18
[Path B 3/7] AIRMergeUnrolledDevices: merge LogicalTileOps too
erwei-xilinx May 9, 2026
1e6a826
[Path B 4/7] AIR emits logical shim/memtiles end-to-end
erwei-xilinx May 9, 2026
ead2782
[Path B 5/7] aircc: invoke aie-place-tiles after air-merge-unrolled-d…
erwei-xilinx May 9, 2026
06dc5d2
[Path B 6/7] Lit test migration: chain --aie-place-tiles in RUN lines
erwei-xilinx May 9, 2026
509a15b
[Path B 7/7] Lit test migration: CHECK-DAG for tile/buffer/lock listings
erwei-xilinx May 9, 2026
82cf89d
[Path B] clang-format-17 fixes from CI
erwei-xilinx May 11, 2026
1738dac
[Path B] Group shim DMAs onto same LTO; reserve lock IDs across LTO c…
erwei-xilinx May 11, 2026
a6d5f06
[Path B] XFAIL the 13 AIRToAIE tests pending Path B CHECK migration
erwei-xilinx May 11, 2026
643e7f2
[Path B] Hint shim col + add aie-place-tiles to xrt/05_extern_func
erwei-xilinx May 11, 2026
899d66b
[Path B] Revert shim col-hint — broke wider NPU1 capacity check
erwei-xilinx May 11, 2026
cbda1ab
[Path B] Restore baseline 1-shim-per-compute-col placement
erwei-xilinx May 11, 2026
83c5cc5
[Path B] Bump mlir-aie pin to 8125c33 (latest wheel)
erwei-xilinx May 11, 2026
b7b809b
Revert "[Path B] XFAIL the 13 AIRToAIE tests pending Path B CHECK mig…
erwei-xilinx May 12, 2026
49b7d60
[Path B] Migrate 11 AIRToAIE lit CHECKs to placer-driven output
erwei-xilinx May 12, 2026
a7d6fad
[Path B] AIRToAIE tests: check LTO output, not placer output
erwei-xilinx May 12, 2026
d682b08
[Path B] AIRToAIE tests: drop --aie-place-tiles, check LTO output
erwei-xilinx May 12, 2026
260673c
[Path B] Fix objfifo dominance bug: hoist tile-likes before objfifo
erwei-xilinx May 12, 2026
745571f
[Path B] objfifo: stop resolving shim LTOs in AIR; defer to aie-place…
erwei-xilinx May 12, 2026
49d9559
[Path B] aircc: drop place-tiles from aieModule; only place on npuModule
erwei-xilinx May 12, 2026
4659271
[Path B] Place once, in aiecc only — make airrt-to-npu LTO-aware
erwei-xilinx May 12, 2026
25f46b4
[Path B] ShimDMAAllocator: restore pre-Path-B (col, channel) rotation
erwei-xilinx May 12, 2026
ac1b8b5
[Path B] ShimDMAAllocator: scope packet-flow reuse to same-col LTOs
erwei-xilinx May 12, 2026
fb90106
[Path B] allocateLockOp: scope ID reservation to same-col LTOs
erwei-xilinx May 12, 2026
7b46620
[Path B] AIR emits unhinted LTOs; defer placement to aie-place-tiles
erwei-xilinx May 12, 2026
e6a6b26
[Path B] Re-bump mlir-aie pin to 886d932 (includes mlir-aie #3068)
erwei-xilinx May 21, 2026
3e4242f
[Path B] AIRToAIE tests: migrate 17 CHECK drifters to placer-driven LTOs
erwei-xilinx May 21, 2026
368a233
[Path B] ShimDMAAllocator: bucket by far-side LTO when col is unknown
erwei-xilinx May 21, 2026
8c38255
[Path B] ShimDMAAllocator: spread L3-direct broadcasts across shim LTOs
erwei-xilinx May 21, 2026
de2c35a
[Path B] ShimDMAAllocator: order shim LTOs by their target memtile
erwei-xilinx May 21, 2026
7a856c6
[Path B] L3 shim allocation: process flows in rigidity-decreasing order
erwei-xilinx May 21, 2026
4c62855
[Path B] allocation_info_t: add getDmaTileOp/getDmaTileValue accessors
erwei-xilinx May 21, 2026
ce38924
Revert "[Path B] allocation_info_t: add getDmaTileOp/getDmaTileValue …
erwei-xilinx May 21, 2026
29d9e2e
[Path B] ShimDMAAllocator: extract collectDmaIds() helper
erwei-xilinx May 22, 2026
a18adb2
[Path B] AIRRtToNpuPass: extract isShimTileValue() helper
erwei-xilinx May 22, 2026
b04a71a
[Path B] ShimDMAAllocator: use llvm::concat over (mm2s, s2mm) allocs
erwei-xilinx May 22, 2026
b7cbcd9
[Path B] Use Block::getOps<OpT>() for op-type-filtered walks
erwei-xilinx May 22, 2026
9106c49
[Path B] collectDmaIds: use llvm::map_range over manual loop
erwei-xilinx May 22, 2026
93 changes: 51 additions & 42 deletions mlir/include/air/Conversion/AIRToAIESchedulingUtils.h
@@ -28,31 +28,7 @@ AIE::TileOp getPhysTileOpOrNull(AIE::DeviceOp aie_device, int col, int row);
// get tileop using physical coordinates
AIE::TileOp getPhysTileOp(AIE::DeviceOp aie_device, int col, int row);

// Materialize a physical aie.tile by emitting an aie.logical_tile<tileType>
// with the given hints (use std::nullopt for "?"), running mlir-aie's
// SequentialPlacer, and resolving the result through getPhysTileOp. On
// placement failure, emits a diagnostic on `aie_device` and returns failure.
//
// Caller must NOT be inside a greedy PatternRewriter callback; this helper
// uses plain OpBuilder + replaceAllUsesWith/erase, which would invalidate
// a greedy worklist's cached use-def edges (see RFC #1567 milestone 2).
mlir::FailureOr<AIE::TileOp> createTileViaPlacer(AIE::DeviceOp aie_device,
AIE::AIETileType tileType,
std::optional<int> col_hint,
std::optional<int> row_hint);

// Batched variant: emits N aie.logical_tile<tileType> ops (one per hint),
// runs the placer ONCE, and resolves each into a physical aie.tile. The
// returned vector parallels `hints`. Use this when multiple unconstrained
// or partially-constrained logical tiles must be placed together — e.g.,
// a herd of cores all asking (col, ?), which a per-tile placer would all
// map to the same row because state doesn't persist across place() calls.
mlir::LogicalResult createTilesViaPlacer(
AIE::DeviceOp aie_device, AIE::AIETileType tileType,
llvm::ArrayRef<std::pair<std::optional<int>, std::optional<int>>> hints,
llvm::SmallVectorImpl<AIE::TileOp> &outTiles);

AIE::LockOp allocateLockOp(AIE::DeviceOp aie_device, AIE::TileOp tile,
AIE::LockOp allocateLockOp(AIE::DeviceOp aie_device, AIE::TileLike tile,
int init = 0, int id = -1,
StringAttr name = nullptr);

@@ -91,32 +67,55 @@ getLockValuePair(const AIE::AIETargetModel &targetModel, Value buffer_memref,
air::ChannelOp air_chan);

struct allocation_info_t {
AIE::TileOp dma_tile = nullptr;
// dma_tile is the SSA value of the (logical or physical) AIE tile that owns
// this DMA allocation. Stored as TileLike (op interface) so it works for
// both AIE::TileOp (post-placement) and AIE::LogicalTileOp (pre-placement).
// Pointer-equality on the underlying Operation* gives the same answer as
// (col, row) integer comparison without dependence on physical placement.
AIE::TileLike dma_tile = nullptr;
int64_t col = -1;
int64_t row = -1;
AIE::DMAChannel dma_channel = {AIE::DMAChannelDir::MM2S, -1};
int64_t tile_channel = -1;
int packet_flow_id = -1; // Packet flow ID assigned during flow creation
// The other-side LTO (Operation*) of the flow this allocation belongs to.
// For a shim allocation, this is the memtile (or compute-core) LTO at the
// far end of the flow; for tile/memtile allocations it is unused. Used as
// the shim DMA bucket key so that one shim LTO never bundles flows whose
// far-side LTOs differ — keying on TileLike Operation* identity is lossless
// even when the far-side LTO is unplaced and its col is unknown (Path B,
// RFC #1567). Pre-Path-B the bucket keyed on `col`, which was a lossless
// proxy because each LTO had a unique col; with unhinted LTOs every flow
// collapsed to col=-1 and one shim LTO swallowed every memtile-side flow.
Operation *otherSideLTO = nullptr;
std::vector<int32_t> dma_id;
std::vector<Operation *> memcpyOps;
bool valid();
AIE::TileOp getDmaTile();
bool foundAlloc(AIE::TileOp tile);
bool foundAlloc(AIE::TileOp tile, air::MemcpyInterface memcpyOp);
bool foundAlloc(AIE::TileOp tile, air::ChannelOp channel_op);
bool foundAlloc(AIE::TileOp tile, AIE::DMAChannel channel);
bool foundPacketFlowAllocInTile(AIE::TileOp tile);
AIE::TileLike getDmaTile();
bool foundAlloc(AIE::TileLike tile);
bool foundAlloc(AIE::TileLike tile, air::MemcpyInterface memcpyOp);
bool foundAlloc(AIE::TileLike tile, air::ChannelOp channel_op);
bool foundAlloc(AIE::TileLike tile, AIE::DMAChannel channel);
bool foundPacketFlowAllocInTile(AIE::TileLike tile);

bool foundAlloc(air::ChannelOp channel_op);
bool foundAlloc(AIE::DMAChannel channel);

// Column-keyed; row is implied (shim is always row 0).
// Column-keyed; row is implied (shim is always row 0). Returns false for
// unplaced tiles (tryGetCol() == nullopt) — column-keyed lookups are only
// meaningful when the tile has a known column.
bool foundAllocInColumn(int32_t col);
bool foundAllocInColumn(int32_t col, AIE::DMAChannel channel);
bool foundPacketFlowAllocInColumn(int32_t col);

bool operator==(const allocation_info_t &other) const {
return dma_tile == other.dma_tile && col == other.col && row == other.row &&
// op interface getOperation() isn't const-qualified; cast away the
// top-level const for the pointer-equality comparison.
auto thisOp =
const_cast<allocation_info_t *>(this)->dma_tile.getOperation();
auto otherOp =
const_cast<allocation_info_t &>(other).dma_tile.getOperation();
return thisOp == otherOp && col == other.col && row == other.row &&
dma_channel == other.dma_channel &&
tile_channel == other.tile_channel;
}
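The `otherSideLTO` comment above argues that keying shim buckets on column collapses every unhinted tile into one bucket, while keying on `Operation*` identity keeps them apart. A minimal standalone sketch of that difference, assuming a hypothetical `FakeOp` stand-in for an MLIR operation and a hypothetical `Flow` record (neither is part of AIR or mlir-aie):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for an MLIR Operation; only its address and an
// optional column matter here. col == -1 models an unplaced, unhinted tile.
struct FakeOp {
  int64_t col;
};

// Hypothetical flow record: just remembers the tile at the far end.
struct Flow {
  const FakeOp *farSide;
};

// Column-keyed bucketing: every flow whose far side has the same column
// (including the "unknown" column -1) lands in the same bucket.
inline std::size_t countBucketsByCol(const std::vector<Flow> &flows) {
  std::map<int64_t, int> buckets;
  for (const Flow &f : flows)
    ++buckets[f.farSide->col];
  return buckets.size();
}

// Identity-keyed bucketing: the pointer value is the key, so two distinct
// far-side tiles never share a bucket, placed or not.
inline std::size_t countBucketsByIdentity(const std::vector<Flow> &flows) {
  std::map<const FakeOp *, int> buckets;
  for (const Flow &f : flows)
    ++buckets[f.farSide];
  return buckets.size();
}
```

With two unplaced far-side tiles, the column key sees a single bucket and the identity key sees two, which is the collapse the comment describes.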
@@ -154,13 +153,13 @@ class DMAAllocator {
: device(device), dmaMemorySpace(dmaMemorySpace) {}

FailureOr<allocation_info_t>
lookupDMAAllocation(AIE::TileOp tile, air::MemcpyInterface &memcpyOp);
lookupDMAAllocation(AIE::TileLike tile, air::MemcpyInterface &memcpyOp);
FailureOr<std::pair<AIE::LockOp, AIE::LockOp>>
getLockForDMA(air::MemcpyInterface &memcpyOp, AIE::TileOp tile,
getLockForDMA(air::MemcpyInterface &memcpyOp, AIE::TileLike tile,
Operation *bufferOp, bool lockRaceConditionFix = false);
FailureOr<allocation_info_t>
allocNewDmaChannel(air::MemcpyInterface &memcpyOp, AIE::TileOp tile, int chan,
int col, int row, std::vector<int> dma_id);
allocNewDmaChannel(air::MemcpyInterface &memcpyOp, AIE::TileLike tile,
int chan, int col, int row, std::vector<int> dma_id);
void sortMemcpyOps(std::vector<Operation *> dma_memcpy_ops);

protected:
@@ -194,15 +193,25 @@ class TileDMAAllocator : public DMAAllocator {
class ShimDMAAllocator : public DMAAllocator {

public:
std::vector<int> dma_columns;
// Per-shim DMA channel count (2 MM2S + 2 S2MM on all current targets).
// Caps how many channels AIR may pack onto one shim LTO before opening
// a new LTO; aie-place-tiles (with merge-ltos=false) then maps each LTO
// to its own physical shim col.
int shim_dma_channels;

ShimDMAAllocator(AIE::DeviceOp device);

// Allocate a new shim DMA channel. The shim tile is emitted as an
// unconstrained aie.logical_tile<ShimNOCTile>(?, ?). aie-place-tiles
// assigns the physical column from flow adjacency to placed core peers.
// `otherSide` is the LTO (or physical tile) at the OTHER end of the flow
// (memtile or core); its Operation* identity is the bucket key used to
// group shim allocations so flows targeting distinct far-side LTOs land
// on distinct shim LTOs. col/row are kept for airrt metadata only and
// may be -1 when otherSide is an unhinted LTO.
FailureOr<allocation_info_t>
allocNewDmaChannel(air::MemcpyInterface &memcpyOp, int col, int row,
std::vector<Operation *> &dma_ops,
std::string colAllocConstraint = "same_column");
allocNewDmaChannel(air::MemcpyInterface &memcpyOp, AIE::TileLike otherSide,
int col, int row, std::vector<Operation *> &dma_ops);

FailureOr<allocation_info_t>
allocNewDmaChannel(air::MemcpyInterface &memcpyOp,
Expand Down
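The `shim_dma_channels` and `otherSide` comments in the header above describe two constraints on shim allocation: a per-shim channel cap, and one far-side tile per shim LTO. The following is a rough standalone sketch of how such packing could behave. `ShimPacker`, `ShimSlot`, and `Assignment` are hypothetical names, and the first-fit policy is an assumption, not the allocator's actual algorithm:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Dir { MM2S = 0, S2MM = 1 };

// Hypothetical model of one shim logical tile: it serves exactly one
// far-side tile (identified by address) and tracks channels per direction.
struct ShimSlot {
  explicit ShimSlot(const void *key) : farSideKey(key) {}
  const void *farSideKey;
  int used[2] = {0, 0};
};

struct Assignment {
  std::size_t shimIndex; // which shim slot the flow landed on
  int channel;           // channel index within that direction
};

// First-fit packer (an assumed policy): reuse a slot only if it has the same
// far-side key and a free channel in the requested direction.
class ShimPacker {
public:
  explicit ShimPacker(int channelsPerDir) : channelsPerDir(channelsPerDir) {}

  Assignment assign(const void *farSideKey, Dir dir) {
    const int d = static_cast<int>(dir);
    for (std::size_t i = 0; i < shims.size(); ++i) {
      ShimSlot &s = shims[i];
      if (s.farSideKey == farSideKey && s.used[d] < channelsPerDir)
        return Assignment{i, s.used[d]++};
    }
    // No compatible slot: open a new shim slot for this far-side key.
    shims.push_back(ShimSlot(farSideKey));
    return Assignment{shims.size() - 1, shims.back().used[d]++};
  }

  std::size_t numShims() const { return shims.size(); }

private:
  int channelsPerDir;
  std::vector<ShimSlot> shims;
};
```

With `channelsPerDir = 2` (the 2 MM2S + 2 S2MM figure quoted in the header comment), a third MM2S flow to the same far-side tile opens a second slot, and a flow to a different far-side tile always gets its own slot.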
30 changes: 20 additions & 10 deletions mlir/lib/CMakeLists.txt
@@ -13,6 +13,25 @@ add_subdirectory(Util)
get_property(dialect_libs GLOBAL PROPERTY MLIR_DIALECT_LIBS)
get_property(conversion_libs GLOBAL PROPERTY MLIR_CONVERSION_LIBS)

set(_air_initall_link_libs
AIRConversionPasses
AIRTransformPasses
AIRTransformOps
AIRDialect
AIRRtDialect
AIRUtil
AIRInterface
MLIRSupport
${conversion_libs}
${dialect_libs})

if(AIR_ENABLE_AIE)
# AIETransforms exposes registerAIEPasses() — wired into
# registerAllPasses() so air-opt and aircc can invoke aie-place-tiles
# on the LogicalTileOps emitted by AIR's lowering.
list(APPEND _air_initall_link_libs AIETransforms)
endif()

add_mlir_library(
AIRInitAll
InitAll.cpp
@@ -26,13 +45,4 @@ add_mlir_library(
AIRInterface

LINK_LIBS
AIRConversionPasses
AIRTransformPasses
AIRTransformOps
AIRDialect
AIRRtDialect
AIRUtil
AIRInterface
MLIRSupport
${conversion_libs}
${dialect_libs})
${_air_initall_link_libs})
57 changes: 41 additions & 16 deletions mlir/lib/Conversion/AIRRtToNpuPass.cpp
@@ -39,6 +39,38 @@

using namespace mlir;

// Path B: airrt-to-npu runs before aie-place-tiles (which now lives only in
// aiecc). Read the shim col from either a physical aie.tile or, if the
// shim hasn't been placed yet, the col hint on aie.logical_tile<...>(col,?).
// AIR sets that hint to the compute-side col so the placer's hint-respecting
// behavior gives the same physical col here as it will downstream.
// Returns -1 if neither is available.
static int getColFromTileValue(mlir::Value tile) {
if (!tile)
return -1;
mlir::Operation *def = tile.getDefiningOp();
if (auto t = llvm::dyn_cast_or_null<xilinx::AIE::TileOp>(def))
return t.getCol();
if (auto lto = llvm::dyn_cast_or_null<xilinx::AIE::LogicalTileOp>(def))
if (auto col = lto.tryGetCol())
return *col;
return -1;
}

// True if `tile` is defined by a shim tile op. Accepts either a physical
// aie.tile or an unplaced aie.logical_tile<ShimNOCTile|ShimPLTile>.
static bool isShimTileValue(mlir::Value tile) {
if (!tile)
return false;
mlir::Operation *def = tile.getDefiningOp();
if (auto t = llvm::dyn_cast_or_null<xilinx::AIE::TileOp>(def))
return t.isShimTile();
if (auto lto = llvm::dyn_cast_or_null<xilinx::AIE::LogicalTileOp>(def))
return lto.getTileType() == xilinx::AIE::AIETileType::ShimNOCTile ||
lto.getTileType() == xilinx::AIE::AIETileType::ShimPLTile;
return false;
}

// Helper function to check if an aie.device contains core/memtile DMAs with
// repeat_count > 0. This indicates that the DMA engine state needs to be reset
// after each launch to avoid stale repeat counters affecting the next launch.
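The two helpers added in this hunk, `getColFromTileValue` and `isShimTileValue`, share one shape: dispatch on whether the defining op is a physical tile or a logical tile, and fall back to a sentinel otherwise. A standalone sketch of that dispatch using `std::variant` in place of MLIR's `dyn_cast_or_null` (C++17). `PhysTile`, `LogicalTile`, and `TileDef` are hypothetical models, and treating row 0 as the shim row is an assumption taken from the "shim is always row 0" comment in the header:

```cpp
#include <cassert>
#include <optional>
#include <variant>

enum class TileType { Core, MemTile, ShimNOC, ShimPL };

// Hypothetical model of a placed aie.tile.
struct PhysTile {
  int col;
  int row;
};

// Hypothetical model of aie.logical_tile: typed, with an optional col hint.
struct LogicalTile {
  TileType type;
  std::optional<int> colHint;
};

// std::monostate models "no defining op" (e.g. a null value).
using TileDef = std::variant<std::monostate, PhysTile, LogicalTile>;

// Column from a physical tile, else from the logical tile's hint, else -1.
inline int colOf(const TileDef &def) {
  if (const auto *t = std::get_if<PhysTile>(&def))
    return t->col;
  if (const auto *l = std::get_if<LogicalTile>(&def))
    if (l->colHint)
      return *l->colHint;
  return -1;
}

// Shim test: physical tiles by row (assumed row 0), logical tiles by type.
inline bool isShim(const TileDef &def) {
  if (const auto *t = std::get_if<PhysTile>(&def))
    return t->row == 0;
  if (const auto *l = std::get_if<LogicalTile>(&def))
    return l->type == TileType::ShimNOC || l->type == TileType::ShimPL;
  return false;
}
```

The sentinel matters to callers: an unhinted logical shim tile is still recognized as a shim by type, but its column reads as -1 until placement.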
@@ -1940,8 +1972,7 @@ struct AIRRtToNpuPass : public impl::AIRRtToNpuBase<AIRRtToNpuPass> {
auto objFifo = device.lookupSymbol<AIE::ObjectFifoCreateOp>(metadata);
if (objFifo) {
for (auto consumerTileOp : objFifo.getConsumerTiles()) {
auto consTileOp = consumerTileOp.getDefiningOp<AIE::TileOp>();
if (consTileOp && consTileOp.isShimTile()) {
if (isShimTileValue(consumerTileOp)) {
isS2MM = true;
break;
}
@@ -2031,17 +2062,16 @@ struct AIRRtToNpuPass : public impl::AIRRtToNpuBase<AIRRtToNpuPass> {
// within THIS device only
DenseMap<StringRef, StringRef> uniqueAllocMap;
for (auto alloc : allocs) {
AIE::TileOp shimtile = alloc.getTileOp();
std::tuple<bool, int, int> allocInfo = {
alloc.getChannelDir() == AIE::DMAChannelDir::MM2S,
alloc.getChannelIndex(), shimtile.getCol()};
alloc.getChannelIndex(), getColFromTileValue(alloc.getTile())};

auto it =
llvm::find_if(uniqueAllocs, [&](AIE::ShimDMAAllocationOp ualloc) {
AIE::TileOp shimtile = ualloc.getTileOp();
std::tuple<bool, int, int> uallocInfo = {
ualloc.getChannelDir() == AIE::DMAChannelDir::MM2S,
ualloc.getChannelIndex(), shimtile.getCol()};
ualloc.getChannelIndex(),
getColFromTileValue(ualloc.getTile())};
return allocInfo == uallocInfo;
});
if (it != uniqueAllocs.end()) {
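The hunk above deduplicates shim DMA allocations by comparing a `(isMM2S, channel index, col)` tuple. A standalone sketch of that keying follows. `ShimAlloc` and `dedupeAllocs` are hypothetical, and mapping each duplicate to the first symbol with the same key is an assumption about what the elided branch does:

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical flattened view of a shim DMA allocation.
struct ShimAlloc {
  std::string symbol;
  bool isMM2S;
  int channel;
  int col; // -1 when the owning tile has no known column
};

using AllocKey = std::tuple<bool, int, int>;

inline AllocKey keyOf(const ShimAlloc &a) {
  return std::make_tuple(a.isMM2S, a.channel, a.col);
}

// Maps every allocation symbol to the first symbol seen with the same key
// (assumed behavior; the real pass may record duplicates differently).
inline std::map<std::string, std::string>
dedupeAllocs(const std::vector<ShimAlloc> &allocs) {
  std::vector<ShimAlloc> unique;
  std::map<std::string, std::string> canonical;
  for (const ShimAlloc &a : allocs) {
    auto it = std::find_if(
        unique.begin(), unique.end(),
        [&](const ShimAlloc &u) { return keyOf(u) == keyOf(a); });
    if (it != unique.end()) {
      canonical[a.symbol] = it->symbol;
    } else {
      unique.push_back(a);
      canonical[a.symbol] = a.symbol;
    }
  }
  return canonical;
}
```

One property of this key is worth keeping in mind when reading the hunk: two allocations with unknown column (-1) compare equal whenever direction and channel match, so the key separates tiles only as far as their columns are known.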
@@ -2482,20 +2512,15 @@ struct AIRRtToNpuPass : public impl::AIRRtToNpuBase<AIRRtToNpuPass> {
if (d) {
if (auto infoOp = AIE::ShimDMAAllocationOp::getForSymbol(
d, dma.getMetadata().getRootReference())) {
AIE::TileOp shimtile = infoOp.getTileOp();
col = shimtile.getCol();
col = getColFromTileValue(infoOp.getTile());
} else if (auto objFifoCreateOp = getObjectFifoCreateOpForSymbol(
objectFifoCreateOps,
dma.getMetadata().getLeafReference().getValue())) {
auto prodTileOp =
objFifoCreateOp->getProducerTile().getDefiningOp<AIE::TileOp>();
if (prodTileOp.isShimTile())
col = prodTileOp.colIndex();
if (isShimTileValue(objFifoCreateOp->getProducerTile()))
col = getColFromTileValue(objFifoCreateOp->getProducerTile());
for (auto consumerTileOp : objFifoCreateOp->getConsumerTiles()) {
auto consTileOp = consumerTileOp.getDefiningOp<AIE::TileOp>();
if (consTileOp.isShimTile()) {
col = consTileOp.colIndex();
}
if (isShimTileValue(consumerTileOp))
col = getColFromTileValue(consumerTileOp);
}
}
}
Expand Down