Extend lightweight herd cloning to tiled unroll path in air-opt-shim-dma-bds #1558

@erwei-xilinx

Context

PR #1535 introduces lightweight herd cloning during shim-level loop unrolling in loopUnrollFullWithAsyncTokenPreserved (mlir/lib/Util/Dependency.cpp). When unrolling loops that contain air.SegmentOp or air.HerdOp, it creates empty herd shells via OperationState and clones only the channel ops and their transitive dependencies, skipping heavy compute ops (vector, arith, linalg) that no channel op depends on. This avoids the O(N × body_size) IR growth that full cloning causes.
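
The selection rule described above (keep channel ops plus whatever they transitively depend on, drop the rest) can be sketched on a toy model. This is only an illustration, not the code from PR #1535: `ToyOp`, `selectOpsToClone`, and `exampleBody` are hypothetical names, operands are modelled as plain indices into the body, and no MLIR API is used.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for an operation in a herd body. NOT the MLIR API;
// every name in this block is hypothetical.
struct ToyOp {
  std::string name;                   // op name, for readability only
  bool isChannelOp;                   // true for channel put/get ops
  std::vector<std::size_t> operands;  // indices of earlier ops this op uses
};

// Returns keep[i] == true for every op a lightweight clone would copy:
// each channel op, plus everything it transitively depends on.
std::vector<bool> selectOpsToClone(const std::vector<ToyOp> &body) {
  std::vector<bool> keep(body.size(), false);
  std::vector<std::size_t> worklist;

  // Seed the worklist with the channel ops themselves.
  for (std::size_t i = 0; i < body.size(); ++i) {
    if (body[i].isChannelOp) {
      keep[i] = true;
      worklist.push_back(i);
    }
  }

  // Walk backwards along operand edges until no new ops are reached.
  while (!worklist.empty()) {
    std::size_t cur = worklist.back();
    worklist.pop_back();
    for (std::size_t dep : body[cur].operands) {
      assert(dep < cur && "operands must be defined earlier in the body");
      if (!keep[dep]) {
        keep[dep] = true;
        worklist.push_back(dep);
      }
    }
  }
  return keep;
}

// A small body: two ops feed the channel op, two are unrelated compute.
std::vector<ToyOp> exampleBody() {
  return {
      {"arith.constant", false, {}},    // 0: feeds the address computation
      {"arith.muli", false, {0}},       // 1: address used by the channel op
      {"linalg.matmul", false, {}},     // 2: compute, unused by channel ops
      {"vector.fma", false, {2}},       // 3: compute, unused by channel ops
      {"air.channel.put", true, {1}},   // 4: channel op (always kept)
  };
}
```

In this model, ops 0, 1, and 4 are selected and ops 2 and 3 are skipped, so each unrolled iteration copies three ops instead of five. Note that an arith op is still cloned when a channel op depends on it.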

Limitation

The lightweight unroller fires only when annotateFn is null, i.e. on the non-tiled unroll path. The tiled path in AIROptimizeShimDMABDs (triggered by non-trivial shim-dma-tile-sizes values such as 2,2 or 4,4) uses loopUnrollByFactor with an annotateFn callback to tag unrolled iterations. This path still performs full deep-clone unrolling.

The annotateFn guard exists because loopUnrollByFactor is an upstream MLIR utility that doesn't support pluggable clone strategies. Extending lightweight cloning to this path requires either:

  1. Wrapping loopUnrollByFactor to accept a custom clone callback, or
  2. Performing a post-unroll strip (clone-then-strip) specifically for the tiled path. Note that a naive strip crashes due to a cross-region use-after-free (see the T002 postmortem in the discussion on PR #1535, "Lightweight herd cloning during shim DMA BD loop unrolling").
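
To make option 1 concrete, here is a minimal sketch of an unroll helper that takes both a clone callback and an annotate callback. It is a toy model under stated assumptions, not a proposal for the actual signature: `SketchOp`, `unrollWithCloneStrategy`, `cloneChannelOpsOnly`, and both callback types are hypothetical, and the helper simplifies unrolling to "emit `factor` copies of the body", ignoring induction variables, async tokens, and the fact that a real unroller may keep the original body as the first iteration.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for an operation; NOT the MLIR API. All names are hypothetical.
struct SketchOp {
  std::string name;
  bool isChannelOp;
  int unrollIndex;  // -1 until an annotate callback tags it
};
using SketchBody = std::vector<SketchOp>;

// Hypothetical callback types: how to copy a body, and how to tag each copy.
using CloneFn = std::function<SketchBody(const SketchBody &)>;
using AnnotateFn = std::function<void(unsigned, SketchOp &)>;

// Emits `factor` copies of `body`. Each copy is produced by `cloneFn`
// (a full copy when cloneFn is empty) and then tagged by `annotateFn`.
SketchBody unrollWithCloneStrategy(const SketchBody &body, unsigned factor,
                                   const CloneFn &cloneFn,
                                   const AnnotateFn &annotateFn) {
  assert(factor > 0 && "unroll factor must be positive");
  SketchBody unrolled;
  for (unsigned i = 0; i < factor; ++i) {
    SketchBody copy = cloneFn ? cloneFn(body) : body;
    for (SketchOp &op : copy) {
      if (annotateFn)
        annotateFn(i, op);
      unrolled.push_back(op);
    }
  }
  return unrolled;
}

// A lightweight clone strategy for the toy model: keep channel ops only.
SketchBody cloneChannelOpsOnly(const SketchBody &body) {
  SketchBody kept;
  for (const SketchOp &op : body)
    if (op.isChannelOp)
      kept.push_back(op);
  return kept;
}
```

The point of the sketch is that annotation and clone strategy become independent parameters, so the tiled path could keep tagging iterations while still skipping compute ops. With a two-op body (one compute op, one channel op) and a factor of 4, the default strategy yields 8 ops and `cloneChannelOpsOnly` yields 4.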

Impact

For the default aircc invocation (tile sizes 1,1), this limitation has no effect, because the shim BD pass doesn't tile or unroll. For workloads that use explicit tiling (e.g., flash attention with large trip counts and non-default tile sizes), the tiled unroll path still sees the full IR explosion.
