Commit f5f26c4
Skip runtime loop unrolling in air-opt-shim-dma-bds for all-1 tile sizes

When shim-dma-tile-sizes is empty or all-1 (the default aircc path with
--air-runtime-loop-tiling-sizes=1,1), skip tiling, unrolling, and BD
folding for runtime loops that sit inside dummyLaunch ops and have
non-trivial trip counts. The launch is still converted to scf.for +
dummyLaunch (needed for affine symbol validity). The scf.for loops are
preserved through air-to-std and unrolled later in airrt-to-npu, after
removeDeadDeviceComputeOps strips the heavy segment/herd bodies.

BD folding is skipped because AIRUnrollScfForIntoBDChain would otherwise
unroll the runtime loops, defeating the optimization. The channel ops
already have valid wraps/strides from earlier passes (air-dma-to-channel).

The fast path applies only when all of the following hold:
- Tile sizes are empty or all-1 (no useful tiling to perform)
- The scf.for loops are inside a dummyLaunch (from launch conversion)
- The loops have trip count > 1 (trivial loops still use normal BD folding)

Loops directly in functions (not from launch conversion) are unaffected.
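The three fast-path conditions listed above can be read as a single predicate. The sketch below is illustrative only: it is plain Python, not the pass's C++ code, and the function name `use_fast_path` and its parameters are hypothetical stand-ins for whatever the pass actually inspects.

```python
from typing import Sequence


def use_fast_path(tile_sizes: Sequence[int],
                  in_dummy_launch: bool,
                  trip_count: int) -> bool:
    """Hypothetical helper: decide whether a runtime loop stays rolled.

    Mirrors the three conditions in the commit message; it does not
    reproduce the real pass logic.
    """
    # Condition 1: tile sizes are empty or all 1.
    # all() over an empty sequence is True, which covers the empty case.
    no_useful_tiling = all(size == 1 for size in tile_sizes)
    # Condition 2: the loop came from launch conversion (inside a dummyLaunch).
    # Condition 3: the loop has a non-trivial trip count.
    return no_useful_tiling and in_dummy_launch and trip_count > 1


if __name__ == "__main__":
    print(use_fast_path([1, 1], True, 12))   # default aircc tiling, loop in launch
    print(use_fast_path([2, 1], True, 12))   # real tiling requested
    print(use_fast_path([1, 1], False, 12))  # loop directly in a function
    print(use_fast_path([], True, 1))        # trivial loop
```

Only the first call satisfies all three conditions; every other case falls back to the normal tiling, unrolling, and BD folding path.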
The air.launch_end barrier depends on scf.for result tokens and any
top-level channel ops, ensuring proper async dependency tracking.

Profiling on flash attention (12 heads, 1024 LQ/LK, NPU1):
- air-opt-shim-dma-bds: 1,892 ms -> 28 ms (67x faster)
- Total MLIR passes: 3,432 ms -> 664 ms (5.2x faster)
- Total aircc: 6,400 ms -> 3,581 ms (1.8x faster)
- IR size after pass: 2,922 KB -> 226 KB (13x smaller)
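As a quick sanity check, the improvement factors can be recomputed from the before/after figures quoted above. This is a minimal sketch; the dictionary keys and function names are made up for illustration. Because the inputs are the rounded figures from the message, a recomputed ratio may differ slightly from the quoted one (for example, roughly 67.6 versus the quoted 67x).

```python
# Before/after figures copied from the profiling summary in the commit message.
MEASUREMENTS = {
    "air-opt-shim-dma-bds (ms)": (1892, 28),
    "Total MLIR passes (ms)": (3432, 664),
    "Total aircc (ms)": (6400, 3581),
    "IR size after pass (KB)": (2922, 226),
}


def improvement_ratio(before: float, after: float) -> float:
    """Return how many times smaller 'after' is than 'before'."""
    return before / after


def summarize(measurements: dict) -> dict:
    """Map each metric name to its improvement ratio, rounded to one decimal."""
    return {
        name: round(improvement_ratio(before, after), 1)
        for name, (before, after) in measurements.items()
    }


if __name__ == "__main__":
    for name, ratio in summarize(MEASUREMENTS).items():
        print(f"{name}: {ratio}x")
```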
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1 parent 221ae39 commit f5f26c4
4 files changed
Lines changed: 176 additions & 0 deletions
File tree
- mlir
  - lib
    - Conversion
    - Transform
    - Util
  - test/Transform/AIRDependencyScheduleOpt
Lines changed: 79 additions & 0 deletions