
Commit 5c2418f

erwei-xilinx and claude authored
Implement RegionBranchOpInterface on air hierarchy ops (#1566)
* Implement RegionBranchOpInterface on air hierarchy ops

air.rank, air.launch, air.segment, and air.herd now implement RegionBranchOpInterface so that upstream MLIR analyses (alias analysis, dataflow, dead-code elimination, ...) can walk the parent-operand → body-block-arg mapping that IsolatedFromAbove already constrains explicitly. The mapping is the same one HierarchyInterface already exposes via getKernelOperands / getKernelArguments / getTiedKernelOperand, but re-stated through the upstream interface so generic infrastructure finds it.

With this in place, mlir::LocalAliasAnalysis can now trace through hierarchy block args, which removes the need for the custom hierarchy walk in CanonicalizeAsyncOpDeps's getRoot.

Three caveats addressed:

- AIRDependency.cpp dispatched generic RegionBranchOpInterface ops through scf.if/affine.if-only handling. Skip air hierarchy ops there; they have their own async-token machinery.
- HerdOp::getKernelArguments hardcoded drop_front(4), assuming 2D. The new auto-verifier on RegionBranchOpInterface exercises this path on the existing 0-dim-herd test, exposing the latent bug. Replaced with drop_front(getNumDims() * 2) to match the other three ops.
- The body→parent successor yields no values; the async token result is a launch side-effect, not a yielded value, so getEntrySuccessorOperands and getSuccessorInputs return empty for that direction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Advertise hierarchy invocation bounds via getRegionInvocationBounds

air.launch / air.segment / air.herd / air.rank now implement the RegionBranchOpInterface::getRegionInvocationBounds method. The body executes once per (x, y, ...) coordinate of the iteration space; if every size operand folds to a non-negative IntegerAttr we report a tight {n, n} bound, otherwise Unknown.

This matches scf.forall's convention (each lattice point invokes the body once, in parallel) and is needed so that upstream passes keyed on RegionBranchOpInterface (ControlFlowSink, the inliner's repetition heuristics, future LICM-style work) do not assume hierarchy bodies are non-iterative just because there is no body->body edge in getSuccessorRegions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Use upstream LocalAliasAnalysis in CanonicalizeAsyncOpDeps

Two changes that compose:

1. air.execute now implements MemoryEffectOpInterface. The body's effects are mirrored to the outer op (so RAW/WAR/WAW analyses do not assume the execute is side-effect free), and Allocate effects on yielded values are retargeted to the corresponding outer op result. With this in place upstream alias analyses can reason that two distinct `air.execute -> memref` ops produce non-aliasing storage, which previously needed the bespoke SSA-identity getRoot walk to recognise.
2. CanonicalizeAsyncOpDeps' mayAlias predicate now consults mlir::LocalAliasAnalysis first and falls back to the existing getRoot SSA-identity check only when alias() does not return NoAlias. The fallback is still needed because LocalAliasAnalysis is intentionally conservative for two distinct entry-block func args (a caller could in principle pass the same buffer twice, and MLIR has no `noalias` annotation on func args by default), while AIR's domain assumption is that distinct top-level memref SSA values address distinct storage.

Net effect: precision is at least as good as before. Cases LocalAliasAnalysis can decide pick up the upstream analysis (subviews-of-distinct-allocs, hierarchy-arg traversal, future passes that improve LocalAliasAnalysis); cases it cannot decide fall through to the AIR-domain heuristic.

This is the follow-up hinted at in the previous commit's PR description. The interface unlock would also let new passes use upstream alias analysis directly, without re-implementing getRoot.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Address Copilot review

- computeHierarchyBounds: pre-multiply overflow guard so a single oversized size operand cannot wrap uint64_t and silently report a small invocation bound.
- ExecuteOp::getEffects: when an inner op does not implement MemoryEffectOpInterface, consult mlir::isMemoryEffectFree before falling back to the conservative Read+Write+Free. arith.constant / affine.apply / similar Pure-trait ops are common in execute bodies and were pessimising the declared effects without it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Trim verbose comments

Comments now name only the WHY when it isn't obvious from the code. PR-description-style narration (history, alternatives, rationale chains) removed from the source.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
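The two-tier alias check described in the commit message can be sketched outside MLIR as follows. This is only an illustration under stated assumptions: `AliasKind` is a hypothetical stand-in for `mlir::AliasResult`, `mayAliasTwoTier` is a made-up name, and the `generic` and `getRoot` callables stand in for `LocalAliasAnalysis::alias` and the pass-local `getRoot` lambda.

```cpp
#include <cassert>

// Hypothetical stand-in for mlir::AliasResult; only the kinds matter here.
enum class AliasKind { NoAlias, MayAlias, PartialAlias, MustAlias };

// Two-tier predicate: a definite NoAlias from the generic analysis is
// trusted; every other answer defers to the domain-specific root comparison.
// Because the fallback is the old predicate, the result can only become
// more precise (fewer reported aliases), never less.
template <typename Value, typename GenericFn, typename RootFn>
bool mayAliasTwoTier(const Value &a, const Value &b, GenericFn generic,
                     RootFn getRoot) {
  if (generic(a, b) == AliasKind::NoAlias)
    return false;
  return getRoot(a) == getRoot(b);
}
```

With integers as toy values and `v / 10` as a toy root function, a pair that shares a root is still reported as non-aliasing when the generic analysis says NoAlias, and every other pair gets the same answer the root comparison alone would give.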
1 parent 911ec57 commit 5c2418f

4 files changed

Lines changed: 220 additions & 42 deletions


mlir/include/air/Dialect/AIR/AIR.td

Lines changed: 6 additions & 1 deletion
@@ -37,6 +37,7 @@ def air_RankOp : air_Op<"rank", [air_AsyncOpInterface,
     AttrSizedOperandSegments,
     IsolatedFromAbove,
     AffineScope,
+    DeclareOpInterfaceMethods<RegionBranchOpInterface, ["getEntrySuccessorOperands", "getSuccessorInputs", "getRegionInvocationBounds"]>,
     SingleBlockImplicitTerminator<"RankTerminatorOp">]>,
   Arguments<(ins OptionalAttr<SymbolNameAttr>:$sym_name,
     Variadic<air_AsyncToken>:$async_dependencies,
@@ -142,6 +143,7 @@ def air_LaunchOp : air_Op<"launch", [air_AsyncOpInterface,
     AttrSizedOperandSegments,
     IsolatedFromAbove,
     AffineScope,
+    DeclareOpInterfaceMethods<RegionBranchOpInterface, ["getEntrySuccessorOperands", "getSuccessorInputs", "getRegionInvocationBounds"]>,
     SingleBlockImplicitTerminator<"LaunchTerminatorOp">]>,
   Arguments<(ins OptionalAttr<SymbolNameAttr>:$sym_name,
     Variadic<air_AsyncToken>:$async_dependencies,
@@ -231,6 +233,7 @@ def air_SegmentOp : air_Op<"segment", [air_AsyncOpInterface,
     AttrSizedOperandSegments,
     IsolatedFromAbove,
     AffineScope,
+    DeclareOpInterfaceMethods<RegionBranchOpInterface, ["getEntrySuccessorOperands", "getSuccessorInputs", "getRegionInvocationBounds"]>,
     SingleBlockImplicitTerminator<"SegmentTerminatorOp">]>,
   Arguments<(ins OptionalAttr<SymbolNameAttr>:$sym_name,
     Variadic<air_AsyncToken>:$async_dependencies,
@@ -351,6 +354,7 @@ def air_HerdOp : air_Op<"herd", [air_AsyncOpInterface,
     AttrSizedOperandSegments,
     IsolatedFromAbove,
     AffineScope,
+    DeclareOpInterfaceMethods<RegionBranchOpInterface, ["getEntrySuccessorOperands", "getSuccessorInputs", "getRegionInvocationBounds"]>,
     SingleBlockImplicitTerminator<"HerdTerminatorOp">]>,
   Arguments<(ins OptionalAttr<SymbolNameAttr>:$sym_name,
     OptionalAttr<StrAttr>:$link_with,
@@ -805,7 +809,8 @@ def air_ChannelGetOp : air_Op<"channel.get", [air_AsyncOpInterface,
 // AIR asynchronous region for dynamic event dispatching.
 
 def air_ExecuteOp : air_Op<"execute", [SingleBlockImplicitTerminator<"ExecuteTerminatorOp">,
-    air_AsyncOpInterface]> {
+    air_AsyncOpInterface,
+    DeclareOpInterfaceMethods<MemoryEffectsOpInterface>]> {
   let arguments = (
     ins Variadic<air_AsyncToken>:$async_dependencies
   );
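The `getSuccessorInputs` hook declared on the four hierarchy ops returns the body's kernel arguments. As a rough standalone model of why `drop_front(getNumDims() * 2)` is the right slice, the sketch below assumes the body's block arguments are laid out as `numDims` ids, then `numDims` sizes, then the kernel arguments. That layout is inferred from the `drop_front` expression in this commit, and `kernelArguments` with its string-based argument list is purely hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a hierarchy body's block-argument list: numDims
// induction ids, then numDims sizes, then the kernel arguments.
// Returns only the kernel arguments, i.e. everything after the first
// 2 * numDims entries.
std::vector<std::string>
kernelArguments(const std::vector<std::string> &blockArgs,
                std::size_t numDims) {
  std::size_t skip = numDims * 2;
  assert(skip <= blockArgs.size() && "fewer block args than ids + sizes");
  return std::vector<std::string>(
      blockArgs.begin() + static_cast<std::ptrdiff_t>(skip), blockArgs.end());
}
```

In this model a 0-dimensional op has no ids or sizes, so nothing is skipped; a hardcoded skip of 4 would drop real kernel arguments or run past the end of the list.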

mlir/lib/Dialect/AIR/IR/AIRDialect.cpp

Lines changed: 206 additions & 41 deletions
@@ -8,6 +8,7 @@
 
 #include "air/Dialect/AIR/AIRDialect.h"
 
+#include "mlir/Analysis/AliasAnalysis/LocalAliasAnalysis.h"
 #include "mlir/Dialect/Affine/IR/AffineOps.h"
 #include "mlir/Dialect/Arith/IR/Arith.h"
 #include "mlir/Dialect/Linalg/IR/Linalg.h"
@@ -20,12 +21,14 @@
 #include "mlir/IR/IRMapping.h"
 #include "mlir/IR/Iterators.h"
 #include "mlir/IR/PatternMatch.h"
+#include "mlir/Interfaces/ControlFlowInterfaces.h"
 #include "mlir/Interfaces/LoopLikeInterface.h"
 #include "mlir/Transforms/RegionUtils.h"
 
 #include "llvm/ADT/TypeSwitch.h"
 
 #include <iostream>
+#include <limits>
 
 using namespace mlir;
 
@@ -233,24 +236,11 @@ static void printAsyncDependencies(OpAsmPrinter &printer, Operation *op,
 template <class OpT>
 static LogicalResult CanonicalizeAsyncOpDeps(OpT op,
                                              PatternRewriter &rewriter) {
-  // Walk view-like ops, air.hierarchy body args, and loop iter_args back
-  // to a root memref. The collectors below resolve every memref through
-  // this walk before inserting, so the RAW/WAR/WAW comparisons reduce to
-  // set-membership.
-  //
-  // Intentionally conservative — DO NOT TIGHTEN without a strong reason:
-  // * Two disjoint views of the same root return the same root and
-  // are reported as aliasing. Modeling subview offsets/sizes to
-  // prove disjointness was the bug behind #1559: the original code
-  // compared SSA identity (the strictest possible "alias" predicate)
-  // and dropped real RAW edges between fill and channel.put.
-  // * iter_args resolve to the loop init; a body that conditionally
-  // yields a different memref will under-approximate roots seen
-  // across iterations, but still over-approximates aliasing.
-  //
-  // Over-approximating aliasing is the safe direction here: this
-  // predicate gates dead-edge *removal*, so false positives only
-  // suppress an optimization, never invent a race.
+  // Fallback for the AIR-domain assumption that distinct top-level memref
+  // SSA values address distinct storage; upstream alias analysis cannot
+  // make that claim because MLIR has no noalias on func args. Aliasing
+  // here gates dead-edge removal, so over-approximation only suppresses
+  // an optimization (#1559).
   auto getRoot = [](Value v) -> Value {
     while (true) {
       if (auto view = v.getDefiningOp<ViewLikeOpInterface>()) {
@@ -275,6 +265,12 @@ static LogicalResult CanonicalizeAsyncOpDeps(OpT op,
       return v;
     }
   };
+  mlir::LocalAliasAnalysis aliasAnalysis;
+  auto mayAlias = [&aliasAnalysis, &getRoot](Value a, Value b) {
+    if (aliasAnalysis.alias(a, b).isNo())
+      return false;
+    return getRoot(a) == getRoot(b);
+  };
   auto getAllReadAccess = [](Operation *op) {
     SmallVector<Value> operands;
     if (auto linalgop = dyn_cast_if_present<linalg::LinalgOp>(op)) {
@@ -320,18 +316,15 @@ static LogicalResult CanonicalizeAsyncOpDeps(OpT op,
     }
     return operands;
   };
-  // Root memrefs accessed by `o` under `accessFn`. With `walkConsumers=true`
-  // (sink side), also unions in accesses from `o`'s transitive async-token
-  // consumers — required for sync primitives whose own accesses don't
-  // overlap a real barrier dep (issue #1559). Must be `false` on the source
-  // side, or the source would inherit its own sink's accesses.
+  // walkConsumers=true unions accesses from `o`'s transitive async-token
+  // consumers (required for sync primitives, #1559). Must be false on the
+  // source side or the source inherits its own sink's accesses.
   auto getAllMemrefsAccessedByOp =
-      [getRoot](Operation *o, bool walkConsumers,
-                llvm::function_ref<SmallVector<Value>(Operation *)> accessFn) {
+      [](Operation *o, bool walkConsumers,
+         llvm::function_ref<SmallVector<Value>(Operation *)> accessFn) {
         llvm::SetVector<Value> memrefs;
-        auto insertRoot = [&](Value v) { memrefs.insert(getRoot(v)); };
         for (Value v : accessFn(o))
-          insertRoot(v);
+          memrefs.insert(v);
         SmallVector<Region *> regions;
         for (auto &region : o->getRegions())
           regions.push_back(&region);
@@ -340,15 +333,15 @@ static LogicalResult CanonicalizeAsyncOpDeps(OpT op,
           air::walkAsyncTokenConsumers(o, consumers);
           for (auto user : consumers) {
             for (Value v : accessFn(user))
-              insertRoot(v);
+              memrefs.insert(v);
             for (auto &region : user->getRegions())
               regions.push_back(&region);
           }
         }
         for (auto region : regions) {
           visitUsedValuesDefinedAbove(*region, [&](OpOperand *use) {
             if (llvm::is_contained(accessFn(use->getOwner()), use->get()))
-              insertRoot(use->get());
+              memrefs.insert(use->get());
           });
         }
         return memrefs;
@@ -447,19 +440,20 @@ static LogicalResult CanonicalizeAsyncOpDeps(OpT op,
           llvm::range_size(llvm::concat<Value>(memrefsReadBySinkOp,
                                                memrefsWrittenBySinkOp)) != 0;
       if (sourceOpTouchesMemref && sinkOpTouchesMemref) {
-        // Sets already contain root memrefs (resolved by the collectors),
-        // so RAW/WAR/WAW checks reduce to plain set-membership.
+        auto anyAlias = [&](const llvm::SetVector<Value> &as,
+                            const llvm::SetVector<Value> &bs) {
+          for (Value a : as)
+            for (Value b : bs)
+              if (mayAlias(a, b))
+                return true;
+          return false;
+        };
         bool RAWNotFound =
-            llvm::none_of(memrefsWrittenBySourceOp, [&](Value v) {
-              return memrefsReadBySinkOp.contains(v);
-            });
-        bool WARNotFound = llvm::none_of(memrefsReadBySourceOp, [&](Value v) {
-          return memrefsWrittenBySinkOp.contains(v);
-        });
+            !anyAlias(memrefsWrittenBySourceOp, memrefsReadBySinkOp);
+        bool WARNotFound =
+            !anyAlias(memrefsReadBySourceOp, memrefsWrittenBySinkOp);
         bool WAWNotFound =
-            llvm::none_of(memrefsWrittenBySourceOp, [&](Value v) {
-              return memrefsWrittenBySinkOp.contains(v);
-            });
+            !anyAlias(memrefsWrittenBySourceOp, memrefsWrittenBySinkOp);
         bool noSharedResource = llvm::none_of(
             resourcesUsedBySourceOp, [&resourcesUsedBySinkOp](SymbolRefAttr r) {
               return llvm::is_contained(resourcesUsedBySinkOp, r);
@@ -845,6 +839,53 @@ unsigned air::LaunchOp::getNumDims() {
   return segment_sizes[1];
 }
 
+// Hierarchy body invokes once per iteration-space coordinate — product of
+// the size operands when all are constant, Unknown otherwise.
+static InvocationBounds computeHierarchyBounds(ArrayRef<Attribute> operands,
+                                               unsigned sizeStart,
+                                               unsigned numDims) {
+  constexpr uint64_t kMaxUnsigned = std::numeric_limits<unsigned>::max();
+  uint64_t product = 1;
+  for (unsigned i = 0; i < numDims; ++i) {
+    auto intAttr = dyn_cast_if_present<IntegerAttr>(operands[sizeStart + i]);
+    if (!intAttr)
+      return InvocationBounds::getUnknown();
+    int64_t v = intAttr.getInt();
+    if (v < 0)
+      return InvocationBounds::getUnknown();
+    uint64_t uv = static_cast<uint64_t>(v);
+    if (uv != 0 && product > kMaxUnsigned / uv)
+      return InvocationBounds::getUnknown();
+    product *= uv;
+  }
+  unsigned n = static_cast<unsigned>(product);
+  return InvocationBounds(n, n);
+}
+
+OperandRange air::LaunchOp::getEntrySuccessorOperands(RegionSuccessor succ) {
+  if (succ.getSuccessor() == &getBody())
+    return getKernelOperands();
+  auto end = (*this)->operand_end();
+  return OperandRange(end, end);
+}
+ValueRange air::LaunchOp::getSuccessorInputs(RegionSuccessor succ) {
+  if (succ.getSuccessor() == &getBody())
+    return ValueRange(getKernelArguments());
+  return {};
+}
+void air::LaunchOp::getSuccessorRegions(
+    RegionBranchPoint point, SmallVectorImpl<RegionSuccessor> &regions) {
+  if (point.isParent())
+    regions.push_back(RegionSuccessor(&getBody()));
+  else
+    regions.push_back(RegionSuccessor::parent());
+}
+void air::LaunchOp::getRegionInvocationBounds(
+    ArrayRef<Attribute> operands, SmallVectorImpl<InvocationBounds> &bounds) {
+  bounds.push_back(computeHierarchyBounds(
+      operands, getAsyncDependencies().size(), getNumDims()));
+}
+
 //
 // RankOp
 //
@@ -1223,6 +1264,30 @@ unsigned air::RankOp::getNumDims() {
   return segment_sizes[2];
 }
 
+OperandRange air::RankOp::getEntrySuccessorOperands(RegionSuccessor succ) {
+  if (succ.getSuccessor() == &getBody())
+    return getKernelOperands();
+  auto end = (*this)->operand_end();
+  return OperandRange(end, end);
+}
+ValueRange air::RankOp::getSuccessorInputs(RegionSuccessor succ) {
+  if (succ.getSuccessor() == &getBody())
+    return ValueRange(getKernelArguments());
+  return {};
+}
+void air::RankOp::getSuccessorRegions(
+    RegionBranchPoint point, SmallVectorImpl<RegionSuccessor> &regions) {
+  if (point.isParent())
+    regions.push_back(RegionSuccessor(&getBody()));
+  else
+    regions.push_back(RegionSuccessor::parent());
+}
+void air::RankOp::getRegionInvocationBounds(
+    ArrayRef<Attribute> operands, SmallVectorImpl<InvocationBounds> &bounds) {
+  unsigned sizeStart = getAsyncDependencies().size() + (getUniverse() ? 1 : 0);
+  bounds.push_back(computeHierarchyBounds(operands, sizeStart, getNumDims()));
+}
+
 LogicalResult air::RankOp::verify() {
   // RankOp may be nested inside air.launch (for multi-GPU parallelism),
   // but not inside air.segment, air.herd, or another air.rank.
@@ -1500,6 +1565,30 @@ unsigned air::SegmentOp::getNumDims() {
   return segment_sizes[1];
 }
 
+OperandRange air::SegmentOp::getEntrySuccessorOperands(RegionSuccessor succ) {
+  if (succ.getSuccessor() == &getBody())
+    return getKernelOperands();
+  auto end = (*this)->operand_end();
+  return OperandRange(end, end);
+}
+ValueRange air::SegmentOp::getSuccessorInputs(RegionSuccessor succ) {
+  if (succ.getSuccessor() == &getBody())
+    return ValueRange(getKernelArguments());
+  return {};
+}
+void air::SegmentOp::getSuccessorRegions(
+    RegionBranchPoint point, SmallVectorImpl<RegionSuccessor> &regions) {
+  if (point.isParent())
+    regions.push_back(RegionSuccessor(&getBody()));
+  else
+    regions.push_back(RegionSuccessor::parent());
+}
+void air::SegmentOp::getRegionInvocationBounds(
+    ArrayRef<Attribute> operands, SmallVectorImpl<InvocationBounds> &bounds) {
+  bounds.push_back(computeHierarchyBounds(
+      operands, getAsyncDependencies().size(), getNumDims()));
+}
+
 /// Utility function to verify that all memref.alloc operations within a region
 /// have a memory space at least as local as the specified minimum.
 /// For example, minMemorySpace=L2 means allocations must be L2 or L1;
@@ -1856,7 +1945,7 @@ Value air::HerdOp::getKernelOperand(unsigned i) {
 }
 
 ArrayRef<BlockArgument> air::HerdOp::getKernelArguments() {
-  return getBody().front().getArguments().drop_front(4);
+  return getBody().front().getArguments().drop_front(getNumDims() * 2);
 }
 
 BlockArgument air::HerdOp::getKernelArgument(unsigned i) {
@@ -1870,6 +1959,30 @@ unsigned air::HerdOp::getNumDims() {
   return segment_sizes[1];
 }
 
+OperandRange air::HerdOp::getEntrySuccessorOperands(RegionSuccessor succ) {
+  if (succ.getSuccessor() == &getBody())
+    return getKernelOperands();
+  auto end = (*this)->operand_end();
+  return OperandRange(end, end);
+}
+ValueRange air::HerdOp::getSuccessorInputs(RegionSuccessor succ) {
+  if (succ.getSuccessor() == &getBody())
+    return ValueRange(getKernelArguments());
+  return {};
+}
+void air::HerdOp::getSuccessorRegions(
+    RegionBranchPoint point, SmallVectorImpl<RegionSuccessor> &regions) {
+  if (point.isParent())
+    regions.push_back(RegionSuccessor(&getBody()));
+  else
+    regions.push_back(RegionSuccessor::parent());
+}
+void air::HerdOp::getRegionInvocationBounds(
+    ArrayRef<Attribute> operands, SmallVectorImpl<InvocationBounds> &bounds) {
+  bounds.push_back(computeHierarchyBounds(
+      operands, getAsyncDependencies().size(), getNumDims()));
+}
+
 uint64_t air::HerdOp::getNumCols() {
   auto cols = getSizeOperands()[0].getDefiningOp();
   return cast<arith::ConstantIndexOp>(cols).value();
@@ -1901,6 +2014,58 @@ LogicalResult air::ExecuteOp::verify() {
   return success();
 }
 
+// Mirror body effects up to the execute op. Inner Allocate effects on
+// yielded values get retargeted to the outer result so alias analysis
+// can prove distinct execute-yielded memrefs NoAlias. Inner ops with no
+// declared effects fall back to conservative R+W+Free unless upstream
+// can prove them effect-free.
+void air::ExecuteOp::getEffects(
+    SmallVectorImpl<MemoryEffects::EffectInstance> &effects) {
+  if (getBody().empty())
+    return;
+  llvm::SmallDenseMap<Value, OpResult> yieldToResult;
+  if (auto execTerm = dyn_cast_if_present<air::ExecuteTerminatorOp>(
+          getBody().getTerminator())) {
+    unsigned numYields = execTerm->getNumOperands();
+    unsigned firstYieldResult = getNumResults() - numYields;
+    for (unsigned i = 0; i < numYields; ++i)
+      yieldToResult[execTerm->getOperand(i)] =
+          cast<OpResult>(getResult(firstYieldResult + i));
+  }
+  bool sawUnknownOp = false;
+  getBody().walk([&](Operation *inner) {
+    if (inner == getBody().getTerminator())
+      return;
+    auto effectOp = dyn_cast<MemoryEffectOpInterface>(inner);
+    if (!effectOp) {
+      if (mlir::isMemoryEffectFree(inner))
+        return;
+      sawUnknownOp = true;
+      return;
+    }
+    SmallVector<MemoryEffects::EffectInstance> innerEffects;
+    effectOp.getEffects(innerEffects);
+    for (auto &e : innerEffects) {
+      if (isa<MemoryEffects::Allocate>(e.getEffect())) {
+        Value v = e.getValue();
+        auto it = v ? yieldToResult.find(v) : yieldToResult.end();
+        if (it != yieldToResult.end())
+          effects.emplace_back(e.getEffect(), it->second, e.getResource());
+      } else {
+        effects.emplace_back(e.getEffect(), e.getResource());
+      }
+    }
+  });
+  if (sawUnknownOp) {
+    effects.emplace_back(MemoryEffects::Read::get(),
+                         SideEffects::DefaultResource::get());
+    effects.emplace_back(MemoryEffects::Write::get(),
+                         SideEffects::DefaultResource::get());
+    effects.emplace_back(MemoryEffects::Free::get(),
+                         SideEffects::DefaultResource::get());
+  }
+}
+
 static LogicalResult FoldExecute(air::ExecuteOp op, PatternRewriter &rewriter) {
 
   // if the terminator is the only thing in the ExecuteOp,
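
The pre-multiply overflow guard in `computeHierarchyBounds` can be tried in isolation. The following is a minimal sketch under assumptions, not the code from the commit: `boundedProduct` is a hypothetical name, plain `int64_t` sizes replace the folded `IntegerAttr` operands, and `std::nullopt` plays the role of `InvocationBounds::getUnknown()`.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Returns the product of `sizes` when every size is non-negative and the
// product fits in `unsigned`; std::nullopt stands in for an Unknown bound.
std::optional<unsigned> boundedProduct(const std::vector<std::int64_t> &sizes) {
  constexpr std::uint64_t kMax = std::numeric_limits<unsigned>::max();
  std::uint64_t product = 1;
  for (std::int64_t s : sizes) {
    if (s < 0)
      return std::nullopt;
    auto us = static_cast<std::uint64_t>(s);
    // Compare before multiplying, so `product` never exceeds kMax and the
    // 64-bit accumulator cannot wrap.
    if (us != 0 && product > kMax / us)
      return std::nullopt;
    product *= us;
  }
  return static_cast<unsigned>(product);
}
```

Dividing the limit by the next factor, rather than multiplying first and checking afterwards, keeps the check valid even when a single factor is close to the 64-bit maximum.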
