[AIE] ReservedRegsLICM: extend hoisting/sinking to nested loops and save/restore brackets#976
Open
andcarminati wants to merge 5 commits into
Open
[AIE] ReservedRegsLICM: extend hoisting/sinking to nested loops and save/restore brackets#976andcarminati wants to merge 5 commits into
andcarminati wants to merge 5 commits into
Conversation
Collaborator
Author
Collaborator
|
I had a competing local implementation, so I asked AI to compare the two implementations. |
In a multi-block loop with conditional branches, a candidate register may be defined on some paths to the latch but not others. On the paths that skip the definition the register is live at the latch entry, carrying a value from a previous iteration. Sinking the last def from the latch to the exit block would expose that stale value to the instructions that use the register before the def in the latch. Fix this by pre-computing LatchEntryLive — the set of registers live at the entry of the latch block — and rejecting any sink candidate whose register appears in that set. Add a test (srs_dyn_sign_conditional_no_sink) where $srssign0 is defined in one conditional branch but not the other, and the latch uses it before redefining it. The pass must leave both the COPY and the MOVX_mvx_cr_imm in place.
Some loops protect a reserved register with a save/restore bracket: vreg = COPY $reserved_reg ; save pre-loop value $reserved_reg = MOVX imm ; set loop-invariant value ... uses of $reserved_reg ... $reserved_reg = COPY vreg ; restore pre-loop value The existing hoisting and sinking passes cannot handle this pattern: the register is a loop livein (the save reads it before the set defines it), and it has two defs (set + restore), so both getUniqueDef and the livein guard block any transformation. Add findSaveRestoreBracket / processSaveRestoreBracket to detect and transform the pattern as a unit. The detection checks that: - the save is a vreg = COPY $reserved_reg with exactly one use (the restore), - no use of the register occurs between the save and the set, - the set is loop-invariant, - the restore is the last def of the register in the latch and copies exactly the vreg from the save, - the register is not used in any non-latch block of the loop (which would observe the wrong value after the set is hoisted). When the pattern matches, the save and set are moved to the preheader (in order, so the save captures the pre-loop value before the set overwrites it) and the restore is sinked to the exit block. The loop body then always sees the loop-invariant value with no per-iteration save/restore overhead. Add a test (crsat_save_restore_bracket) that verifies the transformation for $crsat.
The flat sequential backward walk in collectLoopReservedLiveins was incorrect for loops with conditional paths. When L.getBlocks() returns blocks in RPO order (header first, latch last), a def in one branch could remove a register from the live set after a sibling branch's use had already added it back. This caused the register to be incorrectly classified as not a loop livein, allowing processForExitSink to sink its definition to the exit block even though the register carried a pre-loop value on some paths. For example, in a loop where bb.2 uses $srssign0 and bb.3 defines it (both successors of the header), the old code would process bb.2's use first (adding $srssign0 to the live set) and then bb.3's def (removing it), concluding $srssign0 is not a livein. But $srssign0 IS a livein because bb.2 can execute on the first iteration without going through bb.3. Fix this by replacing the flat walk with a backward dataflow worklist algorithm that correctly propagates liveness through the CFG until a fixed point is reached. The algorithm initialises live-in sets to empty and grows them monotonically, re-queuing predecessors whenever a block's live-in set changes. Add a test case (livein_bug_no_sink) to the aie2ps reserved-reg-licm test that reproduces the bug and verifies the MOVX is not sunk.
…the latch processForExitSink was scanning L.getLoopLatch() for sink candidates. When the latch and the exiting block differ, instructions in the latch are not on the exit path: sinking them to the exit block inserts a def that was never executed on that path, corrupting the register's value at the exit. Fix this by scanning L.getExitingBlock() instead. The guard that checks whether a register is live at the entry of the scanned block is also updated to use the exiting block (renamed LatchEntryLive -> ExitingBlockEntryLive). Add a test case (latch_vs_exiting_no_sink) that reproduces the bug: a loop where the latch (bb.3) defines $srssign0 = MOVX 0 but the exiting block (bb.2) exits directly to bb.4 without going through bb.3. The MOVX must remain in bb.3.
447ee79 to
40e5a8c
Compare
Collaborator
Author
Hi, thank you very much for catching those cases. The implementation covers those cases now. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

This series extends the ReservedRegsLICM pass in three ways:
Nested loop support. The pass previously only processed the innermost
loop. It now iterates over all loops in the nest (innermost first via
reverse preorder), allowing loop-invariant control register assignments
to be hoisted or sinked out of outer loops as well.
Correctness fix for multi-block loops. processForExitSink could
incorrectly sink a register's last def to the exit block in loops where
the register is defined in one conditional branch but not another. On
the path that skips the def the register is live at the latch entry,
carrying a value from a predecessor; sinking the def would expose the
wrong value on that path. A LatchEntryLive pre-computation (walking the
latch backward) guards sinking with !LatchEntryLive.contains(reg).
Save/restore bracket transformation. Some loops protect a reserved
register with a save/restore bracket:
vreg = COPY $reserved_reg ; save pre-loop value
$reserved_reg = MOVX imm ; set loop-invariant value
... uses of $reserved_reg ...
$reserved_reg = COPY vreg ; restore pre-loop value
The existing hoisting/sinking cannot handle this because the register
is a loop livein (the save reads it) and has two defs (set + restore).
The new findSaveRestoreBracket / processSaveRestoreBracket functions
detect and transform the pattern as a unit: save+set are moved to the
preheader (in order, so the save captures the pre-loop value before the
set overwrites it) and the restore is sinked to the exit block. The loop
body then always sees the loop-invariant value with no per-iteration
save/restore overhead. Safety checks ensure no uses of the register
appear before the save, after the restore, or in non-latch blocks of
the loop.
Tests are added for all three cases, including a regression test that
verifies brackets with uses outside the [set, restore] window are
correctly rejected.