[AIEX] Implement foldImmediate Peephole optimization #1009

Open
khallouh wants to merge 1 commit into aie-public from khallouh.fold.immediates.peephole

@khallouh
Collaborator

Fold move-immediate + COPY pairs into a single move-immediate at the
consumer site:
    %c = MOVA 42       →    %a = MOVA 42
    %a = COPY %c
Guarded by hasOneNonDBGUse to avoid materializing constants multiple
times when they have several consumers, which would inflate register
pressure and perturb allocation.
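
To make the guard concrete, here is a minimal toy model of the fold in Python. It is a sketch under stated assumptions, not the code in this PR: the `Instr` class, `non_debug_uses`, and `fold_immediates` are hypothetical names, the instruction list is assumed to be in SSA form (each virtual register defined once), and the use count only mimics what `hasOneNonDBGUse` checks.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

Operand = Union[int, str]  # int = immediate, str = virtual register name


@dataclass(frozen=True)
class Instr:
    """Hypothetical stand-in for a machine instruction."""
    opcode: str                 # e.g. "MOVA", "COPY", "ADD", "DBG_VALUE"
    dst: str                    # defined register ("" if none)
    srcs: Tuple[Operand, ...]   # immediates and/or used registers


def non_debug_uses(instrs: List[Instr], reg: str) -> int:
    """Count operand uses of `reg`, ignoring debug instructions."""
    return sum(
        1
        for ins in instrs
        if ins.opcode != "DBG_VALUE"
        for src in ins.srcs
        if src == reg
    )


def fold_immediates(instrs: List[Instr],
                    move_imm_opcodes: Tuple[str, ...] = ("MOVA",)) -> List[Instr]:
    """Rewrite `%a = COPY %c` into a move-immediate when %c is a
    move-immediate result with exactly one non-debug use."""
    # Assumption: SSA form, so each register has a single defining instruction.
    defs = {ins.dst: ins for ins in instrs if ins.dst}
    folded = set()
    out = []
    for ins in instrs:
        if ins.opcode == "COPY":
            src = ins.srcs[0]
            producer = defs.get(src)
            if (producer is not None
                    and producer.opcode in move_imm_opcodes
                    and non_debug_uses(instrs, src) == 1):
                # Materialize the constant directly in the COPY destination.
                out.append(Instr(producer.opcode, ins.dst, producer.srcs))
                folded.add(src)
                continue
        out.append(ins)
    # The original move-immediate is now dead; drop it.
    return [ins for ins in out if ins.dst not in folded]


if __name__ == "__main__":
    single_use = [Instr("MOVA", "%c", (42,)), Instr("COPY", "%a", ("%c",))]
    multi_use = single_use + [Instr("ADD", "%b", ("%c", "%a"))]
    print(fold_immediates(single_use))  # folded into one move-immediate
    print(fold_immediates(multi_use))   # left unchanged: %c has two uses
```

In the multi-use case, folding would leave the original move-immediate alive and add a second one, which is the duplicated materialization the guard is meant to prevent. Debug users of a folded register are left untouched in this sketch; a real pass would need to update them.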
@khallouh
Collaborator Author

khallouh commented May 28, 2026

Top improvements: Topk1D_int8_0, LayerNormC8Part2_aie2_int8_0 and SinBf16_0 (7%, 5% and 3% in instruction count, respectively). There are a few regressions, but most are small; the largest is 1.5% in BufferPadInnermost. These mostly stem from side effects on decisions made by the register allocator, so they are not easily recoverable.
Edit: the Topk1D_int8_0 number is not reliable because it takes random paths at runtime on each run, which brings our top improvement down to 5% in LayerNormC8Part2_aie2_int8_0.
