Add weighted RMS normalization example #20

Merged
erwei-xilinx merged 2 commits into main from add-weighted-rms-norm-example
Mar 12, 2026
@erwei-xilinx erwei-xilinx commented Mar 12, 2026

Collaborator

Summary

  • Add weighted RMS norm kernel: y = x * rsqrt(mean(x^2) + eps) * w
  • Extends rms_norm with a learned weight vector (broadcast operand)
  • AIE2P-only transform script (uses arith.divf which requires f32 vectors)
  • Update mlir-air to 98f2fc3 (includes PR #1412 for cross-op buffer sharing)
  • Add weighted_rms_norm to generate_readme.py dashboard
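
For readers who want a host-side reference to compare against, the formula in the first bullet can be sketched in plain Python. This is not the kernel from this PR: the function name `weighted_rms_norm`, the default `eps`, and the choice to take the mean over the last dimension (N) are assumptions made for illustration.

```python
import math


def weighted_rms_norm(x, w, eps=1e-5):
    """Reference sketch: y[i][j] = x[i][j] * rsqrt(mean_j(x[i][j]^2) + eps) * w[j].

    x is a list of M rows of length N, w is a length-N weight vector.
    The eps default is an assumed value, not taken from the PR.
    """
    n = len(w)
    out = []
    for row in x:
        if len(row) != n:
            raise ValueError("each row of x must have the same length as w")
        # Mean of squares over the row (assumed normalization axis).
        mean_sq = sum(v * v for v in row) / n
        inv_rms = 1.0 / math.sqrt(mean_sq + eps)
        # The same weight vector w is applied to every row (broadcast).
        out.append([v * inv_rms * wj for v, wj in zip(row, w)])
    return out
```

Setting every entry of `w` to 1.0 reduces this to an unweighted RMS norm, which is a convenient sanity check.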

Test plan

  • Tested on NPU2 (Strix) hardware with M=32 and M=64, N=256
  • Exit code 0, no assertion failures
  • Existing examples (vec-add, relu, sigmoid, average_pool, rms_norm) still pass with new wheel
  • CI build and test

🤖 Generated with Claude Code

erwei-xilinx and others added 2 commits March 11, 2026 23:02
Weighted RMS norm: y = x * rsqrt(mean(x^2) + eps) * w

Extends rms_norm by multiplying each normalized element by a learned
weight vector. Uses BLOCK_M=2 (2D tiling) with 3 memref arguments
(X, W, Y) where W has broadcast indexing.
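
A rough host-side sketch of what "BLOCK_M=2 with broadcast indexing on W" means is given below. It only models the indexing pattern in plain Python and does not reflect the actual Triton kernel or transform script; the function name `weighted_rms_norm_tiled`, the default `eps`, and the normalization over the N dimension are assumptions.

```python
import math

BLOCK_M = 2  # rows per tile, matching the value named in the commit message


def weighted_rms_norm_tiled(x, w, eps=1e-5, block_m=BLOCK_M):
    """Process X in row blocks of block_m; W is indexed by column only."""
    m, n = len(x), len(w)
    y = [[0.0] * n for _ in range(m)]
    for row0 in range(0, m, block_m):
        # One tile covers rows [row0, row0 + block_m), clipped at M.
        for r in range(row0, min(row0 + block_m, m)):
            mean_sq = sum(v * v for v in x[r]) / n
            inv_rms = 1.0 / math.sqrt(mean_sq + eps)
            for j in range(n):
                # Broadcast: w[j] does not depend on the row index r,
                # so every row in every tile reads the same W values.
                y[r][j] = x[r][j] * inv_rms * w[j]
    return y
```

Because each row is normalized independently, the result does not depend on `block_m`; the tile height only changes how rows are grouped.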

The transform script relies on mlir-air PR #1412 (cross-op buffer
sharing in linalg_promote) to share the X subview buffer across the
squaring and output generics, keeping DMA count within AIE tile
limits.

Update mlir-air to 98f2fc3 which includes all necessary fixes:
- PR #1407: broadcast operand promotion
- PR #1408: dead memref.global cleanup
- PR #1411: memory space comparison fix
- PR #1412: cross-op promotedValueMap with DominanceInfo

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

reviewdog v0.20.3 (used by action-suggester v1.22) was removed from
GitHub releases, causing both clang-format and black check steps to
fail with "unable to find 'v0.20.3'".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@erwei-xilinx erwei-xilinx merged commit c6d72b2 into main Mar 12, 2026
7 of 9 checks passed
@erwei-xilinx erwei-xilinx deleted the add-weighted-rms-norm-example branch March 12, 2026 16:24