Add SwiGLU example (out = SiLU(gate) * up) #16

Merged: erwei-xilinx merged 1 commit into main from add-swiglu-example on Mar 10, 2026
@erwei-xilinx (Collaborator) commented Mar 10, 2026

Summary

  • Adds new examples/swiglu/ with Triton kernel, AIE2 and AIE2P transform scripts, and extern_func.o
  • Computes SwiGLU(gate, up) = gate * sigmoid(gate) * up with bf16 inputs/output
  • Extends the silu elementwise pattern with a second input stream (3-operand tiling: gate, up, out)
  • AIE2P uses native bf16 exp intrinsic; AIE2 links extern_func.o for math.exp
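For readers unfamiliar with the pattern, the computation and the 3-operand tiling described above can be sketched in plain Python. This is a host-side illustration only, not the Triton kernel or the transform scripts from this PR; the function names and the `tile_size` default are hypothetical, and Python floats stand in for bf16.

```python
import math


def silu(x: float) -> float:
    """SiLU(x) = x * sigmoid(x), written to avoid overflow in exp for large |x|."""
    if x >= 0.0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


def swiglu_reference(gate, up):
    """Elementwise SwiGLU(gate, up) = gate * sigmoid(gate) * up."""
    if len(gate) != len(up):
        raise ValueError("gate and up must have the same length")
    return [silu(g) * u for g, u in zip(gate, up)]


def swiglu_tiled(gate, up, tile_size=256):
    """Same result, computed tile by tile over three operands (gate, up, out).

    tile_size is an illustrative value, not the one used by the example.
    """
    if len(gate) != len(up):
        raise ValueError("gate and up must have the same length")
    n = len(gate)
    out = [0.0] * n
    for start in range(0, n, tile_size):
        end = min(start + tile_size, n)
        # Each step reads one tile of gate and one of up, and writes one tile of out.
        g_tile = gate[start:end]
        u_tile = up[start:end]
        out[start:end] = [silu(g) * u for g, u in zip(g_tile, u_tile)]
    return out
```

Compared with a single-input silu kernel, the only structural change is the second input tile (`u_tile`) that travels alongside the gate tile.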

Test plan

  • Verified on NPU2 (Strix/AIE2P) hardware, all 6 sizes (2^10 through 2^15) pass correctness against torch.nn.functional.silu(gate) * up
  • Verified on NPU1 (Phoenix/AIE2) hardware, all 6 sizes pass
  • CI build validation
  • Auto-discovered by scripts/run_tests.py (not in skip list)
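The size sweep in the test plan can be sketched as follows. This is an assumption-laden illustration, not the test harness from this PR: `to_bf16` emulates bf16 rounding in software, the "device" result is a stand-in that simply rounds the reference to bf16, and the tolerances and input range are hypothetical. The actual check compares hardware output against `torch.nn.functional.silu(gate) * up`.

```python
import math
import random
import struct

# The six sizes named in the test plan: 2^10 through 2^15.
SIZES = [2 ** k for k in range(10, 16)]


def to_bf16(x: float) -> float:
    """Round a finite value to bf16 precision (round-to-nearest-even) via float32 bits."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bias = 0x7FFF + ((bits >> 16) & 1)
    bits = ((bits + bias) & 0xFFFFFFFF) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def silu(x: float) -> float:
    """SiLU(x) = x * sigmoid(x), in an overflow-safe form."""
    if x >= 0.0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


def check_size(n: int, seed: int = 0, rtol: float = 1e-2, atol: float = 1e-2) -> bool:
    """Compare a stand-in bf16 result against the float reference for n elements."""
    rng = random.Random(seed)
    gate = [to_bf16(rng.uniform(-4.0, 4.0)) for _ in range(n)]
    up = [to_bf16(rng.uniform(-4.0, 4.0)) for _ in range(n)]
    expected = [silu(g) * u for g, u in zip(gate, up)]
    # Stand-in for the NPU output: the reference rounded to bf16.
    actual = [to_bf16(e) for e in expected]
    return all(abs(a - e) <= atol + rtol * abs(e) for a, e in zip(actual, expected))


def run_sweep() -> dict:
    """Return a pass/fail flag per size."""
    return {n: check_size(n) for n in SIZES}
```

The tolerances are loose because bf16 keeps only 8 bits of significand precision, so a single output rounding can already introduce a relative error of up to 2^-8.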

Generated with Claude Code

New elementwise example computing SwiGLU activation with two input
streams, verified on NPU2 (Strix/AIE2P) hardware. Extends the silu
pattern with 3-operand tiling (gate, up, out). AIE2 transform links
extern_func.o for math.exp; AIE2P uses native bf16 exp intrinsic.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@erwei-xilinx merged commit 2715977 into main on Mar 10, 2026
8 of 9 checks passed
@erwei-xilinx deleted the add-swiglu-example branch on March 10, 2026, 03:18