Add ELF output format support for NPU2 (AIE2P) #27
Merged
Pull request overview
This PR adds ELF output support for NPU2 (AIE2P/Strix) and makes `aie.elf` the default binary output on that target, while keeping the existing xclbin workflow for NPU1 (AIE2). It also introduces an environment-variable override to force either format.
Changes:
- Add output-format auto-detection with `AMD_TRITON_NPU_OUTPUT_FORMAT` override (default: ELF on npu2, xclbin on npu1).
- Add ELF-specific C++ launcher generation and switch `aircc` invocation/caching to produce and load `aie.elf` (+ kernel name metadata).
- Add a new `CLAUDE.md` development guide covering repo structure and compilation pipeline.
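The selection logic in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code from `driver.py`: the function name `select_output_format`, the `npu_version` string argument, and the error messages are assumptions. Only the environment variable name, the per-device defaults, and the rejection of ELF on npu1 come from this PR.

```python
import os

ENV_VAR = "AMD_TRITON_NPU_OUTPUT_FORMAT"

# Defaults stated in the PR: ELF on npu2, xclbin on npu1.
DEFAULTS = {"npu1": "xclbin", "npu2": "elf"}


def select_output_format(npu_version, environ=None):
    """Pick "elf" or "xclbin" for an NPU version (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    if npu_version not in DEFAULTS:
        raise ValueError(f"unknown NPU version: {npu_version!r}")

    # An explicit override wins over the auto-detected default.
    override = environ.get(ENV_VAR, "").strip().lower()
    fmt = override or DEFAULTS[npu_version]
    if fmt not in ("elf", "xclbin"):
        raise ValueError(f"{ENV_VAR} must be 'elf' or 'xclbin', got {fmt!r}")

    # ELF is only supported on npu2, so reject the npu1 + elf combination.
    if fmt == "elf" and npu_version == "npu1":
        raise RuntimeError("ELF output is only supported on npu2")
    return fmt
```

Passing the environment as an argument keeps the sketch easy to test without mutating `os.environ`; the real function may read the variable directly.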
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `amd_triton_npu/backend/driver.py` | Implements format selection, ELF launcher path, aircc ELF build flags, and format-aware caching/loading. |
| `CLAUDE.md` | Adds a large development guide describing the project, build, CI, and compilation pipeline (including ELF on npu2). |
Comments suppressed due to low confidence (1)
amd_triton_npu/backend/driver.py:10
`tempfile` is imported twice (`import tempfile` and again in `import os, subprocess, tempfile, platform`). This is redundant and can confuse linting/formatting; remove the duplicate import and keep a single style for imports.
import hashlib
import json
import tempfile
import sys
import sysconfig
import os, subprocess, tempfile, platform
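One possible cleanup, assuming nothing in the file depends on the original import order, is one module per line with `tempfile` listed once. This is a suggestion sketch, not the change that was merged:

```python
# Same modules as the quoted snippet, deduplicated and sorted alphabetically.
import hashlib
import json
import os
import platform
import subprocess
import sys
import sysconfig
import tempfile
```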
3bb0c92 to fd9b976
ELF is a newer, self-contained XRT binary format that replaces the two-file xclbin+insts.bin workflow. It embeds instructions via PDI and simplifies deployment. ELF is only supported on NPU2 (AIE2P/Strix).

NPU2 now defaults to ELF format. NPU1 (AIE2) continues to use xclbin unchanged. The format can be overridden via AMD_TRITON_NPU_OUTPUT_FORMAT.

Key changes to driver.py:
- Add _get_output_format() for auto-detection and env var override
- Add _generate_elf_launcher() using xrt::elf, xrt::ext::kernel, xrt::ext::bo, and xrt::run APIs (no instruction binary needed)
- Add _extract_elf_kernel_name() to read kernel name from ELF config
- Modify compile_module() for conditional linker flags, aircc flags (--output-format elf --elf-name), caching (2 vs 3 files), and module loading
- Modify NPULauncher to dispatch to correct launcher generator

Tested: 14/15 examples pass on NPU2 with ELF (matvec fails in both ELF and xclbin modes -- pre-existing issue).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- `aie.elf` binary
- `AMD_TRITON_NPU_OUTPUT_FORMAT` env var allows overriding the auto-detected default (`"elf"` on npu2, `"xclbin"` on npu1)

What changed in `driver.py`
- `_get_output_format()`: auto-detects format based on NPU version, with env var override and npu1+elf validation
- `_generate_elf_launcher()`: C++ launcher template using `xrt::elf`, `xrt::ext::kernel`, `xrt::ext::bo`, and `xrt::run` APIs
- `_extract_elf_kernel_name()`: parses `full_elf_config.json` to discover the kernel name (e.g., `main:vecadd`) since ELF uses IR-derived names instead of `MLIR_AIE`
- `compile_module()`: conditional linker flags (ELF doesn't need test_utils/boost), aircc invocation (`--output-format elf --elf-name`), caching (2 files vs 3), and module loading
- `NPULauncher.__init__()`: dispatches to correct launcher generator based on format

Test plan
- `AMD_TRITON_NPU_OUTPUT_FORMAT=xclbin` on NPU2: same failure (confirms pre-existing)
- `AMD_TRITON_NPU_OUTPUT_FORMAT=elf` on NPU1: should raise RuntimeError

🤖 Generated with Claude Code
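To illustrate the format-dependent `aircc` invocation mentioned for `compile_module()`, here is a small sketch of how the extra flags and expected device binaries might be chosen. The helper names, the assumption that `--elf-name` takes the output file name as its value, the empty flag list for xclbin, and the `aie.xclbin` file name are all hypothetical. Only `--output-format elf --elf-name`, `aie.elf`, and the xclbin+insts.bin pairing come from this PR.

```python
def format_specific_flags(fmt, elf_name="aie.elf"):
    """Extra aircc flags for the chosen output format (hypothetical helper)."""
    if fmt == "elf":
        # Flags named in the PR; the value passed to --elf-name is assumed.
        return ["--output-format", "elf", "--elf-name", elf_name]
    if fmt == "xclbin":
        # Assumption: the existing xclbin flags are set elsewhere, unchanged.
        return []
    raise ValueError(f"unsupported output format: {fmt!r}")


def device_binaries(fmt):
    """Device binaries the launcher would load for each format (sketch)."""
    if fmt == "elf":
        return ["aie.elf"]  # single self-contained binary
    if fmt == "xclbin":
        # "aie.xclbin" is a placeholder name; insts.bin is named in the PR.
        return ["aie.xclbin", "insts.bin"]
    raise ValueError(f"unsupported output format: {fmt!r}")
```

The one-file versus two-file difference in `device_binaries` is presumably what drives the "2 files vs 3" caching change, with the compiled launcher accounting for the remaining cached file, though the PR text does not spell this out.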