Add AMD_TRITON_NPU_TARGET env var for cross-compilation #46

Merged
erwei-xilinx merged 1 commit into main from add-npu-target-env-var on Apr 11, 2026

@erwei-xilinx (Collaborator) commented Apr 9, 2026

Summary

  • Adds AMD_TRITON_NPU_TARGET environment variable (npu1 or npu2) that overrides hardware detection in detect_npu_version(), skipping the xrt-smi query entirely
  • Enables AMD_TRITON_NPU_COMPILE_ONLY=1 to work on machines without a matching local NPU device (cross-compilation)
  • Invalid values produce a clear RuntimeError with supported options
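
As a rough illustration of the override described in the summary, the logic could look like the sketch below. This is not the code from the PR: the `SUPPORTED_TARGETS` tuple, the `_query_xrt_smi` stub, the optional `env` parameter, and the error wording are assumptions made for the example. Only `AMD_TRITON_NPU_TARGET`, its `npu1`/`npu2` values, `detect_npu_version()`, and the `RuntimeError` on invalid values come from the description.

```python
import os

# Assumption: supported targets as listed in the PR description.
SUPPORTED_TARGETS = ("npu1", "npu2")


def _query_xrt_smi():
    """Hypothetical stand-in for the real hardware query via xrt-smi."""
    raise RuntimeError("no NPU device found (stub for the xrt-smi query)")


def detect_npu_version(env=None):
    """Return the NPU target, preferring the environment override.

    The `env` parameter is an addition for this sketch so the function
    can be exercised without touching the real process environment.
    """
    env = os.environ if env is None else env
    target = env.get("AMD_TRITON_NPU_TARGET")
    if target is not None:
        if target not in SUPPORTED_TARGETS:
            raise RuntimeError(
                f"Invalid AMD_TRITON_NPU_TARGET={target!r}; "
                f"supported values: {', '.join(SUPPORTED_TARGETS)}"
            )
        # Valid override: return it without querying xrt-smi at all.
        return target
    # No override set: fall back to auto-detection.
    return _query_xrt_smi()
```

Under these assumptions, `detect_npu_version({"AMD_TRITON_NPU_TARGET": "npu2"})` returns the target without touching the stub, while an unsupported value such as `npu3` raises before any hardware query is attempted.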

Closes #42

Test plan

  • Verify AMD_TRITON_NPU_TARGET=npu2 AMD_TRITON_NPU_COMPILE_ONLY=1 python examples/vec-add/vec-add.py compiles without querying xrt-smi
  • Verify AMD_TRITON_NPU_TARGET=npu3 raises RuntimeError with a clear message
  • Verify no regression: without AMD_TRITON_NPU_TARGET set, behavior is unchanged (auto-detect via xrt-smi)
  • Verify AMD_TRITON_NPU_TARGET=npu1 AMD_TRITON_NPU_OUTPUT_FORMAT=elf still raises the "ELF not supported on npu1" error
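
The last check in the test plan (an explicit `npu1` target combined with ELF output must still fail) could be automated along the lines of the sketch below. The `validate_target_and_format` and `raises_runtime_error` helpers are hypothetical and exist only for this example; the real project performs this validation elsewhere, and its exact error text may differ from the one quoted in the test plan.

```python
import os
from unittest import mock


def validate_target_and_format(env):
    """Hypothetical check of the target/output-format combination."""
    target = env.get("AMD_TRITON_NPU_TARGET")
    output_format = env.get("AMD_TRITON_NPU_OUTPUT_FORMAT")
    if target == "npu1" and output_format == "elf":
        # Wording follows the test plan; the real message may differ.
        raise RuntimeError("ELF not supported on npu1")
    return target, output_format


def raises_runtime_error(overrides):
    """Run the check under a temporarily patched os.environ."""
    with mock.patch.dict(os.environ, overrides):
        try:
            validate_target_and_format(os.environ)
        except RuntimeError:
            return True
    return False
```

Using `mock.patch.dict` restores `os.environ` when the block exits, so each case runs with only the variables it sets and does not leak into later tests.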

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings April 9, 2026 23:54
When AMD_TRITON_NPU_TARGET is set to 'npu1' or 'npu2', detect_npu_version() returns the
specified target directly without querying xrt-smi. This enables
AMD_TRITON_NPU_COMPILE_ONLY to work on machines without a matching
local NPU device.

Closes #42

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI (Contributor) left a comment

Pull request overview

Adds an explicit NPU target override via environment variable to enable cross-compilation workflows without requiring local NPU hardware detection.

Changes:

  • Introduces AMD_TRITON_NPU_TARGET to override detect_npu_version() and bypass xrt-smi device querying.
  • Validates AMD_TRITON_NPU_TARGET against supported internal NPU versions and raises a clear RuntimeError on invalid values.

@erwei-xilinx merged commit df410ad into main Apr 11, 2026
12 of 13 checks passed
@erwei-xilinx deleted the add-npu-target-env-var branch April 11, 2026 00:56
Development

Successfully merging this pull request may close these issues.

Feature: Mechanism to avoid querying the NPU type

3 participants