Add get_npu_cache_dir() to expose NPU binary cache path #45

Merged
erwei-xilinx merged 3 commits into main from feature/expose-npu-cache-dir
Apr 11, 2026
Conversation

@erwei-xilinx

Collaborator

Summary

  • Add get_npu_cache_dir(compiled_kernel) utility function that returns the path to the NPU binary cache directory containing aie.xclbin/aie.elf, insts.bin, and __npu_dispatch.so
  • Add on_cache_resolved callback mechanism in compile_module() so NPULauncher captures the cache directory on first kernel launch
  • Only amd_triton_npu/backend/driver.py is modified
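The callback mechanism described above can be sketched roughly as follows. This is a minimal illustration, not the code in driver.py: the `compile_module` body, the `LauncherSketch` class, the `npu_cache_dir` attribute, and the temporary directory standing in for the Triton cache are all assumptions made for the example. Only the `on_cache_resolved` parameter name comes from the PR description.

```python
import os
import tempfile
from typing import Callable, Optional


def compile_module(source: str,
                   on_cache_resolved: Optional[Callable[[str], None]] = None) -> str:
    """Hypothetical stand-in for the backend's compile_module()."""
    # Assumption: a temp dir plays the role of the per-kernel Triton cache dir.
    cache_dir = tempfile.mkdtemp(prefix="npu_cache_")
    # Write a placeholder artifact so the directory is not empty.
    with open(os.path.join(cache_dir, "insts.bin"), "wb") as f:
        f.write(b"\x00")
    # Report the resolved directory to the caller, if it asked for it.
    if on_cache_resolved is not None:
        on_cache_resolved(cache_dir)
    return cache_dir


class LauncherSketch:
    """Hypothetical stand-in for NPULauncher."""

    def __init__(self, source: str) -> None:
        self.source = source
        self.npu_cache_dir: Optional[str] = None  # unknown until first launch

    def _remember_cache_dir(self, path: str) -> None:
        self.npu_cache_dir = path

    def __call__(self) -> None:
        # Compile only on the first launch; the callback records the path.
        if self.npu_cache_dir is None:
            compile_module(self.source, on_cache_resolved=self._remember_cache_dir)


launcher = LauncherSketch("kernel source")
assert launcher.npu_cache_dir is None  # nothing captured before launch
launcher()
print(launcher.npu_cache_dir)
```

Passing a bound method as the callback keeps `compile_module` unaware of the launcher, which is presumably why the PR could limit its changes to a single file.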

Usage

from triton.backends.amd_triton_npu.driver import get_npu_cache_dir

compiled_kernel = my_kernel[grid](a, b, c, N, BLOCK_SIZE_N=1024)
npu_cache = get_npu_cache_dir(compiled_kernel)
# npu_cache == "/home/user/.triton/cache/XXXXX/"
# Contains: __npu_dispatch.so, aie.xclbin (or aie.elf), insts.bin
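Once the path is available, a caller might want to check which artifacts are actually present. The helper below is a sketch only: `find_npu_artifacts` is a hypothetical name that is not part of this PR, and the file names are copied from the description above.

```python
from pathlib import Path
from typing import Dict, Optional

# Artifact names taken from the PR description; either aie.xclbin or
# aie.elf is expected, depending on the output format.
EXPECTED_ARTIFACTS = ("__npu_dispatch.so", "aie.xclbin", "aie.elf", "insts.bin")


def find_npu_artifacts(cache_dir: Optional[str]) -> Dict[str, Path]:
    """Return {artifact name: path} for expected files found in cache_dir."""
    if cache_dir is None:
        # get_npu_cache_dir() returns None before the first launch.
        return {}
    root = Path(cache_dir)
    return {name: root / name
            for name in EXPECTED_ARTIFACTS
            if (root / name).is_file()}
```

With the usage snippet above, `find_npu_artifacts(npu_cache)` would return an empty dict before the first launch and a mapping of the files found afterwards.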

Closes #41

Test plan

  • Basic API tests pass (None input, missing _run, wrong _run type raises TypeError)
  • Triton compile stage (TTIR → ttsharedir) works; get_npu_cache_dir returns None before launch
  • Hardware test on npu1 (Phoenix): get_npu_cache_dir returns valid cache directory with aie.xclbin, insts.bin, __npu_dispatch.so
  • Verified on both cache-miss (first compile) and cache-hit paths
  • All npu1 examples pass: relu, axpy, sigmoid, vec-add, matmul_bf16

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings April 9, 2026 23:29
The NPU backend stores hardware artifacts (xclbin/elf, insts.bin,
dispatch .so) in a separate triton cache directory from the main
compiler cache. Users had no way to programmatically retrieve this
path after compilation.

Add a callback mechanism in compile_module() and expose it through
a public get_npu_cache_dir() utility function that returns the cache
directory given a CompiledKernel.

Closes #41

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Contributor

Pull request overview

This PR exposes the NPU backend’s binary cache directory for a given compiled Triton kernel by capturing the cache directory during first launch and providing a helper to retrieve it from the compiled kernel object.

Changes:

  • Extend compile_module() with an on_cache_resolved callback invoked with the NPU cache directory.
  • Store the resolved cache directory on NPULauncher instances at launch time.
  • Add get_npu_cache_dir(compiled_kernel) utility to return the captured cache directory (or None before first launch).
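The lookup side of this change might look roughly like the sketch below. It is written under stated assumptions and is not the merged implementation: `NPULauncherSketch`, `get_npu_cache_dir_sketch`, and the `npu_cache_dir` attribute are hypothetical names, and the split between returning None and raising TypeError is one plausible reading of the test plan. The `_run` attribute name is taken from the test plan.

```python
from typing import Optional


class NPULauncherSketch:
    """Hypothetical stand-in for NPULauncher; models one attribute only."""

    def __init__(self) -> None:
        # Filled in by the on_cache_resolved callback at first launch.
        self.npu_cache_dir: Optional[str] = None


def get_npu_cache_dir_sketch(compiled_kernel) -> Optional[str]:
    # Assumption: a None kernel, or one without a _run launcher,
    # simply has no cache directory to report.
    launcher = getattr(compiled_kernel, "_run", None)
    if launcher is None:
        return None
    # Assumption: a _run of any other type is treated as a caller error.
    if not isinstance(launcher, NPULauncherSketch):
        raise TypeError(
            f"expected NPULauncherSketch, got {type(launcher).__name__}")
    # None until the first launch, then the captured directory.
    return launcher.npu_cache_dir
```

Returning None for the "not launched yet" case lets callers poll without wrapping the call in a try block, while the TypeError flags kernels compiled for a different backend.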

Comment thread amd_triton_npu/backend/driver.py
Comment thread amd_triton_npu/backend/driver.py Outdated
Comment thread amd_triton_npu/backend/driver.py Outdated
erwei-xilinx and others added 2 commits April 9, 2026 16:36
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- List per-format artifacts (xclbin vs elf) instead of a single list
- Clarify when None is returned vs when TypeError is raised

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@ypapadop-amd ypapadop-amd left a comment


LGTM

@erwei-xilinx erwei-xilinx merged commit 9db6dee into main Apr 11, 2026
8 of 9 checks passed
@erwei-xilinx erwei-xilinx deleted the feature/expose-npu-cache-dir branch April 11, 2026 00:55
Development

Successfully merging this pull request may close these issues.

Feature: Mechanism to give the path to the cached xclbin / elf

3 participants