Merged
38 changes: 32 additions & 6 deletions README.md
@@ -1,12 +1,38 @@
# Triton-XDNA
-This repository contains a plugin for building AIR as Triton's compiler backend.

**An experimental open-source project demonstrating compiler-driven kernel generation for AMD XDNA NPUs using [Triton](https://github.com/triton-lang/triton) and [MLIR-AIR](https://github.com/Xilinx/mlir-air).**

Triton-XDNA provides an end-to-end compilation flow that lowers standard Triton kernels directly to AMD NPU hardware — no prebuilt kernel libraries required. It bridges Triton's high-level parallel programming model with AMD's MLIR-AIR/AIE compilation stack, producing XRT-compatible binaries for AMD AI Engine architectures (AIE2 and AIE2P).

### How it works

Triton kernels are first lowered to compact Linalg compute graphs via [triton-shared](https://github.com/microsoft/triton-shared), then tiled and mapped onto parallel NPU cores using the MLIR Transform dialect, and finally compiled through [MLIR-AIR](https://github.com/Xilinx/mlir-air) and [MLIR-AIE](https://github.com/Xilinx/mlir-aie) to produce device binaries.

```
Triton kernel (@triton.jit)
-> triton-shared (Linalg)
-> MLIR Transform dialect (tiling, bufferization, vectorization)
-> MLIR-AIR / MLIR-AIE
-> XRT binary (aie.xclbin)
```
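
As a rough illustration of what the tiling step in the flow above means, the pure-Python sketch below splits a matrix multiplication into independent output tiles. It is not code from Triton-XDNA, triton-shared, or MLIR-AIR: the function names and tile sizes are hypothetical, and the real flow does this on Linalg IR via the MLIR Transform dialect rather than in Python.

```python
def matmul_naive(a, b):
    """Reference result: plain triple loop over row-major lists of lists."""
    m, k, n = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]


def matmul_tiled(a, b, tile_m=2, tile_n=2, tile_k=2):
    """Same product, computed tile by tile (tile sizes are illustrative)."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    # Each (i0, j0) output tile depends only on its own rows of `a` and
    # columns of `b`, so different tiles could be assigned to different cores.
    for i0 in range(0, m, tile_m):
        for j0 in range(0, n, tile_n):
            # The reduction dimension is also walked in tile_k-sized chunks.
            for p0 in range(0, k, tile_k):
                for i in range(i0, min(i0 + tile_m, m)):
                    for j in range(j0, min(j0 + tile_n, n)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + tile_k, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c
```

The `min(...)` bounds handle matrix shapes that are not a multiple of the tile size, which is why the tiled result matches the naive one for any input shape.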

### Key results

- For dense matrix multiplication (I8/I16/BF16), compiler-generated kernels achieve **performance parity with handwritten NPU implementations**
- Over **90% of tested matmul configurations reach at least 90% of baseline throughput**; no configuration falls below 80%
- Currently supports matrix multiplication, elementwise operations, softmax, and layer normalization
- Complex compute graphs with reductions and broadcasts are mapped onto parallel NPU tiles
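
The two matmul statistics quoted above (the share of configurations at or above 90% of baseline, and the worst case) can be computed from per-configuration throughput ratios as sketched below. The function name and the sample ratios are made up for illustration; they are not measurements from this project.

```python
def summarize_ratios(ratios):
    """Summarize throughput ratios (generated kernel / handwritten baseline)."""
    if not ratios:
        raise ValueError("need at least one configuration")
    return {
        # Fraction of configurations reaching at least 90% of baseline.
        "share_at_or_above_90pct": sum(r >= 0.90 for r in ratios) / len(ratios),
        # Slowest configuration relative to baseline.
        "worst_ratio": min(ratios),
    }


# Placeholder ratios for illustration only, NOT project data.
example = [1.02, 0.97, 0.95, 0.93, 0.99, 1.00, 0.91, 0.96, 0.94, 0.85]
summary = summarize_ratios(example)
```

With real benchmark data, the claim in the list above would correspond to `share_at_or_above_90pct > 0.9` and `worst_ratio >= 0.8`.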

### Contributing

This is an experimental project and we welcome community contributions. Whether it's adding support for new kernel types, improving performance, or extending platform support — we'd love to collaborate.

## Usage

### Clone the repository
```
-git clone https://github.com/AARInternal/triton-xdna.git
-cd triton-xdna
+git clone https://github.com/amd/Triton-XDNA.git
+cd Triton-XDNA
git submodule update --init
```

Expand All @@ -27,7 +53,7 @@ python3 -m pip install --upgrade pip

# Install triton-xdna from GitHub Releases
pip install triton-xdna \
---find-links https://github.com/AARInternal/triton-xdna/releases/expanded_assets/latest-wheels \
+--find-links https://github.com/amd/Triton-XDNA/releases/expanded_assets/latest-wheels \
--find-links https://github.com/Xilinx/mlir-aie/releases/expanded_assets/latest-wheels-no-rtti \
--find-links https://github.com/Xilinx/llvm-aie/releases/expanded_assets/nightly \
--find-links https://github.com/Xilinx/mlir-air/releases/expanded_assets/latest-air-wheels-no-rtti
@@ -52,7 +78,7 @@ python3 -m pip install --upgrade pip
pip install cmake pybind11 nanobind wheel ninja pytest setuptools Cython

# Install triton-xdna from source and all dependencies automatically
-pip install . \
+pip install . --no-build-isolation \
--find-links https://github.com/Xilinx/mlir-aie/releases/expanded_assets/latest-wheels-no-rtti \
--find-links https://github.com/Xilinx/llvm-aie/releases/expanded_assets/nightly \
--find-links https://github.com/Xilinx/mlir-air/releases/expanded_assets/latest-air-wheels-no-rtti
@@ -74,7 +100,7 @@ python3 -m pip install --upgrade pip
pip install cmake pybind11 nanobind wheel ninja pytest setuptools Cython
source utils/env_setup.sh

-cmake cmake -GNinja -S . -Bbuild
+cmake -GNinja -S . -Bbuild
cd build
ninja
```
4 changes: 2 additions & 2 deletions ci/docker-based/loop_docker_ci.sh
@@ -6,8 +6,8 @@
set -x

IMAGE_NAME="triton-xdna-public-dev-github-runner"
-GITHUB_OWNER="AARInternal"
-GITHUB_REPO="triton-xdna"
+GITHUB_OWNER="amd"
+GITHUB_REPO="Triton-XDNA"

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
GITHUB_PAT=$(cat "${SCRIPT_DIR}/secret_github_token")
4 changes: 2 additions & 2 deletions ci/docker-based/test_docker_ci.sh
@@ -4,8 +4,8 @@
# SPDX-License-Identifier: MIT

IMAGE_NAME="triton-xdna-public-dev-github-runner"
-GITHUB_OWNER="AARInternal"
-GITHUB_REPO="triton-xdna"
+GITHUB_OWNER="amd"
+GITHUB_REPO="Triton-XDNA"

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
GITHUB_PAT=$(cat "${SCRIPT_DIR}/secret_github_token")
6 changes: 3 additions & 3 deletions pyproject.toml
@@ -25,14 +25,14 @@ classifiers = [

# Installation command:
# pip install triton-xdna \
-# --find-links https://github.com/AARInternal/triton-xdna/releases/expanded_assets/latest-wheels \
+# --find-links https://github.com/amd/Triton-XDNA/releases/expanded_assets/latest-wheels \
# --find-links https://github.com/Xilinx/mlir-aie/releases/expanded_assets/latest-wheels-no-rtti \
# --find-links https://github.com/Xilinx/llvm-aie/releases/expanded_assets/nightly \
# --find-links https://github.com/Xilinx/mlir-air/releases/expanded_assets/latest-air-wheels-no-rtti

[project.urls]
-Homepage = "https://github.com/AARInternal/triton-xdna"
-Repository = "https://github.com/AARInternal/triton-xdna.git"
+Homepage = "https://github.com/amd/Triton-XDNA"
+Repository = "https://github.com/amd/Triton-XDNA.git"

[project.entry-points."triton.backends"]
amd_triton_npu = "triton.backends.amd_triton_npu"
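
The `[project.entry-points."triton.backends"]` table above registers the backend in the installed package's metadata. A minimal sketch for inspecting that registration after installation, using only the standard library, might look like the following. The helper name `list_triton_backends` is hypothetical, and the snippet only reads package metadata; it makes no claim about how Triton itself consumes the entry point.

```python
from importlib.metadata import entry_points


def list_triton_backends(group="triton.backends"):
    """Return sorted (name, target) pairs registered under an entry-point group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points are filtered with select().
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a dict keyed by group name.
        selected = eps.get(group, [])
    return sorted((ep.name, ep.value) for ep in selected)
```

If the package from this `pyproject.toml` is installed, the returned list would be expected to contain `("amd_triton_npu", "triton.backends.amd_triton_npu")`; in an environment without it, the list may simply be empty.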
2 changes: 1 addition & 1 deletion setup.py
@@ -797,7 +797,7 @@ def run(self):
description="Triton compiler with MLIR-AIR backend for AMD NPU devices",
long_description=(BASE_DIR / "README.md").read_text(),
long_description_content_type="text/markdown",
-url="https://github.com/AARInternal/triton-xdna",
+url="https://github.com/amd/Triton-XDNA",
license="MIT",
packages=[], # No packages - we build from triton
install_requires=get_install_requires(),
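
Because this change renames the repository owner across several files, a quick scan for leftover references can help confirm nothing was missed. The sketch below is one hedged way to do that; the helper name, the default search string, and the list of file suffixes are assumptions chosen to match the files touched in this diff, not part of the repository.

```python
from pathlib import Path


def find_stale_references(root, needle="AARInternal",
                          suffixes=(".md", ".sh", ".toml", ".py")):
    """Return (relative_path, line_number) pairs for lines containing `needle`."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(errors="ignore")
            for lineno, line in enumerate(text.splitlines(), 1):
                if needle in line:
                    hits.append((str(path.relative_to(root)), lineno))
    return hits
```

Calling `find_stale_references(".")` from the repository root returns an empty list when no matching file still mentions the old owner.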