Commit 0f2c2d2

Merge pull request #4 from erwei-xilinx/update-readme-and-fix-urls
Update README with project overview and fix obsolete URLs
2 parents 0ebbd7d + 7c75102 commit 0f2c2d2

5 files changed

Lines changed: 40 additions & 14 deletions
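The URL fixes in this commit all remove the obsolete GitHub owner `AARInternal`. A minimal sketch of a follow-up check for leftover references is given here; the helper name `find_stale_references` and the sample text are illustrative assumptions, not part of the repository. The sample pairs one pre-commit line with one post-commit line from `pyproject.toml`.

```python
OLD_OWNER = "AARInternal"  # obsolete GitHub owner that this commit replaces with "amd"


def find_stale_references(text, needle=OLD_OWNER):
    """Return (line_number, stripped_line) for every line that still names the old owner."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if needle in line
    ]


# Illustrative sample only: one stale line and one already-updated line.
sample = """\
[project.urls]
Homepage = "https://github.com/AARInternal/triton-xdna"
Repository = "https://github.com/amd/Triton-XDNA.git"
"""

for number, line in find_stale_references(sample):
    print(f"line {number}: {line}")
```

Searching for the owner alone, rather than the full `owner/repo` slug, also catches the CI scripts, where `GITHUB_OWNER` and `GITHUB_REPO` are set on separate lines.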

README.md

Lines changed: 32 additions & 6 deletions
@@ -1,12 +1,38 @@
 # Triton-XDNA
-This repository contains a plugin for building AIR as Triton's compiler backend.
+
+**An experimental open-source project demonstrating compiler-driven kernel generation for AMD XDNA NPUs using [Triton](https://github.com/triton-lang/triton) and [MLIR-AIR](https://github.com/Xilinx/mlir-air).**
+
+Triton-XDNA provides an end-to-end compilation flow that lowers standard Triton kernels directly to AMD NPU hardware — no prebuilt kernel libraries required. It bridges Triton's high-level parallel programming model with AMD's MLIR-AIR/AIE compilation stack, producing XRT-compatible binaries for AMD AI Engine architectures (AIE2 and AIE2P).
+
+### How it works
+
+Triton kernels are first lowered to compact Linalg compute graphs via [triton-shared](https://github.com/microsoft/triton-shared), then tiled and mapped onto parallel NPU cores using the MLIR Transform dialect, and finally compiled through [MLIR-AIR](https://github.com/Xilinx/mlir-air) and [MLIR-AIE](https://github.com/Xilinx/mlir-aie) to produce device binaries.
+
+```
+Triton kernel (@triton.jit)
+-> triton-shared (Linalg)
+-> MLIR Transform dialect (tiling, bufferization, vectorization)
+-> MLIR-AIR / MLIR-AIE
+-> XRT binary (aie.xclbin)
+```
+
+### Key results
+
+- For dense matrix multiplication (I8/I16/BF16), compiler-generated kernels achieve **performance parity with handwritten NPU implementations**
+- Over **90% of tested matmul configurations reach at least 90% of baseline throughput**; no configuration falls below 80%
+- Currently supports matrix multiplication, elementwise operations, softmax, and layer normalization
+- Complex compute graphs with reductions and broadcasts are mapped onto parallel NPU tiles
+
+### Contributing
+
+This is an experimental project and we welcome community contributions. Whether it's adding support for new kernel types, improving performance, or extending platform support — we'd love to collaborate.

 ## Usage

 ### Clone the repository
 ```
-git clone https://github.com/AARInternal/triton-xdna.git
-cd triton-xdna
+git clone https://github.com/amd/Triton-XDNA.git
+cd Triton-XDNA
 git submodule update --init
 ```
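The second "Key results" bullet in the hunk above combines two statistics: the share of configurations at or above 90% of baseline throughput, and the worst-case ratio. The README does not say how these were computed. The sketch below shows one plausible reading; the function name `summarize_ratios` is hypothetical and the throughput numbers are placeholders, not measurements from the project.

```python
def summarize_ratios(pairs, threshold=0.90):
    """Summarize (generated, baseline) throughput pairs.

    Returns the fraction of configurations whose generated/baseline ratio
    is at least `threshold`, together with the smallest ratio observed.
    """
    ratios = [generated / baseline for generated, baseline in pairs]
    share_at_threshold = sum(r >= threshold for r in ratios) / len(ratios)
    return share_at_threshold, min(ratios)


# Placeholder values for illustration only; NOT project measurements.
placeholder = [(95.0, 100.0), (101.0, 100.0), (92.0, 100.0), (85.0, 100.0)]

share, worst = summarize_ratios(placeholder)
print(f"share at >= 90% of baseline: {share:.0%}, worst ratio: {worst:.0%}")
```

Under this reading, the README's claim corresponds to `share > 0.90` and `worst >= 0.80` over the tested matmul configurations.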

@@ -27,7 +53,7 @@ python3 -m pip install --upgrade pip

 # Install triton-xdna from GitHub Releases
 pip install triton-xdna \
---find-links https://github.com/AARInternal/triton-xdna/releases/expanded_assets/latest-wheels \
+--find-links https://github.com/amd/Triton-XDNA/releases/expanded_assets/latest-wheels \
 --find-links https://github.com/Xilinx/mlir-aie/releases/expanded_assets/latest-wheels-no-rtti \
 --find-links https://github.com/Xilinx/llvm-aie/releases/expanded_assets/nightly \
 --find-links https://github.com/Xilinx/mlir-air/releases/expanded_assets/latest-air-wheels-no-rtti
@@ -52,7 +78,7 @@ python3 -m pip install --upgrade pip
 pip install cmake pybind11 nanobind wheel ninja pytest setuptools Cython

 # Install triton-xdna from source and all dependencies automatically
-pip install . \
+pip install . --no-build-isolation \
 --find-links https://github.com/Xilinx/mlir-aie/releases/expanded_assets/latest-wheels-no-rtti \
 --find-links https://github.com/Xilinx/llvm-aie/releases/expanded_assets/nightly \
 --find-links https://github.com/Xilinx/mlir-air/releases/expanded_assets/latest-air-wheels-no-rtti
@@ -74,7 +100,7 @@ python3 -m pip install --upgrade pip
 pip install cmake pybind11 nanobind wheel ninja pytest setuptools Cython
 source utils/env_setup.sh

-cmake cmake -GNinja -S . -Bbuild
+cmake -GNinja -S . -Bbuild
 cd build
 ninja
 ```

ci/docker-based/loop_docker_ci.sh

Lines changed: 2 additions & 2 deletions
@@ -6,8 +6,8 @@
 set -x

 IMAGE_NAME="triton-xdna-public-dev-github-runner"
-GITHUB_OWNER="AARInternal"
-GITHUB_REPO="triton-xdna"
+GITHUB_OWNER="amd"
+GITHUB_REPO="Triton-XDNA"

 SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 GITHUB_PAT=$(cat "${SCRIPT_DIR}/secret_github_token")

ci/docker-based/test_docker_ci.sh

Lines changed: 2 additions & 2 deletions
@@ -4,8 +4,8 @@
 # SPDX-License-Identifier: MIT

 IMAGE_NAME="triton-xdna-public-dev-github-runner"
-GITHUB_OWNER="AARInternal"
-GITHUB_REPO="triton-xdna"
+GITHUB_OWNER="amd"
+GITHUB_REPO="Triton-XDNA"

 SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 GITHUB_PAT=$(cat "${SCRIPT_DIR}/secret_github_token")

pyproject.toml

Lines changed: 3 additions & 3 deletions
@@ -26,14 +26,14 @@ classifiers = [

 # Installation command:
 # pip install triton-xdna \
-# --find-links https://github.com/AARInternal/triton-xdna/releases/expanded_assets/latest-wheels \
+# --find-links https://github.com/amd/Triton-XDNA/releases/expanded_assets/latest-wheels \
 # --find-links https://github.com/Xilinx/mlir-aie/releases/expanded_assets/latest-wheels-no-rtti \
 # --find-links https://github.com/Xilinx/llvm-aie/releases/expanded_assets/nightly \
 # --find-links https://github.com/Xilinx/mlir-air/releases/expanded_assets/latest-air-wheels-no-rtti

 [project.urls]
-Homepage = "https://github.com/AARInternal/triton-xdna"
-Repository = "https://github.com/AARInternal/triton-xdna.git"
+Homepage = "https://github.com/amd/Triton-XDNA"
+Repository = "https://github.com/amd/Triton-XDNA.git"

 [project.entry-points."triton.backends"]
 amd_triton_npu = "triton.backends.amd_triton_npu"

setup.py

Lines changed: 1 addition & 1 deletion
@@ -805,7 +805,7 @@ def run(self):
 description="Triton compiler with MLIR-AIR backend for AMD NPU devices",
 long_description=(BASE_DIR / "README.md").read_text(),
 long_description_content_type="text/markdown",
-url="https://github.com/AARInternal/triton-xdna",
+url="https://github.com/amd/Triton-XDNA",
 license="MIT",
 packages=[], # No packages - we build from triton
 install_requires=get_install_requires(),
