Building and releasing sglang-kernel as a wheel is part of the release workflow; see release-whl-kernel.yml for details.
3rdparty/amd/wheel/sglang/pyproject.toml is the AMD-specific pyproject for building the amd-sglang wheel. It extends python/pyproject_other.toml with two ROCm-version extras (rocm700, rocm720) that pin the matching torch/triton/torchaudio/torchvision/sglang-kernel wheels, and renames the package to amd-sglang.
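The extras mechanism can be sketched as below. This is an illustrative sketch only, not the actual file: the version pins shown are placeholders (`x.y.z`), and the real pinned versions live in 3rdparty/amd/wheel/sglang/pyproject.toml.

```toml
# Illustrative sketch; the real pins are in 3rdparty/amd/wheel/sglang/pyproject.toml.
[project]
name = "amd-sglang"

[project.optional-dependencies]
# Each extra pins the wheels matching one ROCm release (placeholder versions).
rocm700 = [
    "torch==x.y.z+rocm7.0.0",
    "triton==x.y.z",
    "sglang-kernel==x.y.z",
]
rocm720 = [
    "torch==x.y.z+rocm7.2.0",
    "triton==x.y.z",
    "sglang-kernel==x.y.z",
]
```

Selecting `amd-sglang[all-hip,rocm720]` at install time pulls in both the HIP feature set and the ROCm-7.2.0-matched pins.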
$ git clone https://github.com/sgl-project/sglang.git && cd sglang
$ cp 3rdparty/amd/wheel/sglang/pyproject.toml python/pyproject.toml
$ cd python && python -m build
ROCm 7.0.0:
pip uninstall sglang-kernel sglang amd-sglang
pip install "amd-sglang[all-hip,rocm700]" -i https://pypi.amd.com/rocm-7.0.0/simple --extra-index-url https://pypi.org/simple
ROCm 7.2.0:
pip uninstall sglang-kernel sglang amd-sglang
pip install "amd-sglang[all-hip,rocm720]" -i https://pypi.amd.com/rocm-7.2.0/simple --extra-index-url https://pypi.org/simple
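After installation, a quick way to confirm which distributions actually landed is to query package metadata. This is a generic sketch (not part of sglang itself); the helper name `installed_version` is our own:

```python
from importlib import metadata

def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None

# After `pip install "amd-sglang[all-hip,...]"`, these should all report versions.
for name in ("amd-sglang", "sglang-kernel", "torch", "triton"):
    print(name, "->", installed_version(name) or "not installed")
```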
Note: you must resolve the two dependencies below, AITER and triton; the others are optional depending on your application.
The amd-sglang wheel includes JIT (Just-In-Time) kernel compilation support, which compiles optimized CUDA/HIP kernels on demand at runtime.
JIT kernel compilation requires:
- apache-tvm-ffi - Included in the runtime_common dependencies (installed with amd-sglang[all-hip,...])
- System compiler toolchain - A C++ compiler compatible with your ROCm installation
  - For ROCm environments, this is typically provided by the ROCm installation
  - Ensure hipcc is available in your PATH
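A simple preflight check for the last requirement is to look up `hipcc` on the PATH before relying on JIT compilation. This is a generic sketch, not an sglang API:

```python
import shutil

# Confirm the HIP compiler is reachable; JIT kernel compilation needs it.
hipcc = shutil.which("hipcc")
if hipcc is None:
    print("hipcc not found on PATH; JIT kernel compilation will fail")
else:
    print(f"hipcc found at {hipcc}")
```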
The JIT kernel source files (.cuh, .cu headers) are bundled with the wheel and will be available at runtime for compilation.
To verify JIT kernel support is working:
from sglang.jit_kernel.utils import KERNEL_PATH
print(f"JIT kernel path: {KERNEL_PATH}")
# Should print the path to site-packages/sglang/jit_kernel

AITER is a fundamental dependency. Wheel-izing it is ongoing; until we can pin it reliably, install it manually (typically following the ROCm docker recipe).
To avoid known issues in the triton 3.5.1 installed by default, we recommend upgrading triton after installation. In a ROCm 7.0.0 environment:
pip install triton==3.6.0
For ROCm 7.2.0:
pip install https://repo.radeon.com/rocm/manylinux/rocm-rel-7.2/triton-3.6.0%2Brocm7.2.0.gitba5c1517-cp310-cp310-linux_x86_64.whl
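To check that the upgrade took effect without fully importing triton, you can compare the installed version against the recommended minimum. The `MIN_TRITON` constant and `version_tuple` helper below are our own illustration, not part of any package:

```python
from importlib import metadata

MIN_TRITON = (3, 6, 0)  # the upgrade target recommended above

def version_tuple(v: str):
    """Parse 'X.Y.Z[+local]' into a comparable tuple, dropping the local tag."""
    core = v.split("+")[0]
    return tuple(int(p) for p in core.split(".") if p.isdigit())

try:
    installed = metadata.version("triton")
    ok = version_tuple(installed) >= MIN_TRITON
    print(f"triton {installed}: {'ok' if ok else 'needs upgrade'}")
except metadata.PackageNotFoundError:
    print("triton is not installed")
```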
After upgrading, you may hit the following error during inference when PyTorch Inductor interacts with Triton metadata:

torch._inductor.exc.InductorError: AttributeError: 'KernelMetadata' object has no attribute 'cluster_dims'
A pragmatic workaround is to guard the metadata access in Inductor's Triton heuristics so it only reads cluster_dims when the attribute exists:
--- a/opt/venv/lib/python3.10/site-packages/torch/_inductor/runtime/triton_heuristics.py
+++ b/opt/venv/lib/python3.10/site-packages/torch/_inductor/runtime/triton_heuristics.py
@@ -1759,6 +1759,8 @@
else (
(binary.metadata.num_ctas, *binary.metadata.cluster_dims)
if hasattr(binary, "metadata")
+ and hasattr(binary.metadata, "num_ctas")
+ and hasattr(binary.metadata, "cluster_dims")
else ()
)
),

Install sgl-model-gateway as follows:
$ apt install openssl libssl-dev protobuf-compiler
$ export PATH="$HOME/.cargo/bin:${PATH}" \
&& curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y \
&& rustc --version && cargo --version # Prepare for a rust toolchain
$ python3 -m pip install --no-cache-dir setuptools-rust \
&& cd /sgl-workspace/sglang/sgl-model-gateway/bindings/python \
&& cargo build --release \
&& python3 -m pip install --no-cache-dir . \
&& rm -rf /root/.cache # Build and install sgl-model-gateway