sglang-kernel (formerly sgl-kernel)

Building and releasing the sglang-kernel wheel is part of the release workflow; see release-whl-kernel.yml for details.

sglang

3rdparty/amd/wheel/sglang/pyproject.toml is the AMD-specific pyproject for building the amd-sglang wheel. It extends python/pyproject_other.toml with two ROCm-version extras (rocm700, rocm720) that pin the matching torch/triton/torchaudio/torchvision/sglang-kernel wheels, and renames the package to amd-sglang.

Building the sglang wheel

$ git clone https://github.com/sgl-project/sglang.git && cd sglang
$ cp 3rdparty/amd/wheel/sglang/pyproject.toml python/pyproject.toml
$ cd python && python -m build

Installation

v0.5.9

ROCm 7.0.0:

pip uninstall sglang-kernel sglang amd-sglang
pip install "amd-sglang[all-hip,rocm700]" -i https://pypi.amd.com/rocm-7.0.0/simple --extra-index-url https://pypi.org/simple

ROCm 7.2.0:

pip uninstall sglang-kernel sglang amd-sglang
pip install "amd-sglang[all-hip,rocm720]" -i https://pypi.amd.com/rocm-7.2.0/simple --extra-index-url https://pypi.org/simple
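The two install commands above differ only in the ROCm extra and the index URL. As a rough sketch of that pattern, the mapping could be written as below. The helper name `amd_sglang_install_cmd` is hypothetical and not part of sglang; it covers only the two ROCm versions listed in this README.

```python
# Hypothetical helper (not part of sglang): rebuilds the pip command shown
# above for one of the two ROCm versions this README lists.
SUPPORTED_ROCM = ("7.0.0", "7.2.0")


def amd_sglang_install_cmd(rocm_version: str) -> list:
    if rocm_version not in SUPPORTED_ROCM:
        raise ValueError(f"unsupported ROCm version: {rocm_version}")
    # "7.0.0" -> "rocm700", "7.2.0" -> "rocm720"
    extra = "rocm" + rocm_version.replace(".", "")
    return [
        "pip", "install", f"amd-sglang[all-hip,{extra}]",
        "-i", f"https://pypi.amd.com/rocm-{rocm_version}/simple",
        "--extra-index-url", "https://pypi.org/simple",
    ]
```

Joining the returned list with spaces gives the command line to run, for example `" ".join(amd_sglang_install_cmd("7.2.0"))`.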

Note: You must resolve the two dependencies below, AITER and triton, manually. The others are optional, depending on your application.

JIT Kernel Support

The amd-sglang wheel includes JIT (Just-In-Time) kernel compilation support. JIT kernels allow for dynamic compilation of optimized CUDA/HIP kernels at runtime.

Requirements

JIT kernel compilation requires:

- apache-tvm-ffi: included in the runtime_common dependencies (installed with amd-sglang[all-hip,...])
- System compiler toolchain: a C++ compiler compatible with your ROCm installation
  - For ROCm environments, this is typically provided by the ROCm installation
  - Ensure hipcc is available in your PATH

The JIT kernel source files (.cuh, .cu headers) are bundled with the wheel and will be available at runtime for compilation.

Verification

To verify JIT kernel support is working:

from sglang.jit_kernel.utils import KERNEL_PATH
print(f"JIT kernel path: {KERNEL_PATH}")
# Should print the path to site-packages/sglang/jit_kernel
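A slightly broader check can also confirm that a compiler is on PATH and that kernel sources are actually present. The sketch below is illustrative only: the function name `jit_prereq_report` is hypothetical, and it assumes the bundled `.cuh`/`.cu` files live somewhere under the kernel directory, which this README does not state explicitly.

```python
import shutil
from pathlib import Path


def jit_prereq_report(kernel_path, compiler="hipcc"):
    """Report where the compiler was found and which kernel sources exist.

    Hypothetical helper; assumes .cuh/.cu sources sit under `kernel_path`.
    """
    root = Path(kernel_path)
    sources = []
    if root.is_dir():
        # Collect .cuh/.cu files anywhere below the kernel directory.
        sources = sorted(
            str(p.relative_to(root))
            for p in root.rglob("*")
            if p.suffix in {".cuh", ".cu"}
        )
    return {
        "compiler": shutil.which(compiler),  # None if not found on PATH
        "kernel_dir_exists": root.is_dir(),
        "sources": sources,
    }
```

With amd-sglang installed, this could be called as `jit_prereq_report(KERNEL_PATH)` using the `KERNEL_PATH` imported above. A `None` compiler entry or an empty source list suggests the JIT path will not work as installed.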

Manual Dependency Resolution

Resolving AITER

AITER is a fundamental dependency. Packaging it as a wheel is ongoing. Until it can be pinned reliably, install it manually (typically following the ROCm docker recipe).

Resolving triton

To avoid known issues in triton 3.5.1, which is installed by default, we recommend upgrading triton after installation. In a ROCm 7.0.0 environment:

pip install triton==3.6.0

or, in a ROCm 7.2.0 environment:

pip install https://repo.radeon.com/rocm/manylinux/rocm-rel-7.2/triton-3.6.0%2Brocm7.2.0.gitba5c1517-cp310-cp310-linux_x86_64.whl
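To confirm whether the upgrade is still needed, the installed version can be compared against 3.6.0. The following is a minimal sketch, assuming the distribution is registered under the name `triton` (as the pip commands above suggest). The helper names are hypothetical, and the parser handles only simple `X.Y.Z` versions with an optional local tag such as `+rocm7.2.0...`.

```python
from importlib import metadata


def release_tuple(version: str) -> tuple:
    """Parse '3.6.0+rocm7.2.0.gitba5c1517' into (3, 6, 0), ignoring the local tag."""
    public = version.split("+", 1)[0]
    parts = []
    for piece in public.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:  # pad "3.6" to (3, 6, 0)
        parts.append(0)
    return tuple(parts)


def triton_needs_upgrade(version: str, minimum=(3, 6, 0)) -> bool:
    """True if `version` is older than the recommended minimum."""
    return release_tuple(version) < minimum


def installed_triton_version():
    """Return the installed triton version string, or None if not installed."""
    try:
        return metadata.version("triton")
    except metadata.PackageNotFoundError:
        return None
```

For example, `triton_needs_upgrade(installed_triton_version() or "0")` reports whether the environment still has a triton older than 3.6.0, or none at all.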

`torch._inductor.exc.InductorError: AttributeError: 'KernelMetadata' object has no attribute 'cluster_dims'`

After upgrading, you may hit this error during inference when PyTorch Inductor interacts with Triton metadata.

A pragmatic workaround is to guard the metadata access in Inductor's Triton heuristics so it only reads cluster_dims when the attribute exists:

--- a/opt/venv/lib/python3.10/site-packages/torch/_inductor/runtime/triton_heuristics.py
+++ b/opt/venv/lib/python3.10/site-packages/torch/_inductor/runtime/triton_heuristics.py
@@ -1759,6 +1759,8 @@
                 else (
                     (binary.metadata.num_ctas, *binary.metadata.cluster_dims)
                     if hasattr(binary, "metadata")
+                    and hasattr(binary.metadata, "num_ctas")
+                    and hasattr(binary.metadata, "cluster_dims")
                     else ()
                 )
             ),
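The effect of the added guards can be seen in isolation with stand-in objects. The sketch below only mirrors the patched expression. `cta_args` is a hypothetical name, and the `SimpleNamespace` objects are placeholders, not real Triton or Inductor classes.

```python
from types import SimpleNamespace


def cta_args(binary):
    """Mirror of the guarded expression from the patch (illustration only)."""
    return (
        (binary.metadata.num_ctas, *binary.metadata.cluster_dims)
        if hasattr(binary, "metadata")
        and hasattr(binary.metadata, "num_ctas")
        and hasattr(binary.metadata, "cluster_dims")
        else ()
    )


# Placeholder objects standing in for compiled kernels.
with_cluster_dims = SimpleNamespace(
    metadata=SimpleNamespace(num_ctas=1, cluster_dims=(1, 1, 1))
)
without_cluster_dims = SimpleNamespace(metadata=SimpleNamespace(num_ctas=1))

print(cta_args(with_cluster_dims))     # tuple of num_ctas followed by cluster_dims
print(cta_args(without_cluster_dims))  # empty tuple instead of AttributeError
```

When `cluster_dims` is missing, the expression falls back to an empty tuple rather than raising. Whether an empty tuple is the right launch argument for every kernel is not verified here, which is why this is a workaround rather than a fix.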

Resolving Dependencies for Distributed Inference

sgl-model-gateway

Install sgl-model-gateway as follows:

$ apt install openssl libssl-dev protobuf-compiler
$ export PATH="$HOME/.cargo/bin:${PATH}" \
  && curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y \
  && rustc --version && cargo --version # Prepare for a rust toolchain
$ python3 -m pip install --no-cache-dir setuptools-rust \
  && cd /sgl-workspace/sglang/sgl-model-gateway/bindings/python \
  && cargo build --release \
  && python3 -m pip install --no-cache-dir . \
  && rm -rf /root/.cache # Build and install sgl-model-gateway
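Before starting the build, it can help to confirm that the required tools are reachable. This is a small sketch under assumptions: the helper name `missing_tools` is hypothetical, and the default tool list (including `protoc` as the protobuf compiler binary) is inferred from the packages and commands above rather than taken from the sgl-model-gateway documentation.

```python
import shutil


def missing_tools(tools=("openssl", "protoc", "rustc", "cargo")):
    """Return the names from `tools` that are not found on PATH.

    The default list is an assumption based on the install steps above.
    """
    return [name for name in tools if shutil.which(name) is None]
```

An empty list from `missing_tools()` means all listed tools were found. Any names returned point to a step above that did not complete or a PATH that was not updated.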
