    Repositories list

    • sglang

      Public
      SGLang is a fast serving framework for large language models and vision language models.
      Python
      3.3k20k526806 Updated Nov 4, 2025
    • sgl-project.github.io

      Public
      This is the documentation repository for SGLang. It is auto-generated from https://github.com/sgl-project/sglang/tree/main/docs.
      HTML
      218880 Updated Nov 4, 2025
    • SpecForge

      Public
      Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
      Python
      1034544413 Updated Nov 3, 2025
    • Fast and memory-efficient exact attention
      Python
      2.1k1300 Updated Nov 3, 2025
    • sglang-jax

      Public
      JAX backend for SGL
      Python
      241092610 Updated Nov 3, 2025
    • sgl-kernel-npu

      Public
      SGLang kernel library for NPU
      C++
      4465614 Updated Nov 3, 2025
    • ome

      Public
      OME is a Kubernetes operator for enterprise-grade management and serving of Large Language Models (LLMs)
      Go
      433033121 Updated Nov 3, 2025
    • whl

      Public
      Kernel Library Wheel for SGLang
      HTML
      41500 Updated Nov 2, 2025
    • rbg

      Public
      A workload for deploying LLM inference services on Kubernetes
      Go
      2494105 Updated Nov 2, 2025
    • sgl-kernel-xpu

      Public
      SGLang kernel library for Intel XPU
      Python
      1111011 Updated Oct 31, 2025
    • sgl-cookbook

      Public
      Make SGLang go brrr
      84012 Updated Oct 28, 2025
    • sgl-learning-materials

      Public
      Materials for learning SGLang
      4962900 Updated Oct 26, 2025
    • genai-bench

      Public
      Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serving systems.
      Python
      2922755 Updated Oct 24, 2025
    • sgl-test-files

      Public
      The test files for SGLang.
      2101 Updated Oct 22, 2025
    • DeepGEMM

      Public
      DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
      Cuda
      7342101 Updated Oct 22, 2025
    • FlashMLA

      Public
      FlashMLA: Efficient Multi-head Latent Attention Kernels
      C++
      896000 Updated Oct 20, 2025
    • fast-hadamard-transform

      Public
      Fast Hadamard transform in CUDA, with a PyTorch interface
      C
      47000 Updated Oct 15, 2025
    • sgl-whl

      Public
      SGLang wheels for multiple platforms
      11100 Updated Oct 13, 2025
    • tensorrt-demo

      Public
      TensorRT LLM Benchmark Configuration
      Python
      41300 Updated Jul 26, 2024