
llm-d

llm-d enables high-performance distributed inference in production on Kubernetes.


Find us on Slack, X (formerly Twitter), Bluesky, LinkedIn, Reddit, and YouTube.

llm-d is a Kubernetes-native high-performance distributed LLM inference framework that provides the fastest time-to-value and competitive performance per dollar. Built on vLLM, Kubernetes, and Inference Gateway, llm-d offers modular solutions for distributed inference with features like KV-cache aware routing and disaggregated serving.

🚀 Quick Start Guide

New to llm-d? Here's how to get started:

  1. 💬 Join our Slack: get your invite and visit llm-d.slack.com
  2. 📂 Explore our code: browse the GitHub Organization
  3. 📅 Join a meeting: add the community calendar
  4. 🎯 Pick your area: browse the Special Interest Groups

🗓️ Regular Meetings

All meetings are open to the public! 🌟

  • 📅 Weekly Standup: Every Wednesday at 12:30pm ET - Project updates and open discussion
  • 🎯 SIG Meetings: Various times throughout the week - See SIG details for schedules

Join to participate, ask questions, or just listen and learn!

🎯 Special Interest Groups (SIGs)

Want to dive deeper into specific areas? Our Special Interest Groups are focused teams working on different aspects of llm-d:

  • Inference Scheduler - Intelligent request routing and load balancing
  • Benchmarking - Performance testing and optimization
  • PD-Disaggregation - Prefill/decode separation patterns
  • KV-Disaggregation - KV caching and distributed storage
  • Installation - Kubernetes integration and deployment
  • Autoscaling - Traffic-aware autoscaling and resource management
  • Observability - Monitoring, logging, and metrics

View more SIG Details →

🤝 How to Contribute

Contributing Code

  1. Read Guidelines: Review our Code of Conduct and contribution process
  2. Sign Commits: All commits require DCO sign-off (git commit -s)
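
As a quick illustration of the sign-off requirement above, the sketch below creates a throwaway repository (the path, identity, and commit message are placeholders) and makes a signed-off commit; `git commit -s` appends a `Signed-off-by:` trailer built from your configured `user.name` and `user.email`, which is what the DCO check on pull requests looks for:

```shell
# Create a scratch repository (placeholder path) so the example is self-contained.
tmpdir=$(mktemp -d)
cd "$tmpdir"
git init -q .
git config user.name "Jane Developer"     # placeholder identity
git config user.email "jane@example.com"  # placeholder identity

echo "demo" > README.md
git add README.md
# -s (--signoff) appends the DCO sign-off trailer to the commit message.
git commit -q -s -m "docs: add demo README"

# Show the full commit message, including the trailer:
# docs: add demo README
#
# Signed-off-by: Jane Developer <jane@example.com>
git log -1 --format=%B
```

If you forget the flag on an existing commit, `git commit --amend -s` adds the trailer without changing the rest of the message.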

Ways to Contribute

  • 🐛 Bug fixes and small features - Submit PRs directly to component repos
  • 🚀 New features with APIs - Require project proposals
  • 📚 Documentation - Help improve guides and examples
  • 🧪 Testing & Benchmarking - Contribute to our test coverage
  • 💡 Experimental features - Start in llm-d-incubation org

🌐 Connect With Us

Follow llm-d across social platforms for updates, discussions, and community highlights:

❓ Need Help?

Questions? Ideas? Just want to chat? We're here to help! The llm-d community team is friendly and responsive.


License: Apache 2.0

📦 Repositories (showing 10 of 16)

  • llm-d (Shell): Achieve state-of-the-art inference performance with modern accelerators on Kubernetes
  • llm-d-inference-scheduler (Go): Inference scheduler for llm-d
  • llm-d-kv-cache (Go): Distributed KV cache scheduling and offloading libraries
  • llm-d-benchmark (Python): llm-d benchmark scripts and tooling
  • llm-d-inference-sim (Go): A lightweight vLLM simulator for mocking out replicas
  • llm-d-workload-variant-autoscaler (Go): Variant optimization autoscaler for distributed inference workloads
  • llm-d-python-template (template): Python project template for llm-d repos; use "Use this template" to create a new Python project with standard CI, linting, Prow, and governance
  • llm-d-go-template (template): Go microservice template for llm-d repos; use "Use this template" to create a new Go project with standard CI, linting, Prow, and governance
  • llm-d.github.io (JavaScript): Website for llm-d; builds the site seen at llm-d.ai
  • llm-d-infra (Makefile): CI and infrastructure required to maintain llm-d org member repos