SIG Overview

Special Interest Groups (SIGs) are the primary organizational units for coordinating work across the llm-d project. Each SIG focuses on a specific area of the project's technology stack and is responsible for driving the design, implementation, and maintenance of its components.

SIGs provide a mechanism for:

  • Focused expertise: Bringing together contributors with specialized knowledge in specific areas
  • Coordinated development: Ensuring consistent architectural decisions across related components
  • Community building: Creating smaller, more manageable groups for collaboration and mentorship
  • Accountability: Establishing clear ownership and responsibility for specific project areas

SIG Structure and Governance

SIG Leadership

Each SIG has:

  • SIG Leads (2-3 people): Responsible for overall SIG direction, coordination, and decision-making

SIG Responsibilities

  • Drive technical design and implementation in their area
  • Maintain documentation and architectural decisions
  • Coordinate with other SIGs on cross-cutting concerns
  • Mentor new contributors and grow the community
  • Participate in project-wide planning and releases

SIG Meetings

  • Regular meetings (typically weekly) for technical discussions

Relationship to Project Governance

SIGs operate within the broader llm-d project governance framework defined in PROJECT.md:

  • SIGs follow the project's lazy consensus decision-making process
  • Major cross-SIG decisions require project maintainer approval
  • All SIG work follows the project's contribution guidelines

Active Special Interest Groups

For up-to-date meeting times, see the Public Meeting Calendar.

| SIG | Focus Area | Documentation |
|---|---|---|
| SIG Inference Scheduler | Intelligent request routing, load balancing, and traffic management | Meeting Recordings and Docs; llm-d-inference-scheduler Repository |
| SIG Benchmarking | Performance testing, benchmarking frameworks, and optimization | Meeting Recordings and Docs; llm-d-benchmark Repository |
| SIG PD-Disaggregation | Prefill/decode separation, distributed serving, and workload disaggregation | Meeting Recordings and Docs; llm-d-routing-sidecar Repository |
| SIG KV-Disaggregation | KV caching, prefix caching, and distributed storage systems | Meeting Recordings and Docs; llm-d-kv-cache Repository |
| SIG Installation | Kubernetes integration, deployment tooling, and platform operations | Meeting Recordings and Docs; llm-d-modelservice Repository; llm-d-infra Repository |
| SIG Autoscaling | Traffic-aware autoscaling, resource management, and capacity planning | Meeting Recordings and Docs; workload-variant-autoscaler Repository |
| SIG Observability | Monitoring, logging, metrics, and operational visibility | Meeting Recordings and Docs; llm-d Observability Documentation |

SIG Detailed Descriptions

SIG Inference Scheduler

👥 Leadership: Nili Guy, Abdullah Gharaibeh, Vita Bortnikov

⭐️ North Star Design Document ↗️ (Google Docs)

Charter: Develop and maintain intelligent request routing and load balancing systems that optimize for latency, throughput, and resource utilization across distributed inference workloads.

Key Areas:

  • vLLM-optimized inference scheduling algorithms
  • KV-cache aware routing and load balancing
  • Integration with Kubernetes Gateway API and Inference Gateway Extension
  • Flow control and traffic shaping
  • SLA-aware request prioritization

💬 Communication:

SIG Benchmarking

👥 Leadership: Marcio A L Silva, Ashok Chandrasekar

⭐️ North Star Design Document ↗️ (Google Docs)

Charter: Establish comprehensive performance testing and benchmarking frameworks to ensure llm-d delivers optimal performance across diverse workloads and hardware configurations.

Key Areas:

  • Benchmarking frameworks and methodologies
  • Performance regression testing
  • Workload simulation and synthetic data generation
  • Hardware-specific optimization
  • Performance analysis and profiling tools

💬 Communication:

SIG PD-Disaggregation

👥 Leadership: Robert Shaw, Tyler Michael Smith

⭐️ North Star Design Document ↗️ (Google Docs)

Charter: Design and implement prefill/decode disaggregation patterns that enable efficient separation of inference workloads across heterogeneous hardware and scaling requirements.

Key Areas:

  • Prefill/decode workload separation
  • Disaggregated serving architecture
  • Cross-instance communication protocols
  • Heterogeneous hardware optimization
  • Dynamic workload balancing between Prefill and Decode instances

💬 Communication:

SIG KV-Disaggregation

👥 Leadership: Maroon Ayoub, Danny Harnik

⭐️ North Star Design Document ↗️ (Google Docs)

Charter: Design and implement distributed KV caching solutions that improve inference performance through intelligent cache management, prefix sharing, and disaggregated storage.

Key Areas:

  • Distributed KV cache architecture
  • Prefix cache hierarchies (local, remote, shared)
  • Cache-aware scheduling and routing
  • Storage optimization for inference workloads
  • Integration with vLLM's KVConnector

💬 Communication:

SIG Installation

👥 Leadership: Brent Salisbury, Greg Pereira

⭐️ North Star Design Document ↗️ (Google Docs)

Charter: Ensure llm-d integrates seamlessly with Kubernetes and provides robust deployment, scaling, and operational capabilities for production environments.

Key Areas:

  • Kubernetes-native deployment patterns
  • Helm charts and operators
  • Installation and configuration management
  • Multi-node orchestration with LeaderWorkerSet
  • Platform integration and operational best practices

💬 Communication:

SIG Autoscaling

👥 Leadership: Tamar Eilam, Abhishek Malvankar

⭐️ North Star Design Document ↗️ (Google Docs)

Charter: Develop intelligent autoscaling solutions that automatically adjust llm-d deployments based on traffic patterns, workload characteristics, and hardware utilization.

Key Areas:

  • Traffic-aware autoscaling algorithms
  • Hardware-specific scaling policies
  • Workload-based capacity planning
  • Integration with Kubernetes HPA/VPA
  • Cost-optimized scaling strategies

💬 Communication:

SIG Observability

👥 Leadership: Sally O'Malley, Roy Nissim, Benedikt Bongartz

⭐️ North Star Design Document ↗️ (Google Docs)

Charter: Provide comprehensive monitoring, logging, and observability capabilities that enable operators to understand system behavior, diagnose issues, and optimize performance.

Key Areas:

  • Metrics collection and visualization
  • Distributed tracing and logging
  • Performance monitoring and alerting
  • Operational dashboards and reporting
  • Integration with monitoring ecosystems (Prometheus, Grafana, etc.)

💬 Communication:

Getting Involved

Joining a SIG

  1. Attend a meeting: Check the project calendar for SIG meeting times
  2. Join the conversation: Participate in SIG-specific channels on Slack
  3. Review documentation: Read the SIG's charter and current initiatives
  4. Start contributing: Look for "good first issues" labeled with the SIG's area

SIG Communication Channels

SIG Formation and Evolution

Creating a New SIG

  1. Identify need: Demonstrate community interest and technical necessity
  2. Draft charter: Define scope, goals, and initial leadership
  3. Proposal process: Submit proposal following project contribution guidelines
  4. Community review: Present at the weekly project standup and gather feedback
  5. Approval: Obtain approval from project maintainers

SIG Lifecycle Management

  • Active: Regular meetings, active development, engaged community
  • Maintenance: Limited active development, focus on stability and bug fixes
  • Archived: No longer active, historical reference only

SIGs may evolve, merge, or be archived based on project needs and community engagement.

Resources

Maintenance

This document is maintained by the project maintainers and updated as SIGs evolve. For questions or suggestions about SIG structure, please reach out via: