zhouyou9505/README.md

Hi, I'm zhouyou

Senior AI Infrastructure & Cloud-Native Engineer | Shanghai, China

Email: zhouyou9505@gmail.com


Profile

A senior engineer with 10+ years of experience in high-concurrency backend systems and cloud-native infrastructure. Currently focused on Cloud-Native AI Infrastructure and Distributed LLM Inference Systems, with hands-on experience across Kubernetes-based orchestration, model serving, observability, and performance optimization.

🚀 Core Focus

Cloud-Native AI Infrastructure: Designing and operating scalable infrastructure for AI workloads, including scheduling, networking, observability, and platform reliability.

Distributed LLM Inference: Working on production inference systems, including prefill/decode (PD) disaggregation, cache-aware routing, workload orchestration, autoscaling, and runtime optimization.

High-Performance AI Networking: Exploring topology-aware scheduling and network-aware optimization for LLM inference, including RoCE, InfiniBand, and related high-performance communication patterns.

Observability & Platform Reliability: Building monitoring, tracing, metrics, and dashboarding capabilities for cloud-native and AI infrastructure using OpenTelemetry, Prometheus, and Grafana.

🎯 Open Source & Tech Stack

llm-d: Active contributor to the llm-d open-source ecosystem, focusing on distributed LLM inference, cloud-native orchestration, and production-grade AI infrastructure.


Toolbox: Go, Python | Kubernetes, KubeRay, Istio | vLLM, Ray | OpenTelemetry, Prometheus, Grafana.


📫 Connect: Active in AI Infra developer communities.

Optimizing infrastructure so that intelligence can scale freely.

Pinned

  1. llm-d/llm-d-router Public

    llm-d Router: The intelligent entry point for inference requests

    Go 236 254

  2. llm-d/llm-d-kv-cache Public

    Distributed KV cache scheduling & offloading libraries

    Go 157 143

  3. llm-d/llm-d-benchmark Public

    llm-d benchmark scripts and tooling

    Python 61 99