Is sglang's Scheduling Scope Designed for Cluster-Level Management (AIBrix + vllm Style)? #5130
CormickKneey started this conversation in General
Replies: 0 comments
Hi ~
I've been exploring sglang and noticed that its design includes features such as radix-cache scheduling and KV-cache transfer (in prefill-decode (PD) disaggregation). These suggest a caching and scheduling mechanism that might extend beyond single-node optimizations.
I'm curious: is sglang's scheduling scope intended to operate at the cluster level? In other words, is the project positioned to provide a solution similar to an AIBrix + vllm setup, managing high-throughput inference and caching efficiently across multiple nodes?
Any insights or clarifications regarding the long-term vision for distributed scheduling in sglang would be greatly appreciated. 😄😄
Thank you!