Open
Labels
bug: Something isn't working
Description
Your current environment
The output of python collect_env.py
vllm v0.11.0
🐛 Describe the bug
Hello, I ran into a scheduler issue when deploying with PD (prefill/decode) disaggregation:
Since the current scheduling strategy doesn't free blocks occupied by requests in the WAITING_FOR_REMOTE_KVS state, can the server get stuck in certain scenarios?
For example, in step 4, the scheduler will allocate blocks for request 1 first, since it was put back at the front of the waiting queue in step 3. Request 2 will then never get into the running queue, because it requires more blocks for its next token, and the scheduler will get stuck in a loop from step 2 to step 4.
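To make the concern concrete, here is a minimal toy model of the suspected loop. It is not vLLM's scheduler code: `ToyScheduler`, `Request`, the block counts, and the admission rules are all hypothetical, and the scenario is only one possible reading of the steps described above. It assumes that a request in WAITING_FOR_REMOTE_KVS keeps its blocks and needs one more to start running, that a preempted request frees its blocks and returns to the front of the waiting queue, and that admission stops at the first request that cannot be allocated.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum, auto


class Status(Enum):
    WAITING = auto()
    WAITING_FOR_REMOTE_KVS = auto()
    RUNNING = auto()


@dataclass
class Request:
    rid: int
    prompt_blocks: int               # blocks needed to (re)admit from scratch
    status: Status = Status.WAITING
    held: int = 0                    # blocks currently held
    tokens: int = 0                  # decode steps completed


class ToyScheduler:
    """Hypothetical, heavily simplified block scheduler (not vLLM's)."""

    def __init__(self, total_blocks):
        self.free = total_blocks
        self.waiting = deque()
        self.running = []

    def _allocate(self, req, n):
        if n > self.free:
            return False
        self.free -= n
        req.held += n
        return True

    def _preempt(self, req):
        # Preemption frees the blocks and puts the request at the FRONT.
        self.free += req.held
        req.held = 0
        req.status = Status.WAITING
        self.running.remove(req)
        self.waiting.appendleft(req)

    def step(self):
        # Decode phase: each running request needs one more block.
        for req in list(self.running):
            if self._allocate(req, 1):
                req.tokens += 1
            else:
                self._preempt(req)
        # Admission phase: front of the queue first, stop at first failure.
        while self.waiting:
            req = self.waiting[0]
            # Assumption: a WAITING_FOR_REMOTE_KVS request already holds its
            # KV blocks (never freed) and needs one extra block to run.
            if req.status is Status.WAITING_FOR_REMOTE_KVS:
                need = 1
            else:
                need = req.prompt_blocks
            if not self._allocate(req, need):
                break
            self.waiting.popleft()
            req.status = Status.RUNNING
            self.running.append(req)

    def snapshot(self):
        def view(reqs):
            return tuple((r.rid, r.status.name, r.held, r.tokens) for r in reqs)
        return (self.free, view(self.running), view(self.waiting))


def is_stuck(sched, max_steps=10):
    """Return True if the scheduler revisits an identical state (no progress)."""
    seen = set()
    for _ in range(max_steps):
        sched.step()
        snap = sched.snapshot()
        if snap in seen:
            return True
        seen.add(snap)
    return False


# Scenario: 4 blocks total, all in use. Request 1 is running with 2 blocks;
# request 2 holds 2 blocks while in WAITING_FOR_REMOTE_KVS.
sched = ToyScheduler(total_blocks=4)
req1 = Request(rid=1, prompt_blocks=2, status=Status.RUNNING, held=2)
req2 = Request(rid=2, prompt_blocks=2,
               status=Status.WAITING_FOR_REMOTE_KVS, held=2)
sched.free = 0
sched.running.append(req1)
sched.waiting.append(req2)

stuck = is_stuck(sched)
print("stuck:", stuck, "| tokens:", req1.tokens, req2.tokens)
```

In this toy setup, request 1 is preempted and immediately re-admitted on every step, request 2 never gets the extra block it needs, and neither request ever produces a token. Whether the real scheduler can reach an equivalent state depends on details this sketch does not model.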