[Core][Feat][ safely abort requests where FSM failed to advance#38663
Open
walterbm wants to merge 4 commits intovllm-project:mainfrom
Open
[Core][Feat][ safely abort requests where FSM failed to advance#38663walterbm wants to merge 4 commits intovllm-project:mainfrom
walterbm wants to merge 4 commits intovllm-project:mainfrom
Conversation
Signed-off-by: walterbm <walter.beller.morales@gmail.com>
Contributor
There was a problem hiding this comment.
Code Review
This pull request implements logic to abort requests when the structured output Finite State Machine (FSM) fails to advance. It introduces a new fsm_failed_to_advance flag in the Request class and updates the scheduler's update_from_output method to detect FSM failures, mark requests as aborted, and ensure they are not resumed. Comprehensive unit tests for both the synchronous and asynchronous schedulers have been added. The review feedback points out a minor redundancy in variable access within the scheduler logic.
Signed-off-by: walterbm <walter.beller.morales@gmail.com>
Signed-off-by: walterbm <walter.beller.morales@gmail.com>
Signed-off-by: walterbm <walter.beller.morales@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Purpose
Revive #18780 which attempted to fix situations where a request will hang forever if the FSM rejects the next generated token. The original PR had a few issues which are addressed with these changes.
Borrowing from the original PR's problem description:
In this PR the solution is slightly different. When the FSM refuses to advance the request stopped and marked with new attribute
fsm_failed_to_advance. When the scheduler is cleaning up stopped requests is reads this attribute to make sure the stopped request is correctly handled. This helps ensure the end user's request is correctly aborted (instead of hanging forever) and preserve the scheduler's logic for handling stopped requestsTest Plan
Added a few specialized unit tests to confirm that when the FSM fails to advanced the request is correctly aborted.
Test Result
test_abort_request_when_structured_output_fsm_cannot_advance✅test_abort_request_when_structured_output_fsm_cannot_advance✅Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.