
[Core][Feat] safely abort requests where FSM failed to advance #38663

Open
walterbm wants to merge 4 commits into vllm-project:main from walterbm:walter/safely-terminate-requests-where-fsm-failed

Contributor

@walterbm walterbm commented Mar 31, 2026

Purpose

Revive #18780, which attempted to fix situations where a request hangs forever if the FSM rejects the next generated token. The original PR had a few issues, which these changes address.

Borrowing from the original PR's problem description:

When using structured outputs with the xgrammar backend, streaming requests would hang indefinitely if the FSM (Finite State Machine) failed to advance. This occurred when accept_tokens() returned False in the xgrammar backend, logging an error but not properly terminating the request.

In this PR the solution is slightly different. When the FSM refuses to advance, the request is stopped and marked with a new attribute, fsm_failed_to_advance. When the scheduler cleans up stopped requests, it reads this attribute to make sure the stopped request is handled correctly. This ensures the end user's request is aborted (instead of hanging forever) and preserves the scheduler's logic for handling stopped requests.
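The flow described above can be sketched in a small, self-contained toy model. This is not vLLM's code: `ToyGrammar`, `ToyRequest`, `ToyScheduler`, and the status strings are hypothetical stand-ins chosen for illustration. Only the names `accept_tokens`, `fsm_failed_to_advance`, and `update_from_output` come from this PR's description.

```python
from dataclasses import dataclass, field


class ToyGrammar:
    """Hypothetical stand-in for a structured-output FSM."""

    def __init__(self, allowed):
        self.allowed = set(allowed)

    def accept_tokens(self, token_ids):
        # Return False when any token is rejected, as described for the
        # xgrammar backend in this PR.
        return all(t in self.allowed for t in token_ids)


@dataclass
class ToyRequest:
    request_id: str
    grammar: ToyGrammar
    output_token_ids: list = field(default_factory=list)
    status: str = "RUNNING"  # hypothetical status strings
    fsm_failed_to_advance: bool = False


class ToyScheduler:
    """Hypothetical scheduler holding running requests by id."""

    def __init__(self):
        self.running = {}

    def add_request(self, request):
        self.running[request.request_id] = request

    def update_from_output(self, sampled):
        """`sampled` maps request_id -> newly generated token ids."""
        for req_id, token_ids in sampled.items():
            req = self.running[req_id]
            if req.grammar.accept_tokens(token_ids):
                req.output_token_ids.extend(token_ids)
            else:
                # The FSM refused to advance: stop the request and flag it.
                req.fsm_failed_to_advance = True
                req.status = "STOPPED"

        # Cleanup pass: flagged requests are aborted rather than left
        # waiting for tokens that will never be accepted.
        finished = []
        for req_id in list(self.running):
            req = self.running[req_id]
            if req.status == "STOPPED":
                if req.fsm_failed_to_advance:
                    req.status = "ABORTED"
                finished.append(self.running.pop(req_id))
        return finished
```

In this sketch the flag is set where the rejection is detected and acted on only in the cleanup pass, so the existing path for stopped requests stays the single place where requests leave the running set.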

Test Plan

Added a few specialized unit tests to confirm that when the FSM fails to advance, the request is correctly aborted.
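As a rough illustration of the shape such a test might take (not the PR's actual test code), the sketch below forces a mocked grammar to reject a token and checks that the request is flagged and aborted. The helper `advance_or_stop`, the `SimpleNamespace` request, and the `"ABORTED"` status string are assumptions made for this example; only the test name, `accept_tokens`, and `fsm_failed_to_advance` appear in this PR.

```python
from types import SimpleNamespace
from unittest.mock import Mock


def advance_or_stop(request, token_ids):
    """Hypothetical helper: advance the FSM, or flag and abort the request."""
    if request.grammar.accept_tokens(token_ids):
        request.output_token_ids.extend(token_ids)
        return True
    request.fsm_failed_to_advance = True
    request.status = "ABORTED"  # hypothetical status string
    return False


def test_abort_request_when_structured_output_fsm_cannot_advance():
    grammar = Mock()
    grammar.accept_tokens.return_value = False  # force the FSM to reject

    request = SimpleNamespace(
        grammar=grammar,
        output_token_ids=[],
        status="RUNNING",
        fsm_failed_to_advance=False,
    )

    assert advance_or_stop(request, [42]) is False
    assert request.fsm_failed_to_advance is True
    assert request.status == "ABORTED"
    assert request.output_token_ids == []  # rejected token is not recorded
    grammar.accept_tokens.assert_called_once_with([42])
```

Mocking the grammar keeps the test independent of any real structured-output backend, so the rejection path can be exercised deterministically.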

Test Result

  • test_abort_request_when_structured_output_fsm_cannot_advance
  • test_abort_request_when_structured_output_fsm_cannot_advance

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: walterbm <walter.beller.morales@gmail.com>
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request implements logic to abort requests when the structured output Finite State Machine (FSM) fails to advance. It introduces a new fsm_failed_to_advance flag in the Request class and updates the scheduler's update_from_output method to detect FSM failures, mark requests as aborted, and ensure they are not resumed. Comprehensive unit tests for both the synchronous and asynchronous schedulers have been added. The review feedback points out a minor redundancy in variable access within the scheduler logic.

@walterbm walterbm changed the title [Feat][Core] safely abort requests where FSM failed to advance [Core][Feat][ safely abort requests where FSM failed to advance Apr 1, 2026