Skip to content

Fix callback cleanup on error in flow controllers#51

Merged
Randomizez merged 2 commits into
stepfun-ai:devfrom
yuruofeifei:fix/flow-callback-cleanup
Apr 28, 2026
Merged

Fix callback cleanup on error in flow controllers#51
Randomizez merged 2 commits into
stepfun-ai:devfrom
yuruofeifei:fix/flow-callback-cleanup

Conversation

@yuruofeifei

@yuruofeifei yuruofeifei commented Apr 26, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Fix flow_callback in both SimpleFlowController and FullyAsyncFlowController: when generation errors occur, cleanup (active_counter -= 1 / running_genables.pop()) now always runs, preventing silent deadlocks.
  • Add 3 tests with ErrorGenerationController to verify no hang on generation failure.

Test plan

  • Pre-commit checks pass
  • test_fully_async_error_with_allow_errors_does_not_hang
  • test_fully_async_error_cleans_up_running_genables
  • test_simple_controller_error_with_allow_errors_does_not_hang

🤖 Generated with Claude Code

In SimpleFlowController, if flow_callback raises (genable_allow_errors=False),
active_counter was never decremented, causing _generation_worker to spin
forever at `while active_counter: time.sleep(1)`.

In FullyAsyncFlowController, the same raise path left the genable in
running_genables permanently, blocking staleness checks and potentially
deadlocking _wait_for_next_weight.

Fix: use try/finally (Simple) and try/except+re-raise (FullyAsync) to
ensure cleanup always runs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@yuruofeifei yuruofeifei requested a review from a team April 26, 2026 02:23
Add ErrorGenerationController that simulates generation failures.
Three tests verify the pipeline doesn't hang when errors occur:
- FullyAsync: error with allow_errors continues without hanging
- FullyAsync: running_genables cleaned up after error
- Simple: active_counter decremented after error

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@Randomizez Randomizez left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks.

@Randomizez Randomizez merged commit 90612d0 into stepfun-ai:dev Apr 28, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants