[iris] Controller-side grace window for StartTasks/PollTasks race#5054
Closed
Contributor
Claude finished @rjpower's task in 4m 44s.
Code review: No issues found. Checked for bugs and CLAUDE.md compliance. Notes from the analysis (informational, not blocking):
Contributor
Code review: No issues found. Checked for bugs and CLAUDE.md compliance.
ravwojdyla
approved these changes
Apr 22, 2026
# PollTasks race: a poll's DB snapshot taken before the assignment commit
# would otherwise omit the task, and the worker would kill it as
# unexpected. 30s comfortably exceeds any normal StartTasks RPC latency.
_RECENT_DISPATCH_GRACE_SECONDS = 30.0
Contributor
I'm scared of these magic numbers - but it makes sense!
Collaborator
Author
yeah I want to get rid of this soon, we'll switch to a more sensible model without the race
Collaborator
Author
🤖 Superseded by #5090, which fixes the same race at the source by moving PollTasks inline in the scheduling loop (single thread now owns both the assignment commit and the running-tasks snapshot).
When a PollTasks RPC's DB snapshot is taken before a concurrent StartTasks assignment commit, the poll omits the new task from expected_tasks, and the worker kills it as unexpected (or reports WORKER_FAILED for a task it has not yet received). Track recent StartTasks dispatches in memory for 30s, merge them into the per-worker expected set, and drop any updates for tasks still inside the grace window so the next poll settles them cleanly. Complements #5043.
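The grace-window idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: only `_RECENT_DISPATCH_GRACE_SECONDS` and its value come from the diff, while the `RecentDispatches` class, its method names, and the shape of the `updates` dict are assumptions made for the example.

```python
import time
from collections import defaultdict

# Constant name and value are from the diff; everything below is hypothetical.
_RECENT_DISPATCH_GRACE_SECONDS = 30.0


class RecentDispatches:
    """In-memory record of tasks recently dispatched via StartTasks (hypothetical helper)."""

    def __init__(self, grace_seconds=_RECENT_DISPATCH_GRACE_SECONDS, clock=time.monotonic):
        self._grace = grace_seconds
        self._clock = clock  # injectable so the window can be tested without sleeping
        self._by_worker = defaultdict(dict)  # worker_id -> {task_id: dispatch time}

    def record(self, worker_id, task_id):
        """Call when a StartTasks dispatch is issued for this worker."""
        self._by_worker[worker_id][task_id] = self._clock()

    def in_grace(self, worker_id, task_id):
        sent = self._by_worker.get(worker_id, {}).get(task_id)
        return sent is not None and self._clock() - sent < self._grace

    def _prune(self, worker_id):
        """Forget dispatches whose grace window has elapsed."""
        tasks = self._by_worker.get(worker_id)
        if tasks is None:
            return
        now = self._clock()
        for task_id in [t for t, sent in tasks.items() if now - sent >= self._grace]:
            del tasks[task_id]
        if not tasks:
            del self._by_worker[worker_id]

    def expected_tasks(self, worker_id, db_snapshot):
        """Merge recent dispatches into the (possibly stale) DB snapshot."""
        self._prune(worker_id)
        return set(db_snapshot) | set(self._by_worker.get(worker_id, {}))

    def filter_updates(self, worker_id, updates):
        """Drop worker-reported updates for tasks still inside the grace window."""
        return {t: s for t, s in updates.items() if not self.in_grace(worker_id, t)}
```

In this sketch a poll handler would call `expected_tasks` with the task IDs read from its snapshot, then pass the worker's reported states through `filter_updates` before applying them, so a task dispatched after the snapshot is neither killed nor marked failed until a later poll sees it in the DB.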