Query Scheduler: Graceful Shutdown with Inflight and Pending requests #13603
base: main
… conditions, tests do not pass yet
…does what it says
…uest metric ticker
This reverts commit cb3a483.
💻 Deploy preview available (WIP Clean Scheduler Shutdown with Inflight Requests)
…sync.OnceFunc call
…r by just storing it directly with the len instead of trying to be smart
pkg/scheduler/queue/queue.go
Outdated
level.Warn(q.log).Log(
	"msg", "queue stop requested but query queue is not empty, waiting for query workers to complete remaining requests",
	"queueBroker_count", q.queueBroker.itemCount(),
Nitpick: log lines should be easy and obvious to read even if you're not looking at the code, so queueBroker_count -> queued_requests.
Also, isn't schedulerInflightRequests.Load() the same as queueBroker.itemCount()?
Scheduler inflight requests are the ones that have been dequeued and are currently being handled by queriers but have not completed yet. We have to track them separately to ensure we don't cancel their contexts by closing the connections to the queriers before they're done.
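To illustrate the distinction, here is a minimal sketch of keeping the two counts apart. The `tracker` type and its methods are hypothetical names for illustration, not the PR's actual code; only the idea (queued items versus dequeued-but-unfinished items) comes from the comment above.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// tracker is a hypothetical stand-in for the scheduler's bookkeeping.
// queued counts requests still waiting in the queue; inflight counts
// requests a querier has dequeued but not yet completed.
type tracker struct {
	queued   atomic.Int64
	inflight atomic.Int64
}

func (t *tracker) enqueue() { t.queued.Add(1) }

// dequeue moves one request from "queued" to "in flight".
// inflight is incremented first so that a concurrent drained() call
// never observes both counters at zero while the request is moving.
func (t *tracker) dequeue() {
	t.inflight.Add(1)
	t.queued.Add(-1)
}

func (t *tracker) complete() { t.inflight.Add(-1) }

// drained reports whether nothing is queued and nothing is in flight,
// i.e. whether closing querier connections would cancel no work.
func (t *tracker) drained() bool {
	return t.queued.Load() == 0 && t.inflight.Load() == 0
}

func main() {
	var tr tracker
	tr.enqueue()
	tr.dequeue()
	// The queue is empty here, but one request is still being processed.
	fmt.Println(tr.queued.Load(), tr.inflight.Load(), tr.drained())
	tr.complete()
	fmt.Println(tr.drained())
}
```

In this sketch an empty queue alone does not imply it is safe to stop, which is why a queue item count and an in-flight count are not interchangeable.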
// This test ensures that the queue will wait for any pending requests to be dequeued and processed before exiting.
// This should be completed before the timeout.
func TestRequestQueue_ShutdownWithInflightRequests_ShouldDrainRequests(t *testing.T) {
Isn't this test the same as TestRequestQueue_ShutdownWithInflightSchedulerRequests_ShouldDrainRequests?
No, this one doesn't involve scheduler inflight requests; the other one does.
tacole02
left a comment
Docs look good! I left a few minor suggestions. Thank you!
What this PR does
This PR implements graceful shutdown of the Query Scheduler: on shutdown, it waits until all pending requests have been picked up by queriers and their responses returned to the frontend. The wait is bounded by a timeout set through a new config option, with a default of 30 seconds.
Which issue(s) this PR fixes or relates to
Fixes #12605
Checklist
- CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]. If changelog entry is not needed, please add the changelog-not-needed label to the PR.
- about-versioning.md updated with experimental features.

Note
Gracefully drain pending and in-flight requests on scheduler shutdown, controlled by new -query-scheduler.graceful-shutdown-timeout, with queue/scheduler logic, metrics, tests, and docs updated.
- … queue.RequestQueue (stopRequested/stopCompleted, timeout, item counting via itemCount()) and handle stop in AwaitRequestForQuerier.
- … SHUTTING_DOWN responses; refine loop exit/logging.
- … graceful_shutdown_timeout / -query-scheduler.graceful-shutdown-timeout (default 2m15s) in cmd/mimir/config-descriptor.json, help templates, docs, and defaults JSON.

Written by Cursor Bugbot for commit 02617b9.