fix(graceful-eviction): retry on optimistic-lock conflicts when patching eviction tasks#7494
Conversation
…ing eviction tasks
The {RB,CRB}GracefulEvictionController periodically patches
Spec.GracefulEvictionTasks on bindings to remove completed or expired
tasks. The patch uses MergeFromWithOptimisticLock, which encodes the
cached metadata.resourceVersion and asks the apiserver to reject the
write if the object has changed. The taint manager's binding-eviction
worker writes the same field via Update with the same resourceVersion
guard. When the two collide between each other's read and write the
apiserver returns StatusReasonConflict, and there is no inline retry —
the only recovery is the controller's exponential workqueue backoff,
which starts at 5ms and grows to 1000s.
In a cluster-failover storm where the same binding cycles through both
controllers many times in close succession, the conflict tax dominates
wall-clock latency. A 5K-binding lab observed a 99.7% retry rate on the
resource-binding-graceful-eviction-controller workqueue (1072 retries
across 1075 adds), with workers spending most of their time asleep in
backoff rather than doing real work.
This change wraps the read-modify-patch in retry.RetryOnConflict for
both the namespaced and cluster-scoped variants. The signature of
syncBinding shifts from `(ctx, *Binding)` to
`(ctx, types.NamespacedName)` because the binding object now lives
inside the retry closure: each attempt does a fresh Get so the next
patch's MergeFromWithOptimisticLock base carries the latest
resourceVersion. Reconcile is simplified accordingly — the apierrors.IsNotFound
short-circuit moves up to wrap the syncBinding call.
The eviction event is emitted only on the successful attempt
(EmitClusterEvictionEventForResourceBinding / ...ForClusterResourceBinding
are now called with a nil error, since by the time we reach the loop
the patch has succeeded).
A dedicated test
(Test{RB,CRB}GracefulEvictionController_syncBinding_retryOnConflict)
uses an interceptor.Funcs to inject two consecutive Conflict responses
on Patch and asserts that the expired eviction task is still removed
once the wrapper recovers. A companion test
(Test{RB,CRB}GracefulEvictionController_syncBinding_notFoundSurfacedToReconcile)
verifies that NotFound is surfaced from the closure so that Reconcile
can short-circuit, since the closure no longer returns a no-op early.
The pre-existing TestCRBGracefulEvictionController_syncBinding tolerance
on the wall-clock-derived retryAfter value is loosened from 100ms to 1s.
The original tolerance was already brittle: it compared a `metav1.Now()`
captured at test setup against one captured inside the controller after
fake-client setup work. The new Get inside the retry closure pushed the
gap past 100ms on slower hosts. 1s is still well within the meaning of
the assertion, which is "the controller computed roughly 3 minutes."
Signed-off-by: Ryan Copley <ryan@ryancopley.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a performance bottleneck in the graceful-eviction controllers caused by optimistic-lock conflicts. Previously, collisions with the taint-manager resulted in frequent workqueue backoffs, significantly increasing latency during cluster-failover events. By implementing inline retries for patch operations, the controllers can now resolve these conflicts immediately, drastically improving throughput and stability. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Pull request overview
This PR addresses optimistic-lock conflict storms between the taint-manager eviction worker and the {RB,CRB}GracefulEvictionController when both write Spec.GracefulEvictionTasks on the same binding objects. It improves eviction-task cleanup throughput under heavy churn by retrying conflicts inline rather than relying on the controller-runtime workqueue’s exponential backoff.
Changes:
- Wrap the read/modify/patch flow in
retry.RetryOnConflictfor both ResourceBinding and ClusterResourceBinding graceful-eviction controllers. - Refactor
syncBindingto accept atypes.NamespacedNameso each retry attempt performs a freshGet(ensuring the optimistic-lock base has an up-to-dateresourceVersion). - Add unit tests that inject
Conflicterrors onPatchand assert successful eventual cleanup; adjust an existing CRB retry-after tolerance to reduce wall-clock brittleness.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| pkg/controllers/gracefuleviction/rb_graceful_eviction_controller.go | Refactors RB graceful-eviction sync to refetch+patch under RetryOnConflict and simplifies Reconcile accordingly. |
| pkg/controllers/gracefuleviction/crb_graceful_eviction_controller.go | Same conflict-retry refactor for CRB graceful-eviction sync path. |
| pkg/controllers/gracefuleviction/rb_graceful_eviction_controller_test.go | Updates existing tests for new syncBinding signature and adds conflict/NotFound-focused regression tests. |
| pkg/controllers/gracefuleviction/crb_graceful_eviction_controller_test.go | Updates existing tests for new signature, loosens retry-after tolerance, and adds conflict/NotFound regression tests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Code Review
This pull request refactors the graceful eviction controllers for ClusterResourceBinding and ResourceBinding to use retry.RetryOnConflict when patching eviction tasks. This ensures that optimistic lock conflicts result in an immediate refetch and retry, improving the efficiency of concurrent updates. Unit tests were expanded to cover conflict scenarios and timing-sensitive assertions. The reviewer suggested a defensive programming improvement to reset state variables at the beginning of the retry closure to ensure robustness against future code modifications.
| evictedClusters = evictedCluster | ||
| bindingForEvent = modifiedObj | ||
| retryAfter = nextRetry(keptTask, c.GracefulEvictionTimeout, metav1.Now().Time) |
There was a problem hiding this comment.
The variables evictedClusters, bindingForEvent, and retryAfter are updated inside the retry closure. While they are correctly reset in early return paths (lines 91-93 and 105-106), ensure that if a Patch call fails with a conflict and is retried, these variables are not left with stale data from the failed attempt. Currently, they are only assigned after a successful Patch (lines 116-118), which is correct. However, for better clarity and to avoid potential issues if the logic grows, consider resetting them at the very beginning of the closure.
| evictedClusters = evictedCluster | ||
| bindingForEvent = modifiedObj | ||
| retryAfter = nextRetry(keptTask, c.GracefulEvictionTimeout, metav1.Now().Time) |
There was a problem hiding this comment.
Similar to the cluster-scoped controller, the variables evictedClusters, bindingForEvent, and retryAfter are updated inside the retry closure. They are assigned after a successful Patch (lines 117-119), which ensures that only the results of the successful attempt are used for event emission. Consider resetting these variables at the start of the closure to prevent any risk of using stale data from a previous failed attempt in case the logic is modified in the future.
What type of PR is this?
/kind bug
What this PR does / why we need it:
The
{RB,CRB}GracefulEvictionControllerperiodically patchesSpec.GracefulEvictionTaskson bindings to remove completed or expired tasks. The patch usesMergeFromWithOptimisticLock, which encodes the cachedmetadata.resourceVersionand asks the apiserver to reject the write if the object has changed. The taint manager's binding-eviction worker writes the same field viaUpdatewith the sameresourceVersionguard. When the two collide between each other's read and write the apiserver returnsStatusReasonConflict, and there is no inline retry — the only recovery is the controller's exponential workqueue backoff, which starts at 5ms and grows to 1000s.In a cluster-failover storm where the same binding cycles through both controllers many times in close succession, the conflict tax dominates wall-clock latency. A 5K-binding lab observed a 99.7% retry rate on the
resource-binding-graceful-eviction-controllerworkqueue (1072 retries across 1075 adds), with workers spending most of their time asleep in backoff rather than doing real work.This change wraps the read-modify-patch in
retry.RetryOnConflictfor both the namespaced and cluster-scoped variants. The signature ofsyncBindingshifts from(ctx, *Binding)to(ctx, types.NamespacedName)because the binding object now lives inside the retry closure: each attempt does a freshGetso the next patch'sMergeFromWithOptimisticLockbase carries the latestresourceVersion.Reconcileis simplified accordingly — theapierrors.IsNotFoundshort-circuit moves up to wrap thesyncBindingcall.The eviction event is emitted only on the successful attempt (
EmitClusterEvictionEventForResourceBinding/…ForClusterResourceBindingare now called with a nil error, since by the time we reach the loop the patch has succeeded).Which issue(s) this PR fixes:
Fixes #7483
Special notes for your reviewer:
This is the graceful-eviction-controller half of a conflict pair; the taint-manager half is in a sibling PR (#TODO link to PR1 once filed). Either PR is independently correct, but the throughput improvement only materializes when both ship — they are the two writers that collide on
Spec.GracefulEvictionTasks. A third PR makes the eviction worker pool size configurable.Tests:
TestRBGracefulEvictionController_syncBinding_retryOnConflictand the…CRB…twin useinterceptor.Funcsto inject two consecutiveConflictresponses onPatchand assert the expired eviction task is still removed once the wrapper recovers.TestRBGracefulEvictionController_syncBinding_notFoundSurfacedToReconcileand the…CRB…twin verify thatNotFoundis surfaced from the closure so thatReconcilecan short-circuit, since the closure no longer returns a no-op early.go test -race ./pkg/controllers/gracefuleviction/...is clean.One heads-up: the pre-existing
TestCRBGracefulEvictionController_syncBindingtolerance on the wall-clock-derivedretryAftervalue is loosened from100msto1s. The original tolerance was already brittle — it compared ametav1.Now()captured at test setup against one captured inside the controller after fake-client setup work. The newGetinside the retry closure pushed the gap past100mson slower hosts.1sis still well within the meaning of the assertion, which is "the controller computed roughly 3 minutes." Happy to factor this differently (e.g. inject a clock) if reviewers prefer.Local benchmark on a 5K-binding kind lab failing one cluster, with both this PR and the taint-manager PR applied: sustained ~72 RB/s, peak 108 RB/s (vs. ~1.2 RB/s on stock controller-manager defaults). Methodology and raw CSV available on request.
Does this PR introduce a user-facing change?: