@sohankunkerkar
Member

@sohankunkerkar sohankunkerkar commented Nov 12, 2025

Implements a delayed retry mechanism for admission checks to prevent overwhelming external controllers and reduce control plane churn.

Problem:
Previously, when admission checks transitioned to Retry state, Kueue would immediately evict workloads and requeue them, causing excessive load on admission check controllers and unnecessary API server churn, particularly when retry conditions persisted predictably.

Solution:
This implementation adds two new fields to AdmissionCheckState:

  • requeueAfterSeconds: Specifies minimum wait time before retry
  • retryCount: Tracks retry attempts per admission check

Key Changes:

API Changes

  • Added requeueAfterSeconds and retryCount fields to AdmissionCheckState
  • Added +kubebuilder:validation:Minimum=0 to both fields (v1beta1 and v1beta2)
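
As a rough illustration of the API change listed above, the two fields might look like this on a simplified stand-in type. The struct name `admissionCheckState`, the `*int32` type of `RetryCount`, and the JSON tags are assumptions made for this sketch; the real AdmissionCheckState has more fields and may differ in detail.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// admissionCheckState is a simplified, hypothetical stand-in for Kueue's
// AdmissionCheckState; only the two fields discussed in this PR are shown.
type admissionCheckState struct {
	Name string `json:"name"`

	// RequeueAfterSeconds is the minimum wait time before the check is retried.
	// +optional
	// +kubebuilder:validation:Minimum=0
	RequeueAfterSeconds *int32 `json:"requeueAfterSeconds,omitempty"`

	// RetryCount tracks how many times this admission check has been retried.
	// The pointer type is an assumption of this sketch.
	// +optional
	// +kubebuilder:validation:Minimum=0
	RetryCount *int32 `json:"retryCount,omitempty"`
}

// marshalState renders the state as JSON; unset optional fields are omitted.
func marshalState(s admissionCheckState) string {
	b, err := json.Marshal(s)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	delay, retries := int32(60), int32(2)
	fmt.Println(marshalState(admissionCheckState{Name: "check-a"}))
	fmt.Println(marshalState(admissionCheckState{
		Name:                "check-a",
		RequeueAfterSeconds: &delay,
		RetryCount:          &retries,
	}))
}
```

Because both fields are optional pointers, a controller that never sets them produces the same serialized status as before the change.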

Controller Changes

  • Auto-increments retryCount on transition to Retry state
  • Calculates maximum retry time across all admission checks
  • Updates workload.status.requeueState.requeueAt with the maximum delay
  • Workload controller respects delayed retry times
  • RequeueAfterSeconds values persist across evictions
  • Refactored backoff calculation to use wait.NewBackoff() pattern
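
The auto-increment of retryCount listed above could be sketched roughly as follows. The types and the helper `applyState` are hypothetical, and the rule that the counter only changes on a transition into Retry (not while the check stays in Retry) is an assumption of this sketch rather than a statement about the actual controller code.

```go
package main

import "fmt"

// checkState is a hypothetical stand-in for the admission check state enum.
type checkState string

const (
	statePending checkState = "Pending"
	stateRetry   checkState = "Retry"
)

// admissionCheck is a simplified stand-in for an AdmissionCheckState entry.
type admissionCheck struct {
	Name       string
	State      checkState
	RetryCount *int32
}

// applyState records the new state and bumps RetryCount only when the
// check moves into Retry from a different state.
func applyState(ac *admissionCheck, newState checkState) {
	if newState == stateRetry && ac.State != stateRetry {
		next := int32(1)
		if ac.RetryCount != nil {
			next = *ac.RetryCount + 1
		}
		ac.RetryCount = &next
	}
	ac.State = newState
}

func main() {
	ac := admissionCheck{Name: "check-a", State: statePending}
	applyState(&ac, stateRetry)   // first retry
	applyState(&ac, statePending) // re-evaluated
	applyState(&ac, stateRetry)   // second retry
	fmt.Println("retries:", *ac.RetryCount)
}
```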

Behavior:
When multiple admission checks specify different delays, Kueue uses the maximum delay across all checks. Workloads are evicted immediately to release quota, but admission checks maintain their requeueAfterSeconds values, preventing race conditions where fast-responding checks could block slower ones from registering their delays.

Refs: KEP-3258
https://github.com/kubernetes-sigs/kueue/blob/main/keps/3258-delayed-admission-check-retries/README.md

What type of PR is this?

/kind feature
/kind api-change

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #3258

Special notes for your reviewer:

After discussing with @mimowo, taking over #7370 as the original author is on PTO.

Does this PR introduce a user-facing change?

AdmissionChecks: introduce new optional fields in the workload status for admission checks, so that external and internal admission check controllers can control the retry delay:
* requeueAfterSeconds: specifies the minimum wait time before a retry
* retryCount: tracks retry attempts per admission check

@k8s-ci-robot k8s-ci-robot added the release-note, kind/feature, kind/api-change, and cncf-cla: yes labels Nov 12, 2025
@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Nov 12, 2025
@netlify

netlify bot commented Nov 12, 2025

Deploy Preview for kubernetes-sigs-kueue canceled.

Name Link
🔨 Latest commit 5a7d9a4
🔍 Latest deploy log https://app.netlify.com/projects/kubernetes-sigs-kueue/deploys/6918c5c31e1de600082b5d4d

@sohankunkerkar sohankunkerkar changed the title from "feat(KEP-3258): implement delayed admission check retries" to "[WIP] feat(KEP-3258): implement delayed admission check retries" Nov 12, 2025
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 12, 2025
@sohankunkerkar sohankunkerkar force-pushed the pr-7370 branch 10 times, most recently from 5cd940c to 5b6df07 Compare November 13, 2025 15:45
@sohankunkerkar sohankunkerkar changed the title from "[WIP] feat(KEP-3258): implement delayed admission check retries" to "feat(KEP-3258): implement delayed admission check retries" Nov 13, 2025
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 13, 2025
Contributor

@mimowo mimowo left a comment

LGTM overall, thank you 👍

@sohankunkerkar sohankunkerkar force-pushed the pr-7370 branch 3 times, most recently from eb9934d to 1cd7565 Compare November 14, 2025 00:09
Contributor

@mimowo mimowo left a comment

LGTM overall; I left a couple of comments. Once this is done, please follow up with a refactoring of ProvisioningRequest to use the new API. This was already started by @dhenkel92 here: https://github.com/kubernetes-sigs/kueue/pull/7464/files#diff-d5b88e9a8af6b97ce61788f1307ec9ba1f4e3581a9c0634a151ce310c9ca3d91

if wl.Status.RequeueState != nil {
	wl.Status.RequeueState.RequeueAt = nil
}
return &wl, true, nil
Contributor

Now we can also add workload.SetRequeuedCondition(&wl, kueue.WorkloadBackoffFinished, "The workload backoff was finished", true)

It is probably a follow up for later, but I would like to consolidate the two requests (here & pre-existing below) into one. They are really the same thing conceptually.

Still, this would mean a bigger diff so I'm ok as is for now.

Contributor

Please also extend one of the tests to assert that the Requeued=true condition is set.

@sohankunkerkar sohankunkerkar force-pushed the pr-7370 branch 2 times, most recently from 540f431 to 92e8f4b Compare November 14, 2025 15:10
- if wl.Status.RequeueState != nil {
- 	wl.Status.RequeueState = nil
+ if wl.Status.RequeueState != nil && wl.Status.RequeueState.RequeueAt != nil {
+ 	// Clear the requeue schedule while preserving the Count for historical tracking
Contributor

I think we also want to clear the Count as before. Re-activation is a manual process done in the spec, and if we don't clear the Count, such a workload will automatically be put back as Deactivated, which is not a good experience: the user would need to clear the Count on their own.

This makes me think we should also clear the admissionChecks.RetryCount fields on deactivation.

Member Author

Yeah, good catch! To sum up, we need to clear RequeueState.Count, RequeueState.RequeueAt, all AdmissionChecks[].RetryCount fields, and all AdmissionChecks[].RequeueAfterSeconds fields.
Having said that, we can't set RequeueState = nil because that causes a validation error with SSA. We need to clear the fields individually.

Contributor

> Having said that, we can't set RequeueState = nil because that causes a validation error with SSA. We need to clear the fields individually.

Hm, ok, but this is surprising, because the pre-existing code did:

if wl.Status.RequeueState != nil {
   wl.Status.RequeueState = nil
}

Did that not work on the current releases?

Member Author

@mimowo Yeah, I made minimal changes to make this work. Let me know how you feel about this one.

Contributor

This looks ok. Could you please also prepare a small PR with just this fix and a dedicated test?
If this didn't work in the past, I think we should cherry-pick the fix.

Contributor

Ah, actually after you modified the ProvisioningRequest controller the fix is no longer needed. I think the fix was needed because the ProvRequest controller was changing the RequeueState using another admission manager.

@sohankunkerkar sohankunkerkar force-pushed the pr-7370 branch 8 times, most recently from 51b4465 to 0f11ac7 Compare November 15, 2025 18:24
sohankunkerkar and others added 2 commits November 15, 2025 13:25
Implements delayed retry mechanism for admission checks to prevent overwhelming
external controllers and reduce control plane churn.

Problem:
Previously, when admission checks transitioned to Retry state, Kueue would
immediately evict workloads and requeue them, causing excessive load on
admission check controllers and unnecessary API server churn, particularly
when retry conditions persisted predictably.

Solution:
This implementation adds two new fields to AdmissionCheckState:
- requeueAfterSeconds: Specifies minimum wait time before retry
- retryCount: Tracks retry attempts per admission check

Key Changes:

API Changes
- Added requeueAfterSeconds and retryCount fields to AdmissionCheckState
- Added +kubebuilder:validation:Minimum=0 to both fields (v1beta1 and v1beta2)

Controller Changes
- Auto-increments retryCount on transition to Retry state
- Calculates maximum retry time across all admission checks
- Updates workload.status.requeueState.requeueAt with the maximum delay
- Workload controller respects delayed retry times
- RequeueAfterSeconds values persist across evictions
- Refactored backoff calculation to use wait.NewBackoff() pattern

Behavior:
When multiple admission checks specify different delays, Kueue uses the maximum
delay across all checks. Workloads are evicted immediately to release quota, but
admission checks maintain their requeueAfterSeconds values, preventing race
conditions where fast-responding checks could block slower ones from registering
their delays.

Refs: KEP-3258
https://github.com/DataDog/kueue/blob/main/keps/3258-delayed-admission-check-retries/README.md

Signed-off-by: Sohan Kunkerkar <[email protected]>
Co-authored-by: Daniel Henkel <[email protected]>
This change replaces the use of the shared RequeueState field
with the new mechanism in the preprovision request admission check.

Signed-off-by: Sohan Kunkerkar <[email protected]>
Co-authored-by: Daniel Henkel <[email protected]>
backoff := wait.NewBackoff(time.Duration(backoffBaseSeconds)*time.Second, time.Duration(backoffMaxSeconds)*time.Second, 2, 0.0001)
waitDuration := backoff.WaitTime(int(requeuingCount))

acState.RequeueAfterSeconds = ptr.To(int32(waitDuration.Truncate(time.Second).Seconds()))
Contributor

@mimowo mimowo Nov 17, 2025

Contributor

@mimowo mimowo left a comment

/lgtm
/approve
Please also address the small follow ups:

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 17, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 8081cc7c81df3b68b2123ffc41c4a3c9b508d39d

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mimowo, sohankunkerkar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 17, 2025
@k8s-ci-robot k8s-ci-robot merged commit 57f960e into kubernetes-sigs:main Nov 17, 2025
22 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v0.15 milestone Nov 17, 2025
Singularity23x0 pushed a commit to Singularity23x0/kueue that referenced this pull request Nov 17, 2025
…-sigs#7620)

* feat(KEP-3258): implement delayed admission check retries

Signed-off-by: Sohan Kunkerkar <[email protected]>
Co-authored-by: Daniel Henkel <[email protected]>

* pkg: use delayed ACs in preprovisioning check

This change replaces the use of the shared RequeueState field
with the new mechanism in the preprovision request admission check.

Signed-off-by: Sohan Kunkerkar <[email protected]>
Co-authored-by: Daniel Henkel <[email protected]>

---------

Signed-off-by: Sohan Kunkerkar <[email protected]>
Co-authored-by: Daniel Henkel <[email protected]>
@mimowo
Contributor

mimowo commented Nov 28, 2025

/remove-kind api-change
/kind feature

/release-note-edit

AdmissionChecks: introduce new optional fields in the workload status for admission checks, so that external and internal admission check controllers can control the retry delay:
* requeueAfterSeconds: specifies the minimum wait time before a retry
* retryCount: tracks retry attempts per admission check

@k8s-ci-robot k8s-ci-robot removed the kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API label Nov 28, 2025
Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files.
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
kind/feature Categorizes issue or PR as related to a new feature.
lgtm "Looks good to me", indicates that a PR is ready to be merged.
release-note Denotes a PR that will be considered when it comes time to generate release notes.
size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

Add retry mechanism for AdmissionChecks in Kueue

3 participants