Preemption admission check controller. KEP update. #1178

Open
wants to merge 6 commits into
base: main
Changes from 3 commits
24 changes: 23 additions & 1 deletion keps/993-two-phase-admission/README.md
@@ -12,6 +12,7 @@
- [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional)
- [Risks and Mitigations](#risks-and-mitigations)
- [Design Details](#design-details)
- [Preemption Admission Check Controller](#preemption-admission-check-controller)
- [Test Plan](#test-plan)
- [Prerequisite testing updates](#prerequisite-testing-updates)
- [Unit Tests](#unit-tests)
@@ -206,7 +207,7 @@ pass AND there are some AdmissionChecks configured, AND the
workload is not in on-hold retry state from some check, it will:

1. Fill the Admission field in workload, with the desired flavor assignment.
2. Not do any preemptions yet (unless BookCapacity is set to true).
2. Not do any preemptions yet (unless the check uses the `Anytime` preemption policy).
3. Set "QuotaReserved" to true.

Kueue will only pass as many pods into "QuotaReserved" as there would
@@ -253,6 +254,27 @@ The controller implementing a particular check should:
* After approving the workload, keep monitoring the check; if it starts failing,
fail the check and cause the workload to move back to the suspended state.

### Preemption Admission Check Controller

In this proposal, the time at which the preemption candidates are evicted varies based on the preemptor's state.
The scheduler will not issue the evictions during its own processing; instead it will set a `Pending` admission check
that is managed by a new built-in admission check controller.

The **Preemption Admission Check Controller** will:

- Watch for a change in state of the workloads pending preemption.
- Watch for workloads that are finishing execution and therefore releasing quota.
- Watch for AdmissionCheck changes, since the preemption policy can change.

The preemption controller uses the Kueue cache, since it needs to check the state of the workloads admitted to the ClusterQueues.

At every run, the controller will get the list of workloads pending preemption.
1. For some of these workloads it is not necessary to issue evictions at that point (e.g. they have a pending check that uses the `AfterCheckPassedOrOnDemand` policy), so their quota reservation will be ignored.
2. For every other workload pending preemption, it will check whether the workload can fit without evicting other workloads, in which case the preemption admission check condition will be set to `Ready`.
3. If eviction of other workloads is still needed, an updated list of candidates is created and eviction is issued for all of them.
4. If the updated list of candidates is empty, meaning that the preemption can no longer succeed, the preemption admission check is set to `Retry`, and the workload will lose its quota reservation and be requeued.
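
The four steps above can be illustrated with a small, self-contained Python sketch. It is not the Kueue implementation: it assumes a single ClusterQueue with one integer resource, and every name in it (`Workload`, `needs_eviction_now`, `pick_candidates`, `run_once`) is hypothetical. The candidate selection (lowest-priority admitted workloads first) is also an assumption made only to keep the example short.

```python
from dataclasses import dataclass
from enum import Enum


class CheckState(Enum):
    PENDING = "Pending"
    READY = "Ready"
    RETRY = "Retry"


@dataclass
class Workload:
    name: str
    request: int          # quota units of a single, hypothetical resource
    priority: int = 0
    # False models a workload whose pending check uses a policy such as
    # AfterCheckPassedOrOnDemand, so no eviction is needed for it yet.
    needs_eviction_now: bool = True
    check: CheckState = CheckState.PENDING


def pick_candidates(preemptor, admitted, targeted, free):
    """Assumed heuristic: take lower-priority admitted workloads, lowest
    priority first, until the preemptor would fit. Returns [] if it cannot."""
    picked, freed = [], 0
    for victim in sorted(admitted, key=lambda w: w.priority):
        if victim.name in targeted or victim.priority >= preemptor.priority:
            continue
        picked.append(victim)
        freed += victim.request
        if free + freed >= preemptor.request:
            return picked
    return []


def run_once(pending, admitted, capacity):
    """One controller run; returns the names of the workloads to evict."""
    # Only admitted workloads count as usage here, so the reservations of
    # workloads skipped in step 1 are ignored.
    free = capacity - sum(w.request for w in admitted)
    targeted = []
    for w in pending:
        # Step 1: skip workloads that do not need eviction at this point.
        if not w.needs_eviction_now:
            continue
        # Step 2: the workload fits without evicting anything.
        if w.request <= free:
            w.check = CheckState.READY
            free -= w.request
            continue
        # Step 3: eviction is still needed; recompute the candidate list.
        candidates = pick_candidates(w, admitted, targeted, free)
        if not candidates:
            # Step 4: preemption can no longer succeed.
            w.check = CheckState.RETRY
            continue
        targeted.extend(c.name for c in candidates)
        freed = sum(c.request for c in candidates)
        # Claim the currently free quota this preemptor will also consume.
        free -= max(0, w.request - freed)
    return targeted
```

In this sketch a workload for which evictions were issued stays `Pending`; it would only become `Ready` in a later run, once the evicted workloads have released their quota and step 2 succeeds.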


### Test Plan

[ x ] I/we understand the owners of the involved components may require updates to