VPA: Implement in-place updates support #8115
This adds the UpdateModeInPlaceOrRecreate mode to the types so we can use it. Signed-off-by: Max Cao <[email protected]>
Signed-off-by: Max Cao <[email protected]>
Allows you to specify an env var FEATURE_GATES which adds feature gates to all vpa components during vpa-up and e2e tests. Also allows local e2e tests to run kind with a new kind-config file which enables KEP-1287 InPlacePodVerticalScaling feature gate. Separates the admission-controller service into a separate deploy manifest. Signed-off-by: Max Cao <[email protected]>
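As a rough illustration of how a component might consume such a variable, here is a minimal sketch in Go. It assumes FEATURE_GATES holds comma-separated Name=bool pairs (the format accepted by the Kubernetes --feature-gates flag); the helper name parseFeatureGates is hypothetical and is not taken from this PR, which does not show how the scripts actually process the value.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseFeatureGates is a hypothetical helper: it turns a string such as
// "InPlaceOrRecreate=true,Other=false" into a map of gate name to state.
// Empty entries are skipped; malformed entries produce an error.
func parseFeatureGates(s string) (map[string]bool, error) {
	gates := map[string]bool{}
	for _, pair := range strings.Split(s, ",") {
		pair = strings.TrimSpace(pair)
		if pair == "" {
			continue
		}
		name, val, ok := strings.Cut(pair, "=")
		if !ok {
			return nil, fmt.Errorf("missing '=' in %q", pair)
		}
		enabled, err := strconv.ParseBool(strings.TrimSpace(val))
		if err != nil {
			return nil, fmt.Errorf("invalid value for %q: %w", name, err)
		}
		gates[strings.TrimSpace(name)] = enabled
	}
	return gates, nil
}

func main() {
	// Assumption: the variable is read directly from the environment.
	gates, err := parseFeatureGates(os.Getenv("FEATURE_GATES"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(gates)
}
```

Running it with FEATURE_GATES="InPlaceOrRecreate=true" set would yield a map with that single gate enabled.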
Only allow VPA objects with InPlaceOrRecreate update mode to be created if the InPlaceOrRecreate feature gate is enabled. If a VPA object already exists with this mode and the feature gate is then disabled, this prevents further objects from being created with InPlaceOrRecreate, but it does not prevent the existing InPlaceOrRecreate VPA objects from being modified. Signed-off-by: Max Cao <[email protected]>
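The rule described in this commit message could be sketched roughly as follows. This is an assumption-laden illustration, not the admission-controller code from the PR: the function validateUpdateMode, its parameters, and the error text are hypothetical; only the mode name InPlaceOrRecreate comes from the source.

```go
package main

import (
	"errors"
	"fmt"
)

// UpdateMode mirrors the idea of a VPA update mode (simplified for this sketch).
type UpdateMode string

const (
	UpdateModeRecreate          UpdateMode = "Recreate"
	UpdateModeInPlaceOrRecreate UpdateMode = "InPlaceOrRecreate"
)

// validateUpdateMode is a hypothetical check: creating an object with
// InPlaceOrRecreate is rejected while the gate is off, but updates to
// objects that already exist are let through.
func validateUpdateMode(mode UpdateMode, isCreate, gateEnabled bool) error {
	if mode == UpdateModeInPlaceOrRecreate && isCreate && !gateEnabled {
		return errors.New("InPlaceOrRecreate update mode requires the InPlaceOrRecreate feature gate")
	}
	return nil
}

func main() {
	// Creation with the gate disabled is rejected.
	fmt.Println(validateUpdateMode(UpdateModeInPlaceOrRecreate, true, false))
	// Modifying an existing object with the gate disabled is still allowed.
	fmt.Println(validateUpdateMode(UpdateModeInPlaceOrRecreate, false, false))
}
```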
We might want to add a few more that are combined disruption counters, e.g. in-place + eviction totals, but for now just add some separate counters to keep track of what in-place updates are doing.
Introduces large changes in the updater component to allow InPlaceOrRecreate mode. If the feature gate is enabled and the VPA update mode is InPlaceOrRecreate, the updater will attempt an in-place update by first checking a number of preconditions before actuation (e.g., whether the pod's qosClass would be changed, whether we are already in-place resizing, whether an in-place update may potentially violate disruption (previously eviction) tolerance, etc.). After the preconditions are validated, we send an update signal to the InPlacePodVerticalScaling API with the recommendation, which may or may not fail. Failures are handled in subsequent updater loops. As for implementation details, patchCalculators have been re-used from the admission-controller's code for the updater in order to calculate recommendations for the updater to actuate. InPlace logic has been mostly stuffed in the eviction package for now because of similarities and ease (user-initiated API calls: eviction vs. in-place; both cause disruption). It may or may not be useful to refactor this later. Signed-off-by: Max Cao <[email protected]>
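The precondition checks listed above can be pictured with a small sketch. Everything here is hypothetical (the podState struct, its fields, and canAttemptInPlace are invented for illustration and do not mirror the updater's real types); it only encodes the checks the commit message names, in one plausible order.

```go
package main

import "fmt"

// podState is a hypothetical summary of what the updater knows about a pod.
type podState struct {
	CurrentQoS       string // QoS class the pod has now
	RecommendedQoS   string // QoS class after applying the recommendation
	ResizeInProgress bool   // an earlier in-place resize has not finished
	DisruptionsLeft  int    // remaining disruption tolerance
}

// canAttemptInPlace reports whether an in-place update may be attempted and,
// if not, returns a short reason string.
func canAttemptInPlace(gateEnabled bool, updateMode string, p podState) (bool, string) {
	switch {
	case !gateEnabled:
		return false, "InPlaceOrRecreate feature gate is disabled"
	case updateMode != "InPlaceOrRecreate":
		return false, "update mode is not InPlaceOrRecreate"
	case p.CurrentQoS != p.RecommendedQoS:
		return false, "recommendation would change the pod's QoS class"
	case p.ResizeInProgress:
		return false, "an in-place resize is already in progress"
	case p.DisruptionsLeft <= 0:
		return false, "disruption tolerance would be violated"
	}
	return true, ""
}

func main() {
	pod := podState{CurrentQoS: "Burstable", RecommendedQoS: "Burstable", DisruptionsLeft: 1}
	ok, reason := canAttemptInPlace(true, "InPlaceOrRecreate", pod)
	fmt.Println(ok, reason)
}
```

A failed check here only means the in-place attempt is skipped for this loop; per the commit message, failures are handled in subsequent updater loops.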
Signed-off-by: Max Cao <[email protected]>
The script needs to also check if the yaml input is a Deployment, and no longer needs to check for vpa-component names. Signed-off-by: Max Cao <[email protected]>
Signed-off-by: Max Cao <[email protected]>
This commit moves the in-place logic out of the pods eviction restriction and separates it into its own files. This commit also adds PatchResourceTarget to calculators to allow them to explicitly specify to the caller which resource/subresource they should be patched to. This commit also creates a utils subpackage in order to prevent dependency cycles in the unit tests, and adds various unit tests. Lastly, this commit adds a rateLimiter specifically for limiting inPlaceResize API calls. Signed-off-by: Max Cao <[email protected]>
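To make the PatchResourceTarget idea concrete, here is a minimal sketch under stated assumptions. Only the name PatchResourceTarget comes from the commit message; the Calculator interface shape, the target constants, the two example calculators, and groupByTarget are hypothetical and simplified, and the rate limiter is not covered.

```go
package main

import "fmt"

// PatchResourceTarget names the resource or subresource a patch applies to.
type PatchResourceTarget string

const (
	TargetPod    PatchResourceTarget = "Pod"    // hypothetical: the pod object itself
	TargetResize PatchResourceTarget = "Resize" // hypothetical: the pod's resize subresource
)

// Patch is a simplified JSON-patch-style record.
type Patch struct {
	Op, Path, Value string
}

// Calculator produces patches and states where they should be applied.
type Calculator interface {
	CalculatePatches() []Patch
	PatchResourceTarget() PatchResourceTarget
}

type resourceCalc struct{}

func (resourceCalc) CalculatePatches() []Patch {
	return []Patch{{Op: "add", Path: "/spec/containers/0/resources/requests/cpu", Value: "200m"}}
}
func (resourceCalc) PatchResourceTarget() PatchResourceTarget { return TargetResize }

type annotationCalc struct{}

func (annotationCalc) CalculatePatches() []Patch {
	return []Patch{{Op: "add", Path: "/metadata/annotations/example", Value: "true"}}
}
func (annotationCalc) PatchResourceTarget() PatchResourceTarget { return TargetPod }

// groupByTarget lets the caller send each batch of patches to the right place.
func groupByTarget(calcs []Calculator) map[PatchResourceTarget][]Patch {
	out := map[PatchResourceTarget][]Patch{}
	for _, c := range calcs {
		t := c.PatchResourceTarget()
		out[t] = append(out[t], c.CalculatePatches()...)
	}
	return out
}

func main() {
	grouped := groupByTarget([]Calculator{resourceCalc{}, annotationCalc{}})
	fmt.Println(len(grouped[TargetResize]), len(grouped[TargetPod]))
}
```

The point of the design is that the caller no longer has to guess: each calculator declares its own target, so resource changes and other pod patches can be sent through different API paths.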
Signed-off-by: Max Cao <[email protected]>
Signed-off-by: Omer Aplatony <[email protected]>
Co-authored-by: Adrian Moisey <[email protected]>
Signed-off-by: Omer Aplatony <[email protected]>
Co-authored-by: Max Cao <[email protected]>
Signed-off-by: Max Cao <[email protected]>
Signed-off-by: Max Cao <[email protected]>
This commit refactors the VPA e2e test ginkgo wrappers so that we can easily supply ginkgo decorators. This allows us to add ginkgo v2 labels to suites so that later we can run only the FG tests. For now, this would only be useful for FG:InPlaceOrRecreate. Signed-off-by: Max Cao <[email protected]>
Signed-off-by: Max Cao <[email protected]>
Signed-off-by: Max Cao <[email protected]>
Signed-off-by: Max Cao <[email protected]>
5d67d4d to 3039f3c
This PR may require API review. If so, when the changes are ready, complete the pre-review checklist and request an API review. Status of requested reviews is tracked in the API Review project.
So currently the e2e tests for the in-place feature are triggered every time, right?
They are not running because the feature gates are not automatically being applied through the CI.
I think they should run by default for now. We can figure out how to run them dynamically later. Since our main focus is the in-place feature, we need the e2e tests to run.
The Kubernetes feature gate is enabled by default, and we could enable the VPA feature gate in the CI tests.
raywainman
left a comment
/lgtm
I'm assuming we are okay with enabling it after merging this then. I can attest to the e2e tests passing, but there's a small chance that they won't pass on the kube-infra.
Fine with me.
I'm fine with that too since this functionality is all off by default. I reviewed this with a keen eye for the "default happy path" where in-place updates are disabled and it all LGTM. It would be great to have folks play with this now that we have this in the main branch to make sure we didn't miss anything. /approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: maxcao13, raywainman
Agreed! /approve Thanks Max for getting this done!
What type of PR is this?
/kind feature
/kind api-change
What this PR does / why we need it:
This PR is the initial alpha attempt to implement VPA in-place vertical scaling/in-place resize according to AEP-4016. It uses the VPA updater to actuate recommendations by sending resize patch requests to pods, which allows automatic in-place resize as enabled by the InPlacePodVerticalScaling feature flag in k8s 1.27.0 alpha/1.33 beta and above (and by eventual graduation).
Also introduces feature-gates to VPA, and includes the first feature-gate, InPlaceOrRecreate, which allows the use of the InPlaceOrRecreate update mode.
This PR is the amalgamation of the following PRs that were merged into the in-place-updates branch:
InPlaceOrRecreate feature gate #7932
and a few extra small commits.
This PR is based on this large PR that was broken up into the above pieces: #7673, which was the continuation of #6652 started by @jkyros.
Which issue(s) this PR fixes:
Fixes #4016
Special notes for your reviewer:
Most of the reviews happened in the separate PRs linked above. This PR should serve as the final sanity check to be able to merge to master, and to be able to resolve any merge conflicts that accrued from the time the feature branch was created, to now.
However, there are some new commits that were made in order to get this branch up to date with master. They are the ones including and after this commit: 0fa50f3
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: