This repository was archived by the owner on Jan 13, 2025. It is now read-only.
☂️ Gardener Horizontal & Vertical Pod Autoscaler, a.k.a. HVPA (v2) #30
Labels
kind/enhancement (Enhancement, improvement, extension), kind/epic (Large multi-story topic), kind/roadmap (Roadmap BLI), lifecycle/rotten (Nobody worked on this for 12 months, the final aging stage)
Description
Feature (What you would like to be added):
Summarise the roadmap for HVPA with links to the corresponding issues.
Motivation (Why is this needed?):
A central place to collect the roadmap as well as the progress.
Approach/Hint to implement the solution (optional):
General Principles
- The goal of HVPA is to re-use the upstream components HPA and VPA as much as possible for scaling components horizontally and vertically, respectively.
  - HPA for recommendation and scaling horizontally.
  - VPA for recommendation for scaling vertically.
- Where there are gaps in using HPA and VPA simultaneously to scale a given component, introduce functionality to fill those gaps.
  - HPA and VPA are recommended to be mixed only if HPA is used for custom/external metrics, as mentioned here. But in some scenarios it might make sense to use both even for CPU and memory (e.g. `kube-apiserver` or ingress).
  - VPA updates the pods directly (via webhooks), whereas HPA scales the upstream `targetRefs`. The VPA approach duplicates/overrides the update mechanism (such as rolling updates) that the upstream `targetRefs` might have implemented.
- Where functionality is missing in either HPA or VPA, introduce it to provide more flexibility during horizontal and vertical scaling, especially for components that experience disruption during scaling.
  - Weight-based scaling horizontally and vertically simultaneously.
  - Support for configurable (at the HVPA resource level) threshold levels to trigger VPA (and possibly HPA) updates, to minimise unnecessary scaling of components, especially if scaling is disruptive.
  - Support for a configurable (at the HVPA resource level) stabilisation window in all four directions (up/down/out/in) to stabilise the scaling of components, especially if scaling is disruptive.
  - Support for a configurable maintenance window (at the HVPA resource level) for scaling (especially scaling in/down) for components that do not scale smoothly (mainly `etcd`, but to a lesser extent `kube-apiserver` as well, for `WATCH` requests). This could be an alternative or a complement to the stabilisation window mentioned above.
  - Support for a flexible update policy for all four scaling directions (`Off`/`Auto`/`ScaleUp`). `ScaleUp` would only apply scale-up and not scale-down (vertically or horizontally). This is again from the perspective of components which experience disruption while scaling (mainly `etcd`, but to a lesser extent `kube-apiserver` as well, for `WATCH` requests). For such components, a `ScaleUp` update policy will ensure that the component can scale up (with some disruption) automatically to meet the workload requirement, but not scale down, to avoid unnecessary disruption. This would mean over-provisioning for workloads that experience a short upsurge.
  - Alerts when some percentage threshold of the `maxAllowed` is reached for requests of any container in the `targetRef`.
  - Support for custom resources as `targetRefs`.
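The weight-based scaling mentioned above can be sketched as a simple blend between the current resource request and the VPA recommendation. This is an illustrative sketch only, not the actual HVPA controller code; the function name and the use of CPU millicores are assumptions for the example:

```python
def weighted_vpa_recommendation(current_m: int, vpa_recommended_m: int,
                                vpa_weight: int) -> int:
    """Blend the current CPU request (in millicores) with the VPA recommendation.

    vpa_weight=0   -> keep the current request (VPA effectively off)
    vpa_weight=100 -> apply the full VPA recommendation
    """
    if not 0 <= vpa_weight <= 100:
        raise ValueError("vpa_weight must be between 0 and 100")
    # Move the current request toward the recommendation by weight percent.
    return current_m + (vpa_recommended_m - current_m) * vpa_weight // 100

# Example: current request 500m, VPA recommends 900m, weight 50 -> 700m
print(weighted_vpa_recommendation(500, 900, 50))
```

An HPA weight could be applied analogously to the replica-count delta, so that the horizontal and vertical weights together decide how much of the total scaling is absorbed in each direction.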
Tasks
- HVPA custom resource to include templates for HPA and VPA.
- Controller logic to deploy and reconcile HPA and VPA resources based on the templates in the HVPA spec.
- Controller logic to adopt pre-existing HPA and VPA resources if they match the selectors in the HVPA spec.
- `Auto` update policy for HPA updates. HPA takes care of both recommendation and updates for horizontal scaling. This implementation of the `Auto` update policy is temporary pending Evaluate options for controlling HPA-based scaling #7.
- `Off`, `Auto` and `ScaleUp` update policies for VPA updates. VPA is used only for recommendation and not for updates. Fixed with HVPA now supports UpdateMode "off" for HPA and VPA #19.
- `Off` update policy for HPA updates. Implemented by not deploying/deleting the HPA resource. This implementation of the `Off` update policy is temporary pending Evaluate options for controlling HPA-based scaling #7.
- Weight-based scaling for VPA updates with any value between `0` and `100` for the VPA weight. Fixed with HVPA now supports UpdateMode "off" for HPA and VPA #19.
- Weight-based scaling for HPA updates with values `0` or `100` for the HPA weight.
- Update the proposal/documentation to stay in sync with feature and behaviour changes. This is an ongoing task.
- Consolidate and keep up to date the FAQs/Recommended Actions documents as a first point of reference for operators/admins. This is an ongoing task.
- Release HVPA as implemented so far in different landscapes to gain experience. Prio 1.
- Enable the `Auto` update policy (i.e. enable scale-down) for `kube-apiserver` to reduce cost. The `ScaleUp` update policy would continue for `etcd` for the time being because scale-down could be disruptive. Prio 1.
- If an `OOMKill` or CPU overload happens, override the stabilisation window as well as the HPA weight to apply the weighted VPA recommendation. Prio 1.
- Auto-scale limits in sync with the scaling of requests. Prio 1.
- Unit tests and integration tests (using Test Machinery). Prio 2.
- Alerts when some percentage threshold of the `maxAllowed` is reached for requests of any container in the `targetRef`. Prio 2.
- Scale down during a maintenance window. This would be used for components that experience disruption during scaling. Prio 2.
- Implement and use the `Scale` subresource in the HVPA CRD to control HPA updates fully and use HPA only for recommendation. Pending Evaluate options for controlling HPA-based scaling #7. Prio 3.
- Implement the `ScaleUp` update policy for HPA updates. Prio 3.
- Change the `Off` update policy implementation for HPA to deploy/reconcile the HPA resource even in `Off` mode. Retain the recommendations but block the updates. Prio 3.
- Weight-based scaling for HPA updates with any value between `0` and `100` as the HPA weight. Prio 3.
- Submit and drive adoption of a KEP for a `Resources` subresource (per container) along the lines of the `Scale` subresource. This can then be used to implement support for custom resources as `targetRef`. Prio 4.
- Recovery/ramp-up of an overloaded/crashing `targetRef`. Prio 5.
- Pro-actively throttle/ramp down a soon-to-be overloaded `targetRef` to avoid a crash. Prio 5.
- Support for custom resources as `targetRef`. If the KEP for the `Resources` subresource is not yet accepted, this could be implemented using annotations to supply the desired metadata. Prio 6.
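The first task above (an HVPA custom resource embedding templates for HPA and VPA) might look roughly like the following manifest. This is a hedged sketch based only on this roadmap; the API group/version and field names are assumptions, not the final API:

```yaml
# Hypothetical HVPA resource: embeds an HPA template and a VPA template,
# each with its own update policy, for a single targetRef.
apiVersion: autoscaling.k8s.io/v1alpha1   # assumed group/version
kind: Hvpa
metadata:
  name: kube-apiserver
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kube-apiserver
  hpa:
    updatePolicy:
      updateMode: "Auto"
    template:
      spec:
        minReplicas: 1
        maxReplicas: 4
        metrics:
        - type: Resource
          resource:
            name: cpu
            targetAverageUtilization: 80
  vpa:
    updatePolicy:
      updateMode: "ScaleUp"   # scale up only; avoid disruptive scale-down
    template:
      spec:
        resourcePolicy:
          containerPolicies:
          - containerName: kube-apiserver
            maxAllowed:
              cpu: "8"
              memory: 25G
```

The controller would then deploy and reconcile the actual HPA and VPA resources from these templates, and adopt pre-existing ones that match the HVPA's selectors, as described in the tasks above.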