This repository was archived by the owner on Jan 13, 2025. It is now read-only.

☂️ Gardener Horizontal & Vertical Pod Autoscaler, a.k.a. HVPA (v2) #30

@amshuman-kr

Description


Feature (What you would like to be added):
Summarise the roadmap for HVPA with links to the corresponding issues.

Motivation (Why is this needed?):
A central place to collect the roadmap as well as the progress.

Approach/Hint to implement the solution (optional):

General Principles
  • The goal of HVPA is to re-use the upstream components HPA and VPA as much as possible for scaling components horizontally and vertically respectively.
    • HPA for recommendation and scaling horizontally.
    • VPA for recommendation for scaling vertically.
  • Where there are gaps in using HPA and VPA simultaneously to scale a given component, introduce functionality to fill those gaps.
    • HPA and VPA are recommended to be mixed only if HPA is used for custom/external metrics, as mentioned here. But in some scenarios it might make sense to use both even for CPU and memory (e.g. kube-apiserver or ingress).
    • VPA updates the pods directly (via webhooks), whereas HPA scales the upstream targetRefs. The VPA approach duplicates/overrides the update mechanisms (such as rolling updates) that the upstream targetRefs might have implemented.
  • Where there is functionality missing in either HPA or VPA, introduce it to provide more flexibility during horizontal and vertical scaling, especially for components that experience disruption during scaling.
    • Weight-based scaling horizontally and vertically simultaneously.
    • Support for configurable (at the HVPA resource level) threshold levels to trigger VPA (and possibly HPA) updates to minimise unnecessary scaling of components. Especially, if scaling is disruptive.
  • Support for configurable (at the HVPA resource level) stabilisation window in all four directions (up/down/out/in) to stabilise scaling of components. Especially, if scaling is disruptive.
  • Support for configurable maintenance window (at the HVPA resource level) for scaling (especially, scaling in/down) for components that do not scale well smoothly (mainly, etcd, but to a lesser extent kube-apiserver as well for WATCH requests). This could be as an alternative or complementary to the stabilisation window mentioned above.
  • Support for flexible update policy for all four scaling directions (Off/Auto/ScaleUp). ScaleUp would only apply scale up and not scale down (vertically or horizontally). This is again from the perspective of components which experience disruption while scaling (mainly, etcd, but to a lesser extent kube-apiserver as well for WATCH requests). For such components, a ScaleUp update policy will ensure that the component can scale up (with some disruption) automatically to meet the workload requirement but not scale down to avoid unnecessary disruption. This would mean over-provisioning for workloads that experience a short upsurge.
  • Alerts when some percentage threshold of the maxAllowed is reached for requests of any container in the targetRef.
  • Support for custom resources as targetRefs.
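Taken together, the principles above describe a single HVPA resource that embeds HPA and VPA templates alongside the extra knobs (weights, per-direction update modes, stabilisation). A minimal sketch of what such a resource could look like; the field names here loosely follow the HVPA proposal and are illustrative, not a guaranteed match for the released CRD schema:

```yaml
# Illustrative only: field names approximate the HVPA proposal.
apiVersion: autoscaling.k8s.io/v1alpha1
kind: Hvpa
metadata:
  name: kube-apiserver
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kube-apiserver
  hpa:
    deploy: true
    template:
      spec:
        minReplicas: 1
        maxReplicas: 4
        metrics:
        - type: Resource
          resource:
            name: cpu
            targetAverageUtilization: 80
  vpa:
    deploy: true
    scaleUp:
      updatePolicy:
        updateMode: "Auto"
      stabilizationDuration: "3m"
    scaleDown:
      updatePolicy:
        updateMode: "Off"       # ScaleUp-only behaviour for disruptive components
      stabilizationDuration: "15m"
  weightBasedScalingIntervals:
  - vpaWeight: 100              # scale vertically first ...
    startReplicaCount: 1
    lastReplicaCount: 3
  - vpaWeight: 0                # ... then horizontally near the replica ceiling
    startReplicaCount: 4
    lastReplicaCount: 4
```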
Tasks
  • HVPA custom resource to include templates for HPA and VPA.
  • Controller logic to deploy and reconcile HPA and VPA resources based on the templates in the HVPA spec.
  • Controller logic to adopt pre-existing HPA and VPA resources if they match the selectors in the HVPA spec.
  • Auto update policy for HPA updates. HPA takes care of both recommendation and updates for horizontal scaling. This implementation of the Auto update policy is temporary, pending "Evaluate options for controlling HPA-based scaling" #7.
  • Off, Auto and ScaleUp update policy for VPA updates. VPA is used only for recommendation and not for updates. Fixed with HVPA now supports UpdateMode "off" for HPA and VPA #19.
  • Off update policy for HPA updates. Implemented by not deploying/deleting the HPA resource. This implementation of the Off update policy is temporary, pending "Evaluate options for controlling HPA-based scaling" #7.
  • Weight-based scaling for VPA updates with any value between 0 and 100 for VPA weight. Fixed with HVPA now supports UpdateMode "off" for HPA and VPA #19.
  • Weight-based scaling for HPA updates with values 0 or 100 for HPA weight.
  • Update the proposal/documentation to be in sync with feature and behaviour changes. This is an ongoing task.
  • Consolidate and keep up to date the FAQs/Recommended Actions documents as a first point of reference for operators/admins. This is an ongoing task.
  • Release HVPA implemented so far in different landscapes to gain experience. Prio 1.
  • Enable Auto update policy (i.e. enable scale down) for kube-apiserver to reduce cost implications. The ScaleUp update policy would continue for etcd for the time being because scale down could be disruptive. Prio 1.
  • If an OOMKill or CPU overload happens, override stabilisation window as well as HPA weight to apply the weighted VPA recommendation. Prio 1.
  • Auto-scale limits in sync with the scaling of requests. Prio 1.
  • Unit tests and Integration tests (using Test Machinery). Prio 2.
  • Alerts when some percentage threshold of the maxAllowed is reached for requests of any container in the targetRef. Prio 2.
  • Scale down during a maintenance window. This would be used for components that experience disruption during scaling. Prio 2.
  • Implement and use the Scale subresource in the HVPA CRD to control HPA updates fully and use HPA only for recommendation. Pending "Evaluate options for controlling HPA-based scaling" #7. Prio 3.
    • Implement ScaleUp update policy for HPA updates. Prio 3.
    • Change the Off update policy implementation for HPA to deploy/reconcile HPA resource even in the Off mode. Retain the recommendations but block the updates. Prio 3.
    • Weight-based scaling for HPA updates with any value between 0 and 100 as weight for HPA. Prio 3.
  • Submit and drive adoption of KEP for Resources subresource (per container) along the lines of the Scale subresource. This can then be used to implement the support for custom resources as targetRef. Prio 4.
  • Recovery/ramp-up of overloaded/crashing targetRef. Prio 5.
  • Pro-actively throttle/ramp-down soon-to-be overloaded targetRef to avoid crash. Prio 5.
  • Support for custom resources as targetRef. If the KEP for Resources subresources is not yet accepted, then this could be implemented using annotations to supply the desired metadata. Prio 6.
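Several of the weight-based tasks above hinge on splitting one recommendation between vertical and horizontal scaling. A minimal sketch of the arithmetic, assuming vpaWeight is a percentage in [0, 100] applied to the gap between the current request and the VPA recommendation (the function name is hypothetical, not the controller's actual code):

```go
package main

import "fmt"

// blend returns the weighted vertical target: only vpaWeight percent of the
// gap between the current request and the VPA recommendation is applied
// vertically; the remainder (100 - vpaWeight) is left to horizontal scaling.
// Works for scale up and scale down, since the gap may be negative.
func blend(current, recommended, vpaWeight int64) int64 {
	return current + (recommended-current)*vpaWeight/100
}

func main() {
	// Current request 1000m CPU, VPA recommends 2000m, vpaWeight 40:
	// only 40% of the 1000m gap is applied vertically.
	fmt.Println(blend(1000, 2000, 40)) // 1400
	// vpaWeight 0 leaves requests untouched; 100 applies the full recommendation.
	fmt.Println(blend(1000, 2000, 0), blend(1000, 2000, 100)) // 1000 2000
}
```

With vpaWeight restricted to 0 or 100 (the current HPA-side task), this degenerates to "all horizontal" or "all vertical"; the Prio 3 tasks generalise it to intermediate weights.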

Labels: kind/enhancement (Enhancement, improvement, extension), kind/epic (Large multi-story topic), kind/roadmap (Roadmap BLI), lifecycle/rotten (Nobody worked on this for 12 months, final aging stage)
