This repository was archived by the owner on Jan 13, 2025. It is now read-only.

☂️ Gardener Horizontal & Vertical Pod Autoscaler, a.k.a. HVPA (v2) #30

@amshuman-kr

Description


Feature (What you would like to be added):
Summarise the roadmap for HVPA with links to the corresponding issues.

Motivation (Why is this needed?):
A central place to collect the roadmap as well as the progress.

Approach/Hint to implement the solution (optional):

General Principles
  • The goal of HVPA is to re-use the upstream components HPA and VPA as much as possible for scaling components horizontally and vertically respectively.
    • HPA for recommendation and scaling horizontally.
    • VPA for recommendation for scaling vertically.
  • Where there are gaps in using HPA and VPA simultaneously to scale a given component, introduce functionality to fill those gaps.
    • HPA and VPA are recommended to be mixed only if HPA is used for custom/external metrics, as mentioned here. But in some scenarios it might make sense to use both even for CPU and memory (e.g. kube-apiserver or ingress).
    • VPA updates the pods directly (via webhooks), whereas HPA scales the upstream targetRefs. The VPA approach duplicates/overrides the update mechanisms (such as rolling updates) that the upstream targetRefs might have implemented.
  • Where there is functionality missing in either HPA or VPA, introduce it to provide more flexibility during horizontal and vertical scaling, especially for components that experience disruption during scaling.
    • Weight-based scaling horizontally and vertically simultaneously.
    • Support for configurable (at the HVPA resource level) threshold levels to trigger VPA (and possibly HPA) updates to minimise unnecessary scaling of components. Especially, if scaling is disruptive.
  • Support for configurable (at the HVPA resource level) stabilisation window in all four directions (up/down/out/in) to stabilise scaling of components. Especially, if scaling is disruptive.
  • Support for configurable maintenance window (at the HVPA resource level) for scaling (especially, scaling in/down) for components that do not scale well smoothly (mainly, etcd, but to a lesser extent kube-apiserver as well for WATCH requests). This could be as an alternative or complementary to the stabilisation window mentioned above.
  • Support for flexible update policy for all four scaling directions (Off/Auto/ScaleUp). ScaleUp would only apply scale up and not scale down (vertically or horizontally). This is again from the perspective of components which experience disruption while scaling (mainly, etcd, but to a lesser extent kube-apiserver as well for WATCH requests). For such components, a ScaleUp update policy will ensure that the component can scale up (with some disruption) automatically to meet the workload requirement but not scale down to avoid unnecessary disruption. This would mean over-provisioning for workloads that experience a short upsurge.
  • Alerts when some percentage threshold of the maxAllowed is reached for requests of any container in the targetRef.
  • Support for custom resources as targetRefs.
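Taken together, the principles above describe a single HVPA resource that embeds HPA and VPA templates alongside the extra knobs (weights, per-direction update modes, stabilisation). A minimal sketch of what such a resource could look like; the field names here loosely follow the HVPA proposal and are illustrative, not a guaranteed match for the released CRD schema:

```yaml
# Illustrative only: field names approximate the HVPA proposal.
apiVersion: autoscaling.k8s.io/v1alpha1
kind: Hvpa
metadata:
  name: kube-apiserver
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kube-apiserver
  hpa:
    deploy: true
    template:
      spec:
        minReplicas: 1
        maxReplicas: 4
        metrics:
        - type: Resource
          resource:
            name: cpu
            targetAverageUtilization: 80
  vpa:
    deploy: true
    scaleUp:
      updatePolicy:
        updateMode: "Auto"
      stabilizationDuration: "3m"
    scaleDown:
      updatePolicy:
        updateMode: "Off"       # ScaleUp-only behaviour for disruptive components
      stabilizationDuration: "15m"
  weightBasedScalingIntervals:
  - vpaWeight: 100              # scale vertically first ...
    startReplicaCount: 1
    lastReplicaCount: 3
  - vpaWeight: 0                # ... then horizontally near the replica ceiling
    startReplicaCount: 4
    lastReplicaCount: 4
```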
Tasks
  • HVPA custom resource to include templates for HPA and VPA.
  • Controller logic to deploy and reconcile HPA and VPA resources based on the templates in the HVPA spec.
  • Controller logic to adopt pre-existing HPA and VPA resources if they match the selectors in the HVPA spec.
  • Auto update policy for HPA updates. HPA takes care of both recommendation and updates for horizontal scaling. This implementation of the Auto update policy is temporary, pending "Evaluate options for controlling HPA-based scaling" #7.
  • Off, Auto and ScaleUp update policy for VPA updates. VPA is used only for recommendation and not for updates. Fixed with HVPA now supports UpdateMode "off" for HPA and VPA #19.
  • Off update policy for HPA updates. Implemented by not deploying/deleting the HPA resource. This implementation of the Off update policy is temporary, pending "Evaluate options for controlling HPA-based scaling" #7.
  • Weight-based scaling for VPA updates with any value between 0 and 100 for VPA weight. Fixed with HVPA now supports UpdateMode "off" for HPA and VPA #19.
  • Weight-based scaling for HPA updates with values 0 or 100 for HPA weight.
  • Update the proposal/documentation to be in sync with feature and behaviour changes. This is an ongoing task.
  • Consolidate and keep up to date the FAQs/Recommended Actions documents as a first point of reference for operators/admins. This is an ongoing task.
  • Release HVPA implemented so far in different landscapes to gain experience. Prio 1.
  • Enable Auto update policy (i.e. enable scale down) for kube-apiserver to reduce cost implications. The ScaleUp update policy would continue for etcd for the time being because scale down could be disruptive. Prio 1.
  • If an OOMKill or CPU overload happens, override stabilisation window as well as HPA weight to apply the weighted VPA recommendation. Prio 1.
  • Auto-scale limits in sync with the scaling of requests. Prio 1.
  • Unit tests and Integration tests (using Test Machinery). Prio 2.
  • Alerts when some percentage threshold of the maxAllowed is reached for requests of any container in the targetRef. Prio 2.
  • Scale down during a maintenance window. This would be used for components that experience disruption during scaling. Prio 2.
  • Implement and use the Scale subresource in the HVPA CRD to control HPA updates fully and use HPA only for recommendation. Pending "Evaluate options for controlling HPA-based scaling" #7. Prio 3.
    • Implement ScaleUp update policy for HPA updates. Prio 3.
    • Change the Off update policy implementation for HPA to deploy/reconcile HPA resource even in the Off mode. Retain the recommendations but block the updates. Prio 3.
    • Weight-based scaling for HPA updates with any value between 0 and 100 as weight for HPA. Prio 3.
  • Submit and drive adoption of KEP for Resources subresource (per container) along the lines of the Scale subresource. This can then be used to implement the support for custom resources as targetRef. Prio 4.
  • Recovery/ramp-up of overloaded/crashing targetRef. Prio 5.
  • Pro-actively throttle/ramp-down soon-to-be overloaded targetRef to avoid crash. Prio 5.
  • Support for custom resources as targetRef. If the KEP for Resources subresources is not yet accepted, then this could be implemented using annotations to supply the desired metadata. Prio 6.
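Several of the weight-based tasks above hinge on splitting one recommendation between vertical and horizontal scaling. A minimal sketch of the arithmetic, assuming vpaWeight is a percentage in [0, 100] applied to the gap between the current request and the VPA recommendation (the function name is hypothetical, not the controller's actual code):

```go
package main

import "fmt"

// blend returns the weighted vertical target: only vpaWeight percent of the
// gap between the current request and the VPA recommendation is applied
// vertically; the remainder (100 - vpaWeight) is left to horizontal scaling.
// Works for scale up and scale down, since the gap may be negative.
func blend(current, recommended, vpaWeight int64) int64 {
	return current + (recommended-current)*vpaWeight/100
}

func main() {
	// Current request 1000m CPU, VPA recommends 2000m, vpaWeight 40:
	// only 40% of the 1000m gap is applied vertically.
	fmt.Println(blend(1000, 2000, 40)) // 1400
	// vpaWeight 0 leaves requests untouched; 100 applies the full recommendation.
	fmt.Println(blend(1000, 2000, 0), blend(1000, 2000, 100)) // 1000 2000
}
```

With vpaWeight restricted to 0 or 100 (the current HPA-side task), this degenerates to "all horizontal" or "all vertical"; the Prio 3 tasks generalise it to intermediate weights.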

Labels: kind/enhancement (Enhancement, improvement, extension), kind/epic (Large multi-story topic), kind/roadmap (Roadmap BLI), lifecycle/rotten (Nobody worked on this for 12 months, final aging stage)
