Merged
21 commits
770b767
VPA: Add UpdateModeInPlaceOrRecreate to types
jkyros Mar 14, 2024
6f86a98
VPA: Introduce VPA feature gates; add InPlaceOrRecreate feature gate
maxcao13 Mar 7, 2025
eb15361
VPA: Allow deploying InPlaceOrRecreate in local e2e and ci
maxcao13 Mar 15, 2025
b37a3eb
VPA: Allow admission-controller to validate in-place spec
maxcao13 Mar 22, 2025
2af23c8
VPA: Add metrics gauges for in-place updates
jkyros Mar 14, 2024
6ebeb83
VPA: Allow updater to actuate InPlaceOrRecreate updates
maxcao13 Mar 22, 2025
7df0c2f
VPA: Updater in-place updates unit tests
maxcao13 Mar 22, 2025
d6376c4
VPA: fixup vpa-process-yaml.sh script
maxcao13 Mar 24, 2025
15883dc
VPA: Update vpa-rbac.yaml for allowing in place resize requests
maxcao13 Mar 24, 2025
9eac8fc
VPA: refactor in-place and eviction logic
maxcao13 Apr 3, 2025
c5eecc6
address raywainman and omerap12 comments
maxcao13 Apr 17, 2025
11e7560
Add docs for in-place updates
omerap12 May 4, 2025
94d55a5
Update vertical-pod-autoscaler/docs/features.md
omerap12 May 4, 2025
036a482
Adjust comments
omerap12 May 4, 2025
8806d18
Update features.md
omerap12 May 5, 2025
8a9a4b8
VPA: bump up overall e2e test timeout
maxcao13 Apr 25, 2025
087e946
VPA: add InPlaceOrRecreate e2e tests
maxcao13 Apr 29, 2025
4f18830
VPA: refactor e2e test ginkgo wrapper functions
maxcao13 May 7, 2025
2a3764d
VPA: use sha256 digest for local kind image
maxcao13 May 9, 2025
66b4c96
VPA: fix InPlaceOrRecreate feature gate version
maxcao13 May 9, 2025
3039f3c
VPA: upgrade InPlacePodVerticalScaling internal logic to k8s 1.33
maxcao13 May 9, 2025
@@ -47,15 +47,3 @@ spec:
        - name: tls-certs
          secret:
            secretName: vpa-tls-certs
---
apiVersion: v1
kind: Service
metadata:
  name: vpa-webhook
  namespace: kube-system
spec:
  ports:
    - port: 443
      targetPort: 8000
  selector:
    app: vpa-admission-controller
11 changes: 11 additions & 0 deletions vertical-pod-autoscaler/deploy/admission-controller-service.yaml
@@ -0,0 +1,11 @@
apiVersion: v1
kind: Service
metadata:
  name: vpa-webhook
  namespace: kube-system
spec:
  ports:
    - port: 443
      targetPort: 8000
  selector:
    app: vpa-admission-controller
26 changes: 26 additions & 0 deletions vertical-pod-autoscaler/deploy/vpa-rbac.yaml
@@ -124,6 +124,32 @@ rules:
      - create
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-updater-in-place
rules:
  - apiGroups:
      - ""
    resources:
      - pods/resize
      - pods # required for patching vpaInPlaceUpdated annotations onto the pod
    verbs:
      - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-updater-in-place-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-updater-in-place
subjects:
  - kind: ServiceAccount
    name: vpa-updater
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:metrics-reader
1 change: 1 addition & 0 deletions vertical-pod-autoscaler/deploy/vpa-v1-crd-gen.yaml
@@ -458,6 +458,7 @@ spec:
- "Off"
- Initial
- Recreate
- InPlaceOrRecreate
- Auto
type: string
type: object
7 changes: 4 additions & 3 deletions vertical-pod-autoscaler/docs/api.md
@@ -155,7 +155,7 @@ _Appears in:_

| Field | Description | Default | Validation |
| --- | --- | --- | --- |
| `updateMode` _[UpdateMode](#updatemode)_ | Controls when autoscaler applies changes to the pod resources.<br />The default is 'Auto'. | | Enum: [Off Initial Recreate Auto] <br /> |
| `updateMode` _[UpdateMode](#updatemode)_ | Controls when autoscaler applies changes to the pod resources.<br />The default is 'Auto'. | | Enum: [Off Initial Recreate InPlaceOrRecreate Auto] <br /> |
| `minReplicas` _integer_ | Minimal number of replicas which need to be alive for Updater to attempt<br />pod eviction (pending other checks like PDB). Only positive values are<br />allowed. Overrides global '--min-replicas' flag. | | |
| `evictionRequirements` _[EvictionRequirement](#evictionrequirement) array_ | EvictionRequirements is a list of EvictionRequirements that need to<br />evaluate to true in order for a Pod to be evicted. If more than one<br />EvictionRequirement is specified, all of them need to be fulfilled to allow eviction. | | |

@@ -208,7 +208,7 @@ _Underlying type:_ _string_
UpdateMode controls when autoscaler applies changes to the pod resources.

_Validation:_
- Enum: [Off Initial Recreate Auto]
- Enum: [Off Initial Recreate InPlaceOrRecreate Auto]

_Appears in:_
- [PodUpdatePolicy](#podupdatepolicy)
@@ -218,7 +218,8 @@ _Appears in:_
| `Off` | UpdateModeOff means that autoscaler never changes Pod resources.<br />The recommender still sets the recommended resources in the<br />VerticalPodAutoscaler object. This can be used for a "dry run".<br /> |
| `Initial` | UpdateModeInitial means that autoscaler only assigns resources on pod<br />creation and does not change them during the lifetime of the pod.<br /> |
| `Recreate` | UpdateModeRecreate means that autoscaler assigns resources on pod<br />creation and additionally can update them during the lifetime of the<br />pod by deleting and recreating the pod.<br /> |
| `Auto` | UpdateModeAuto means that autoscaler assigns resources on pod creation<br />and additionally can update them during the lifetime of the pod,<br />using any available update method. Currently this is equivalent to<br />Recreate, which is the only available update method.<br /> |
| `Auto` | UpdateModeAuto means that autoscaler assigns resources on pod creation<br />and additionally can update them during the lifetime of the pod,<br />using any available update method. Currently this is equivalent to<br />Recreate.<br /> |
| `InPlaceOrRecreate` | UpdateModeInPlaceOrRecreate means that autoscaler tries to assign resources in-place.<br />If this is not possible (e.g., resizing takes too long or is infeasible), it falls back to the<br />"Recreate" update mode.<br />Requires VPA level feature gate "InPlaceOrRecreate" to be enabled<br />on the admission and updater pods.<br />Requires cluster feature gate "InPlacePodVerticalScaling" to be enabled.<br /> |


#### VerticalPodAutoscaler
78 changes: 77 additions & 1 deletion vertical-pod-autoscaler/docs/features.md
@@ -4,6 +4,8 @@

- [Limits control](#limits-control)
- [Memory Value Humanization](#memory-value-humanization)
- [CPU Recommendation Rounding](#cpu-recommendation-rounding)
- [In-Place Updates](#in-place-updates-inplaceorrecreate)

## Limits control

@@ -50,4 +52,78 @@ To enable this feature, set the --round-cpu-millicores flag when running the VPA

```bash
--round-cpu-millicores=50
```

## In-Place Updates (`InPlaceOrRecreate`)

> [!WARNING]
> FEATURE STATE: VPA v1.4.0 [alpha]

VPA supports in-place updates to reduce disruption when applying resource recommendations. This feature relies on Kubernetes' in-place pod resize capability (beta as of Kubernetes 1.33) to modify container resources without recreating the pod.
For more information, see [AEP-4016: Support for in place updates in VPA](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/enhancements/4016-in-place-updates-support).

### Usage

To use in-place updates, set the VPA's `updateMode` to `InPlaceOrRecreate`:
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-vpa
spec:
  updatePolicy:
    updateMode: "InPlaceOrRecreate"
```
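
The manifest above shows only the update policy. In practice a VPA object also needs a `targetRef` that selects the workload to scale. A fuller sketch, assuming a Deployment named `my-app` (a hypothetical name used only for illustration):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-vpa
spec:
  # Assumption: the workload to scale is a Deployment called "my-app".
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "InPlaceOrRecreate"
```

Since the admission controller validates the in-place spec, an object like this is likely to be rejected when the `InPlaceOrRecreate` feature gate is not enabled there.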

### Behavior

In `InPlaceOrRecreate` mode, VPA first attempts to apply updates in place. If the in-place update fails, VPA falls back to pod recreation.

Updates are attempted when:

* Container requests are outside the recommended bounds
* A quick OOM occurs
* The pod is long-running (>12h) and the recommendation differs significantly (>10%) from its current requests

### Important Notes

* Disruption possibility: in-place updates aim to minimize disruption but cannot guarantee zero disruption, because the container runtime performs the actual resize.

* Memory limit downscaling: in the beta version of in-place resize, lowering a memory limit is not supported for pods with the `PreferNoRestart` resize policy. In such cases, VPA falls back to pod recreation.
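
The resize policy mentioned above is set per container in the pod spec. The fragment below is a minimal sketch with placeholder container and image names. One assumption to verify against your cluster's API reference: the KEP calls the no-restart policy `PreferNoRestart`, while the Kubernetes 1.33 API appears to spell the corresponding `restartPolicy` value `NotRequired`.

```yaml
# Fragment of a pod template; the container name and image are placeholders.
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0
      resizePolicy:
        - resourceName: cpu
          restartPolicy: NotRequired       # try to resize CPU without a container restart
        - resourceName: memory
          restartPolicy: RestartContainer  # restart the container when memory changes
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
```

Choosing `RestartContainer` for memory trades a container restart for the ability to lower memory limits in place, which may be preferable to a full pod recreation.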

### Requirements

* Kubernetes 1.33+ with `InPlacePodVerticalScaling` feature gate enabled
* VPA version 1.4.0+ with `InPlaceOrRecreate` feature gate enabled

### Configuration

Enable the feature by setting the following flag on both the updater and the admission-controller:

```bash
--feature-gates=InPlaceOrRecreate=true
```
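
One way to pass the flag is through the container `args` of each component's Deployment. The fragment below is a sketch for the updater. The Deployment name `vpa-updater` and the container name `updater` are assumptions about the stock manifests, and the admission-controller Deployment would need the same argument.

```yaml
# Fragment of the (assumed) vpa-updater Deployment in the kube-system namespace.
spec:
  template:
    spec:
      containers:
        - name: updater  # assumed container name; adjust to match your manifest
          args:
            - --feature-gates=InPlaceOrRecreate=true
```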

### Limitations

* All containers in a pod are updated together (partial updates not supported)
* Memory downscaling requires careful consideration to prevent OOMs
* Updates still respect VPA's standard update conditions and timing restrictions
* In-place updates will fail if they would result in a change to the pod's QoS class

### Fallback Behavior

VPA will fall back to pod recreation in the following scenarios:

* In-place update is [infeasible](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/1287-in-place-update-pod-resources/README.md#resize-status) (node resources, etc.)
* Update is [deferred](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/1287-in-place-update-pod-resources/README.md#resize-status) for more than 5 minutes
* Update is in progress for more than 1 hour
* [Pod QoS](https://kubernetes.io/docs/concepts/workloads/pods/pod-qos/) class would change due to the update
* Memory limit downscaling is required with [PreferNoRestart policy](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/1287-in-place-update-pod-resources/README.md#container-resize-policy)
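
The deferred and infeasible states are reported by the kubelet on the pod itself. As an illustration, assuming Kubernetes 1.33 (where resize state is surfaced as pod conditions rather than the older `status.resize` field), a pod whose resize cannot fit on its node might report something like the following. The message text is made up.

```yaml
# Fragment of a pod's status; illustrative only.
status:
  conditions:
    - type: PodResizePending
      status: "True"
      reason: Infeasible  # or Deferred, when the resize may fit later
      message: "node does not have enough capacity"  # invented example text
```

A resize that has been accepted and is being applied is expected to show a `PodResizeInProgress` condition instead.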

### Monitoring

VPA provides metrics to track in-place update operations:

* `vpa_in_place_updatable_pods_total`: Number of pods matching in-place update criteria
* `vpa_in_place_updated_pods_total`: Number of pods successfully updated in-place
* `vpa_vpas_with_in_place_updatable_pods_total`: Number of VPAs with pods eligible for in-place updates
* `vpa_vpas_with_in_place_updated_pods_total`: Number of VPAs with successfully in-place updated pods
3 changes: 3 additions & 0 deletions vertical-pod-autoscaler/docs/flags.md
@@ -14,6 +14,7 @@ This document is auto-generated from the flag definitions in the VPA admission-c
| `--address` | ":8944" | The address to expose Prometheus metrics. |
| `--alsologtostderr` | | log to standard error as well as files (no effect when -logtostderr=true) |
| `--client-ca-file` | "/etc/tls-certs/caCert.pem" | Path to CA PEM file. |
| `--feature-gates` | | A set of key=value pairs that describe feature gates for alpha/experimental features. Options are: |
| `--ignored-vpa-object-namespaces` | | A comma-separated list of namespaces to ignore when searching for VPA objects. Leave empty to avoid ignoring any namespaces. These namespaces will not be cleaned by the garbage collector. |
| `--kube-api-burst` | 10 | QPS burst limit when making requests to Kubernetes apiserver |
| `--kube-api-qps` | 5 | QPS limit when making requests to Kubernetes apiserver |
@@ -67,6 +68,7 @@ This document is auto-generated from the flag definitions in the VPA recommender
| `--cpu-integer-post-processor-enabled` | | Enable the cpu-integer recommendation post processor. The post processor will round up CPU recommendations to a whole CPU for pods which were opted in by setting an appropriate label on VPA object (experimental) |
| `--external-metrics-cpu-metric` | | ALPHA. Metric to use with external metrics provider for CPU usage. |
| `--external-metrics-memory-metric` | | ALPHA. Metric to use with external metrics provider for memory usage. |
| `--feature-gates` | | A set of key=value pairs that describe feature gates for alpha/experimental features. Options are: |
| `--history-length` | "8d" | How much time back prometheus have to be queried to get historical metrics |
| `--history-resolution` | "1h" | Resolution at which Prometheus is queried for historical metrics |
| `--humanize-memory` | | Convert memory values in recommendations to the highest appropriate SI unit with up to 2 decimal places for better readability. |
@@ -137,6 +139,7 @@ This document is auto-generated from the flag definitions in the VPA updater cod
| `--eviction-rate-burst` | 1 | Burst of pods that can be evicted. |
| `--eviction-rate-limit` | | Number of pods that can be evicted per seconds. A rate limit set to 0 or -1 will disable |
| `--eviction-tolerance` | 0.5 | Fraction of replica count that can be evicted for update, if more than one pod can be evicted. |
| `--feature-gates` | | A set of key=value pairs that describe feature gates for alpha/experimental features. Options are: |
| `--ignored-vpa-object-namespaces` | | A comma-separated list of namespaces to ignore when searching for VPA objects. Leave empty to avoid ignoring any namespaces. These namespaces will not be cleaned by the garbage collector. |
| `--in-recommendation-bounds-eviction-lifetime-threshold` | 12h0m0s | Pods that live for at least that long can be evicted even if their request is within the [MinRecommended...MaxRecommended] range |
| `--kube-api-burst` | 10 | QPS burst limit when making requests to Kubernetes apiserver |
10 changes: 10 additions & 0 deletions vertical-pod-autoscaler/docs/installation.md
@@ -138,6 +138,16 @@ To print YAML contents with all resources that would be understood by
The output of that command won't include secret information generated by
[pkg/admission-controller/gencerts.sh](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/pkg/admission-controller/gencerts.sh) script.

### Feature gates

To install VPA with feature gates enabled, set the `FEATURE_GATES` environment variable.

For example, to enable the `InPlaceOrRecreate` feature gate:

```console
FEATURE_GATES="InPlaceOrRecreate=true" ./hack/vpa-up.sh
```

## Tear down

Note that if you stop running VPA in your cluster, the resource requests