Description
Which component are you using?:
/area vertical-pod-autoscaler
Is your feature request designed to solve a problem? If so describe the problem this feature should solve.:
When VPA 1.4.0+ runs with in-place resource updates enabled against a Kubernetes cluster where InPlacePodVerticalScaling is disabled (see Kubernetes feature gates), nothing indicates that the resulting resource update failure is caused by an incompatible cluster/configuration.
The following snippet shows logs of vpa-updater attempting to in-place update a resource-consumer instance, but due to the missing resize sub-resource
➜ vertical-pod-autoscaler git:(vertical-pod-autoscaler/v1.4.1) ✗ kubectl get --raw='/api/v1' | jq -Mr '.resources.[].name' | grep -i 'pod'
pods
pods/attach
pods/binding
pods/ephemeralcontainers
pods/eviction
pods/exec
pods/log
pods/portforward
pods/proxy
pods/status
podtemplates
the resource update fails:
I0702 07:33:47.945742 1 update_priority_calculator.go:145] "Pod accepted for update" pod="kube-system/resource-consumer-7fd9594844-dfzft" updatePriority=3.166666666666667 processedRecommendations="resource-consumer: target: 83887k 25m; uncappedTarget: 83887k 25m;"
I0702 07:33:47.946005 1 recommendation_provider.go:121] "Updating requirements for pod" pod="resource-consumer-7fd9594844-dfzft"
I0702 07:33:47.946099 1 pods_inplace_restriction.go:128] "Calculated patches for pod" pod="kube-system/resource-consumer-7fd9594844-dfzft" patches=[{"op":"add","path":"/spec/containers/0/resources/requests/cpu","value":"25m"},{"op":"add","path":"/spec/containers/0/resources/requests/memory","value":"80Mi"}]
I0702 07:33:47.946115 1 pods_inplace_restriction.go:128] "Calculated patches for pod" pod="kube-system/resource-consumer-7fd9594844-dfzft" patches=[{"op":"add","path":"/metadata/annotations/vpaInPlaceUpdated","value":"true"}]
I0702 07:33:47.947660 1 updater.go:286] "In-place update failed" error="the server could not find the requested resource" pod="kube-system/resource-consumer-7fd9594844-dfzft"
This leaves only a "the server could not find the requested resource" message, which can be misleading to users who are not familiar with the required configuration (e.g. having the InPlacePodVerticalScaling gate enabled for the apiserver & kubelet).
The snippets below can be used to reproduce the setup used for testing:
- Kind cluster running Kubernetes 1.32:
  - Use
    kind create cluster --config=/path/to/snippet.yaml
    with the following config:
    kind: Cluster
    apiVersion: kind.x-k8s.io/v1alpha4
    name: k8s-132-in-place-disabled
    featureGates:
      "InPlacePodVerticalScaling": false
    nodes:
    - role: control-plane
      image: kindest/node:v1.32.0
    - role: worker
      image: kindest/node:v1.32.0
- Target the newly created cluster
  - Get config:
    kind get kubeconfig --name=k8s-132-in-place-disabled > ./kubeconfig-in-place-disabled.yaml
  - Export config:
    export KUBECONFIG=./kubeconfig-in-place-disabled.yaml
- Verify that the pods/resize sub-resource is not present
  - Use
    kubectl get --raw='/api/v1' | jq -Mr '.resources.[].name' | grep -i 'pod'
    to list pod* (sub)resources.
- Enable the InPlaceOrRecreate feature gate for:
  - vpa-admission-controller (vertical-pod-autoscaler/deploy/admission-controller-deployment.yaml):
    - Add
      "--feature-gates=InPlaceOrRecreate=true"
      to the container args
  - vpa-updater (vertical-pod-autoscaler/deploy/updater-deployment.yaml):
    - Add
      - --feature-gates=InPlaceOrRecreate=true
      to the container args
- Deploy vpa with ./vertical-pod-autoscaler/hack/vpa-up.sh
- Use the snippet below to create the resource-consumer deployment and the corresponding vpa resource
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: resource-consumer
  labels:
    app: resource-consumer
spec:
  replicas: 3
  selector:
    matchLabels:
      app: resource-consumer
  template:
    metadata:
      labels:
        app: resource-consumer
    spec:
      containers:
      - name: resource-consumer
        image: gcr.io/k8s-staging-e2e-test-images/resource-consumer:1.9
        resources:
          requests:
            cpu: 10m
            memory: 30Mi
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: resource-consumer
  labels:
    app: resource-consumer
spec:
  selector:
    app: resource-consumer
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 8080
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: resource-consumer
  labels:
    app: resource-consumer
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: resource-consumer
  updatePolicy:
    updateMode: "InPlaceOrRecreate"
  resourcePolicy:
    containerPolicies:
    - containerName: resource-consumer
      minAllowed:
        cpu: 10m
        memory: 25Mi
      maxAllowed:
        cpu: 300m
        memory: 300Mi
- Update the vpa resource (named resource-consumer) maxAllowed/minAllowed to trigger a resource update.
- Monitor the vpa-updater logs
Describe the solution you'd like.:
Performing a patch request targeting the resize sub-resource:
autoscaler/vertical-pod-autoscaler/pkg/updater/restriction/pods_inplace_restriction.go
Lines 137 to 140 in ffe6219
res, err := ip.client.CoreV1().Pods(podToUpdate.Namespace).Patch(context.TODO(), podToUpdate.Name, k8stypes.JSONPatchType, patch, metav1.PatchOptions{}, "resize")
if err != nil {
	return err
}
returns an error that does not provide much context. Within the vpa-updater, the function caller does indicate that the error is in-place update related, but it's of little help when debugging:
autoscaler/vertical-pod-autoscaler/pkg/updater/logic/updater.go
Lines 280 to 285 in ffe6219
err := inPlaceLimiter.InPlaceUpdate(pod, vpa, u.eventRecorder)
if err != nil {
	klog.V(0).InfoS("In-place update failed", "error", err, "pod", klog.KObj(pod))
	metrics_updater.RecordFailedInPlaceUpdate(vpaSize, "InPlaceUpdateError")
	continue
}
We need a mechanism that validates whether the Kubernetes cluster (i.e. kube-apiserver & kubelet) is compatible with in-place updates before the patch request is performed, and the log message should be improved to point out such cases. One possibility is to query the apiserver for this metadata (for example, whether the pods/resize sub-resource is served) and validate against it.
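To illustrate the idea, here is a minimal sketch of such a check. It is not VPA code: it assumes the validation is based on the same /api/v1 discovery document that the kubectl get --raw command above prints, and the names hasResizeSubresource, checkInPlaceSupport and errResizeUnsupported are hypothetical. It only uses the Go standard library; inside the updater the same lookup would more likely go through the client-go discovery client (e.g. ServerResourcesForGroupVersion("v1")), which is also an assumption here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// apiResourceList mirrors only the part of the /api/v1 discovery
// document that this sketch needs: the list of resource names.
type apiResourceList struct {
	Resources []struct {
		Name string `json:"name"`
	} `json:"resources"`
}

// errResizeUnsupported is a hypothetical error that names the likely cause
// instead of "the server could not find the requested resource".
var errResizeUnsupported = errors.New(
	"pods/resize sub-resource is not served by the apiserver; " +
		"check that the InPlacePodVerticalScaling feature gate is enabled")

// hasResizeSubresource reports whether the discovery document lists pods/resize.
func hasResizeSubresource(discoveryJSON []byte) (bool, error) {
	var list apiResourceList
	if err := json.Unmarshal(discoveryJSON, &list); err != nil {
		return false, fmt.Errorf("parsing discovery document: %w", err)
	}
	for _, r := range list.Resources {
		if r.Name == "pods/resize" {
			return true, nil
		}
	}
	return false, nil
}

// checkInPlaceSupport returns a descriptive error when in-place updates
// cannot work, so the caller can log it before attempting the patch.
func checkInPlaceSupport(discoveryJSON []byte) error {
	ok, err := hasResizeSubresource(discoveryJSON)
	if err != nil {
		return err
	}
	if !ok {
		return errResizeUnsupported
	}
	return nil
}

func main() {
	// Trimmed-down discovery documents for a cluster without and with the gate.
	disabled := []byte(`{"resources":[{"name":"pods"},{"name":"pods/status"}]}`)
	enabled := []byte(`{"resources":[{"name":"pods"},{"name":"pods/resize"}]}`)

	fmt.Println("gate disabled:", checkInPlaceSupport(disabled))
	fmt.Println("gate enabled: ", checkInPlaceSupport(enabled))
}
```

A check like this could run once at updater start-up or before the first in-place attempt, so the "In-place update failed" log entry can carry the more specific cause. Note that it only covers what the apiserver advertises; kubelet-side support would still need to be verified separately.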
Additional context.:
An additional detail from testing this behaviour: running a 1.32 cluster with the in-place updates feature gate enabled
featureGates:
  "InPlacePodVerticalScaling": true
allows the usage of VPA 1.4.0+ with the InPlaceOrRecreate updateMode.