VPA: Aggregates not pruned when container names change, stale aggregates result in erroneously split minimum resources #6744
Description
Which component are you using?:
vertical-pod-autoscaler
What version of the component are you using?:
Component version: 1.0.0
What k8s version are you using (kubectl version)?:
$ kubectl version
Client Version: v1.30.0
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.2
What environment is this in?:
GCP, kind, probably All
What did you expect to happen?:
When changing the name of a container in a deployment managed by a VPA, I expected the VPA recommender to react as though the container had been replaced, rather than a new container having been added, e.g.:
recommendation:
  containerRecommendations:
  - containerName: sleeper-a
    lowerBound:
      cpu: 25m
      memory: 262144k
    target:
      cpu: 25m
      memory: 262144k
    uncappedTarget:
      cpu: 25m
      memory: 262144k
    upperBound:
      cpu: 25m
      memory: 262144k
and then after the rename (from sleeper-a to sleeper-b) I'd expect something like:
recommendation:
  containerRecommendations:
  - containerName: sleeper-b
    lowerBound:
      cpu: 25m
      memory: 262144k
    target:
      cpu: 25m
      memory: 262144k
    uncappedTarget:
      cpu: 25m
      memory: 262144k
    upperBound:
      cpu: 25m
      memory: 262144k
What happened instead?:
The VPA reacts as though a container has been added to the deployment, splits the resources accordingly, keeps everything (aggregates, checkpoints, recommendations) for the old container, and never cleans them up:
recommendation:
  containerRecommendations:
  - containerName: sleeper-a
    lowerBound:
      cpu: 12m
      memory: 131072k
    target:
      cpu: 12m
      memory: 131072k
    uncappedTarget:
      cpu: 12m
      memory: 131072k
    upperBound:
      cpu: 4811m
      memory: "5029681818"
  - containerName: sleeper-b
    lowerBound:
      cpu: 12m
      memory: 131072k
    target:
      cpu: 12m
      memory: 131072k
    uncappedTarget:
      cpu: 12m
      memory: 131072k
    upperBound:
      cpu: 17291m
      memory: "18076954545"
This is especially problematic for containers whose resources drop low enough as a result of this split that they start failing health checks.
How to reproduce it (as minimally and precisely as possible):
This is just a deployment that sleeps and does nothing so it gets the default minimum resources.
Deployment + VPA:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sleeper
  namespace: default
spec:
  selector:
    matchLabels:
      app: sleeper
  replicas: 2
  template:
    metadata:
      labels:
        app: sleeper
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534 # nobody
      containers:
      - name: sleeper-a
        image: registry.k8s.io/ubuntu-slim:0.1
        resources:
          requests:
            cpu: 100m
            memory: 50Mi
        command: ["/bin/sh"]
        args:
        - "-c"
        - "sleep infinity"
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  annotations:
  name: vpa
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sleeper
  updatePolicy:
    updateMode: Recreate
After applying, wait for it to get a recommendation:
[jkyros@jkyros-thinkpadp1gen5 vpa-checkpoints]$ kubectl get vpa -n default vpa
...
status:
  conditions:
  - lastTransitionTime: "2024-04-18T23:30:50Z"
    status: "True"
    type: RecommendationProvided
  recommendation:
    containerRecommendations:
    - containerName: sleeper-a
      lowerBound:
        cpu: 25m
        memory: 262144k
      target:
        cpu: 25m
        memory: 262144k
      uncappedTarget:
        cpu: 25m
        memory: 262144k
      upperBound:
        cpu: 59411m
        memory: 62111500k
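For reference, these 25m / 262144k lower bounds for an idle container appear to just be the recommender's pod-level minimums; as far as I can tell the defaults are --pod-recommendation-min-cpu-millicores=25 and --pod-recommendation-min-memory-mb=250, and 250MB expressed in the recommendation's "k" units is:
$ echo "250 * 1024 * 1024 / 1000" | bc
262144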
Then change the name of the container (here I changed sleeper-a to sleeper-b):
containers:
- name: sleeper-b
  image: registry.k8s.io/ubuntu-slim:0.1
  resources:
    requests:
      cpu: 100m
      memory: 50Mi
  command: ["/bin/sh"]
  args:
  - "-c"
  - "sleep infinity"
Watch it roll out:
NAMESPACE   NAME                       READY   STATUS        RESTARTS   AGE
default     sleeper-7b5cb584cb-66z2z   1/1     Running       0          15s
default     sleeper-7b5cb584cb-w5b2b   1/1     Terminating   0          33s
default     sleeper-7b5cb584cb-xxn8h   1/1     Running       0          32s
default     sleeper-84949f4f97-68xx6   0/1     Terminating   0          3m15s
It eventually finishes:
NAMESPACE   NAME                       READY   STATUS    RESTARTS   AGE
default     sleeper-7b5cb584cb-66z2z   1/1     Running   0          53s
default     sleeper-7b5cb584cb-xxn8h   1/1     Running   0          70s
The VPA still thinks we have two containers:
recommendation:
  containerRecommendations:
  - containerName: sleeper-a
    lowerBound:
      cpu: 12m
      memory: 131072k
    target:
      cpu: 12m
      memory: 131072k
    uncappedTarget:
      cpu: 12m
      memory: 131072k
    upperBound:
      cpu: 4811m
      memory: "5029681818"
  - containerName: sleeper-b
    lowerBound:
      cpu: 12m
      memory: 131072k
    target:
      cpu: 12m
      memory: 131072k
    uncappedTarget:
      cpu: 12m
      memory: 131072k
    upperBound:
      cpu: 17291m
      memory: "18076954545"
And the default lowerBound CPU of 25m has been spread across the containers, with each of them getting 12m, even though only one of them (sleeper-b) actually exists anymore.
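Doing the same arithmetic with the pod minimums split evenly across two tracked aggregates (that's my assumption about the mechanism, not a quote of the recommender code) lines up with the values above:
$ echo "25 / 2" | bc
12
$ echo "250 * 1024 * 1024 / 2 / 1000" | bc
131072
which is what makes me suspect the stale sleeper-a aggregate is still being counted when the minimums get divided up.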
Anything else we need to know?:
- If you change the container name again, you get a 3rd one, a 4th one, etc., and the resources get smaller each time.
- The old container's stuff (recommendations, aggregates, checkpoints) doesn't ever seem to get cleaned up, and the recommender keeps maintaining the checkpoints (if you delete one, it comes back; see the kubectl commands after this list).
- I suspect this is due to a confluence of issues around our handling of containers:
  - Aggregates don't get pruned when containers are removed from a VPA's pods if the VPA and targetRef are otherwise intact.
  - Len() over the aggregates during a rollout in which a container name has been changed or removed will be wrong (there will legitimately be 2 tracked containers, but each pod only has one, so the resources should not be split).
  - Checkpoints don't get pruned when containers are removed from a VPA's pods if the VPA and targetRef are otherwise intact, and they can get loaded back in if the recommender restarts.
- A workaround (to prevent the "too small resources after the split" problem, at least) could be a minimum resource policy, e.g.:
  apiVersion: "autoscaling.k8s.io/v1"
  kind: VerticalPodAutoscaler
  metadata:
    name: vpa
  spec:
    targetRef:
      apiVersion: "apps/v1"
      kind: Deployment
      name: sleeper
    resourcePolicy:
      containerPolicies:
      - containerName: '*'
        minAllowed:
          cpu: 25m
          memory: 50Mi
        controlledResources: ["cpu", "memory"]