Description
I'm running into an issue where Karpenter wants to disrupt a node that has a StatefulSet pod running on it. Karpenter then terminates all the non-DaemonSet pods on that node. However, when the pod is rescheduled to the new node, it cannot start because its volume is still attached to the old node, and Karpenter is not able to terminate the old node.
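For context, this is roughly how I look at what Karpenter has done to the node it is trying to remove (<old-node> is a placeholder); it just prints the cordon flag and any taints:

# Inspect the disruption state on the node being drained (placeholder node name)
$ kubectl get node <old-node> -o jsonpath='{.spec.unschedulable}{"\n"}{.spec.taints}{"\n"}'

Describing the stuck pod: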
$ kubectl describe pod
Status: Terminating (lasts 3h5m)
...
Events:
  Type    Reason     Age                   From       Message
  ----    ------     ----                  ----       -------
  Normal  Nominated  6m1s (x79 over 164m)  karpenter  Pod should schedule on: nodeclaim/default-on-demand-p27q8, node/ip-10-221-64-33.ec2.internal
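For reference, this is roughly how I got from the pod's PVC to the VolumeAttachment name used below (the PVC name is a placeholder):

# Resolve the PVC to its PV, then find the VolumeAttachment for that PV
$ PV=$(kubectl get pvc <pvc-name> -o jsonpath='{.spec.volumeName}')
$ kubectl get volumeattachment -o custom-columns=NAME:.metadata.name,PV:.spec.source.persistentVolumeName,NODE:.spec.nodeName,ATTACHED:.status.attached | grep "$PV"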
Looking up the VolumeAttachment to see which node it is attached to:
$ kubectl describe volumeattachment csi-c72d43bef46cd68c80357ffa7c5e647f8351bd0b01b2b747cb11f5d702745f7f
Name:          csi-c72d43bef46cd68c80357ffa7c5e647f8351bd0b01b2b747cb11f5d702745f7f
Namespace:
Labels:        <none>
Annotations:   csi.alpha.kubernetes.io/node-id: i-048a0a6c6c9e79dd2
API Version:   storage.k8s.io/v1
Kind:          VolumeAttachment
Metadata:
  Creation Timestamp:  2025-01-24T03:25:51Z
  Finalizers:
    external-attacher/ebs-csi-aws-com
  Resource Version:  913618302
  UID:               26ce1744-6c4d-440b-a54b-aa4e9e02eb5c
Spec:
  Attacher:   ebs.csi.aws.com
  Node Name:  ip-10-221-66-172.ec2.internal
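To compare, this is roughly how I check the two node objects side by side (the first name is from the Nominated event, the second from the attachment above); .spec.providerID also lets me match the instance ID from the annotation:

$ kubectl get node ip-10-221-64-33.ec2.internal ip-10-221-66-172.ec2.internal -o custom-columns=NAME:.metadata.name,PROVIDER-ID:.spec.providerID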
You can see that this is a different node than the one the pod is scheduled to. Looking at the EBS CSI driver's attacher logs, I don't see any mention of that attachment:
$ kubectl logs -n system-storage ebs-csi-driver-controller-659467997f-5rw4s -c csi-attacher | grep csi-c72d43bef46cd68c80357ffa7c5e647f8351bd0b01b2b747cb11f5d702745f7f
<empty> (I confirmed this replica was the leader.)
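(For what it's worth, this is roughly how I checked which replica holds the attacher lease; the exact lease name depends on the driver/chart version, so I just grep for it:)

$ kubectl get lease -n system-storage -o custom-columns=NAME:.metadata.name,HOLDER:.spec.holderIdentity | grep -i attacher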
Once I run

$ kubectl delete volumeattachment csi-c72d43bef46cd68c80357ffa7c5e647f8351bd0b01b2b747cb11f5d702745f7f

the pod that was stuck terminating finally goes away and the pod comes up on the new node.
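In case it's useful for reproducing, this is roughly the check I run before deleting anything, just to list the attachments still pointing at the old node and their attach status:

# List VolumeAttachments still referencing the old node
$ kubectl get volumeattachment -o jsonpath='{range .items[?(@.spec.nodeName=="ip-10-221-66-172.ec2.internal")]}{.metadata.name}{"\tattached="}{.status.attached}{"\n"}{end}'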
What could be causing this? I would expect the EBS CSI attacher to detach the volume at some point.