CSI driver does not take into account ENIs dynamically provisioned by the VPC CNI for allocatable count calculation.

/kind bug

**What happened?**

The `allocatableCount` (`spec.drivers[].allocatable.count`) reported by CSINode on EKS clusters does not reflect the actual number of ENIs attached to the node. The calculation only considers the ENIs attached at node bootstrap and ignores ENIs that the VPC CNI attaches later. As a result, `allocatableCount` stays static and does not update as the VPC CNI attaches more ENIs to accommodate new workloads.


**What you expected to happen?**

The `allocatableCount` should dynamically take into account the actual number of ENIs attached to the node, including both the ENIs attached at bootstrap and those dynamically provisioned by the VPC CNI. This would give an accurate `allocatable.count` on the CSINode.

The CSINode object for one of the nodes currently looks like this:

```
apiVersion: storage.k8s.io/v1
kind: CSINode
metadata:
  annotations:
    storage.alpha.kubernetes.io/migrated-plugins: kubernetes.io/aws-ebs,kubernetes.io/azure-disk,kubernetes.io/azure-file,kubernetes.io/cinder,kubernetes.io/gce-pd,kubernetes.io/vsphere-volume
  creationTimestamp: "2024-10-21T09:02:47Z"
  name: xxxxx
  ownerReferences:
    - apiVersion: v1
      kind: Node
      name: xxxxx
      uid: 2da9c59d-1ac2-42bd-9e06-4ec01127153e
spec:
  drivers:
    - allocatable:
        count: 25
      name: ebs.csi.aws.com
      nodeID: i-xxxx
      topologyKeys:
        - kubernetes.io/os
        - topology.ebs.csi.aws.com/zone
        - topology.kubernetes.io/zone
```
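
To make the expectation concrete, here is a minimal arithmetic sketch of the gap between the value computed at bootstrap and the value that would reflect the current ENI count. It is not the driver's code: the function name, the limit of 28, and the ENI and volume counts are all illustrative assumptions, and the sketch assumes ENIs and EBS volumes share one attachment limit on the instance.

```python
# Illustrative sketch only; this is NOT the EBS CSI driver's implementation.
# ATTACHMENT_LIMIT is an assumed per-instance limit shared by ENIs and EBS volumes.
ATTACHMENT_LIMIT = 28


def allocatable_count(attachment_limit: int, attached_enis: int, other_volumes: int) -> int:
    """Return the attachment slots left for CSI-managed volumes (never negative)."""
    return max(attachment_limit - attached_enis - other_volumes, 0)


# Computed once at bootstrap: one ENI and the root volume are attached (assumed numbers).
at_bootstrap = allocatable_count(ATTACHMENT_LIMIT, attached_enis=1, other_volumes=1)

# What the value would be if it tracked three additional ENIs attached by the VPC CNI.
after_scaling = allocatable_count(ATTACHMENT_LIMIT, attached_enis=4, other_volumes=1)

print(f"reported at bootstrap: {at_bootstrap}, actually available: {after_scaling}")
```

With these assumed numbers the node keeps advertising more slots than it has, and the difference equals the number of ENIs attached after bootstrap.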


**How to reproduce it (as minimally and precisely as possible)?**

1.  Create an EKS cluster with nodes of a single instance type (e.g., r6 instances).
2.  Observe the initial `allocatableCount` reported on the CSINode resource. This value is based on the instance's maximum attachment limit minus the ENIs and EBS volumes attached at bootstrap.
3.  Deploy workloads that require the VPC CNI to attach additional ENIs to the nodes.
4.  Observe that the `allocatableCount` remains static even though the actual number of attached ENIs has increased.
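
The comparison in steps 2 and 4 can be scripted. The sketch below is one possible way to do it, assuming the JSON comes from `kubectl get csinode <node-name> -o json` and `aws ec2 describe-network-interfaces --filters Name=attachment.instance-id,Values=<instance-id>`. The helper names and the sample payloads are hypothetical and trimmed to the fields that are read.

```python
import json
from typing import Optional


def ebs_allocatable_count(csinode_json: str, driver: str = "ebs.csi.aws.com") -> Optional[int]:
    """Read allocatable.count for the given driver from a CSINode JSON document."""
    csinode = json.loads(csinode_json)
    for entry in csinode.get("spec", {}).get("drivers", []):
        if entry.get("name") == driver:
            return entry.get("allocatable", {}).get("count")
    return None  # driver not registered on this node


def attached_eni_count(describe_json: str) -> int:
    """Count the ENIs listed in describe-network-interfaces JSON output."""
    return len(json.loads(describe_json).get("NetworkInterfaces", []))


# Hypothetical sample payloads, reduced to the fields used above.
sample_csinode = json.dumps({
    "spec": {"drivers": [{"name": "ebs.csi.aws.com", "allocatable": {"count": 25}}]}
})
sample_enis = json.dumps({
    "NetworkInterfaces": [{"NetworkInterfaceId": "eni-a"}, {"NetworkInterfaceId": "eni-b"}]
})

print(ebs_allocatable_count(sample_csinode), attached_eni_count(sample_enis))
```

Capturing both values before step 3 and again after it should show the ENI count rising while the allocatable count stays the same.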


**Anything else we need to know?**:

This issue leads to inaccurate resource reporting and makes workloads harder to manage. Affected workloads report errors such as the following:

```
  Warning  FailedAttachVolume  80s (x12 over 56m)  attachdetach-controller  (combined from similar events): AttachVolume.Attach failed for volume "pvc-0c2a501a-bb06-4c9b-95aa-4cda4fb6aac2" : rpc error: code = Internal desc = Could not attach volume "vol-07b6e18e94978a87f" to node "i-0240e6b849f452539": WaitForAttachmentState AttachVolume error, expected device but be attached but was attaching, volumeID="vol-07b6e18e94978a87f", instanceID="i-0240e6b849f452539", Device="/dev/xvdam", err=operation error EC2: AttachVolume, https response error StatusCode: 400, RequestID: 7edb421b-9dc2-4001-af1e-73d628fabfb5, api error VolumeInUse: vol-07b6e18e94978a87f is already attached to an instance
```


**Environment**
- Kubernetes version (use `kubectl version`):
  - Client Version: v1.31.2
  - Kustomize Version: v5.4.2
  - Server Version: v1.29.10-eks-7f9249a
- Driver version: Amazon EBS CSI Driver v1.34.0-eksbuild.1


CSI driver does not take into account Dynamically provisioned VPC CNIs for allocatable count calculation. #2249
