Karpenter continues to scale when pod unschedulable as over max volumes of a node #1857

Open
@andrewhibbert

Description

Observed Behavior:

  • Create a StatefulSet (STS) with 28 PVCs of 1GB each
  • Describe the pod:
Warning  FailedScheduling  14m (x3 over 14m)      default-scheduler  0/24 nodes are available: 1 node(s) had untolerated taint {ebs.csi.aws.com/agent-not-ready: }, 2 node(s) exceed max volume count, 21 node(s) had untolerated taint {eks.amazonaws.com/compute-type: fargate}. preemption: 0/24 nodes are available: 2 No preemption victims found for incoming pod, 22 Preemption is not helpful for scheduling.
  • Karpenter keeps creating new nodes until the NodePool limit is reached, and the pod stays unschedulable:
  Warning  FailedScheduling  45s (x2 over 55s)      default-scheduler  0/27 nodes are available: 21 node(s) had untolerated taint {eks.amazonaws.com/compute-type: fargate}, 6 node(s) exceed max volume count. preemption: 0/27 nodes are available: 21 Preemption is not helpful for scheduling, 6 No preemption victims found for incoming pod.
  • Karpenter logs:
karpenter-7859df8c6f-27kpp controller {"level":"INFO","time":"2024-12-02T13:35:47.623Z","logger":"controller","message":"found provisionable pod(s)","commit":"0f8788c","controller":"provisioner","namespace":"","name":"","reconcileID":"ee72cf7b-be6b-4f8d-a894-4118743651c8","Pods":"default/my-statefulset-0","duration":"129.510294ms"}
karpenter-7859df8c6f-27kpp controller {"level":"INFO","time":"2024-12-02T13:36:55.698Z","logger":"controller","message":"found provisionable pod(s)","commit":"0f8788c","controller":"provisioner","namespace":"","name":"","reconcileID":"e0fea200-4036-4027-a70a-bc65e85b3fac","Pods":"default/my-statefulset-0","duration":"25.925167ms"}
karpenter-7859df8c6f-27kpp controller {"level":"INFO","time":"2024-12-02T13:37:57.529Z","logger":"controller","message":"found provisionable pod(s)","commit":"0f8788c","controller":"provisioner","namespace":"","name":"","reconcileID":"21826f4e-a53e-43c2-8f72-7b230d018987","Pods":"default/my-statefulset-0","duration":"26.505521ms"}
karpenter-7859df8c6f-27kpp controller {"level":"INFO","time":"2024-12-02T13:39:01.361Z","logger":"controller","message":"found provisionable pod(s)","commit":"0f8788c","controller":"provisioner","namespace":"","name":"","reconcileID":"de302962-ccf3-41d6-aa03-799bb11bddee","Pods":"kube-system/efs-csi-controller-7b656fc768-thrcs, kube-system/efs-csi-controller-7b656fc768-8xbg9, default/my-statefulset-0","duration":"26.919137ms"}
karpenter-7859df8c6f-27kpp controller {"level":"INFO","time":"2024-12-02T13:40:01.365Z","logger":"controller","message":"found provisionable pod(s)","commit":"0f8788c","controller":"provisioner","namespace":"","name":"","reconcileID":"2aeeb11a-3d6c-421f-bd09-eaa3d750131f","Pods":"default/my-statefulset-0","duration":"25.630818ms"}
karpenter-7859df8c6f-27kpp controller {"level":"ERROR","time":"2024-12-02T13:41:07.542Z","logger":"controller","message":"could not schedule pod","commit":"0f8788c","controller":"provisioner","namespace":"","name":"","reconcileID":"b2b7338d-71cf-40bf-801d-2f5eaf21bd62","Pod":{"name":"my-statefulset-0","namespace":"default"},"error":"all available instance types exceed limits for nodepool: \"test\""}
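
For anyone trying to reproduce this, a StatefulSet of the shape described in the first step could be generated with something like the sketch below. This is not the reporter's manifest. Only the name my-statefulset comes from the logs above; the claim names (data-0 to data-27), the busybox image, the mount paths, the 1Gi size string, and the reliance on a default EBS-backed StorageClass are all assumptions.

```python
import json


def build_statefulset(name="my-statefulset", volume_count=28, size="1Gi"):
    """Return a StatefulSet manifest (as a dict) with one PVC template per volume."""
    # Hypothetical claim names: data-0 .. data-27
    claim_names = [f"data-{i}" for i in range(volume_count)]
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,
            "replicas": 1,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {
                    "containers": [
                        {
                            "name": "app",
                            "image": "busybox",  # placeholder image (assumption)
                            "command": ["sleep", "3600"],
                            # Mount every claim so all volumes must attach to one node
                            "volumeMounts": [
                                {"name": c, "mountPath": f"/mnt/{c}"}
                                for c in claim_names
                            ],
                        }
                    ],
                },
            },
            # No storageClassName is set, so the cluster default is used (assumption)
            "volumeClaimTemplates": [
                {
                    "metadata": {"name": c},
                    "spec": {
                        "accessModes": ["ReadWriteOnce"],
                        "resources": {"requests": {"storage": size}},
                    },
                }
                for c in claim_names
            ],
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML
    print(json.dumps(build_statefulset(), indent=2))
```

The output can be piped to kubectl apply -f - (the script file name is up to you). With a single replica, all 28 volumes have to attach to the same node, which is the condition the scheduler events above complain about.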

Expected Behavior:

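Karpenter should take a node's maximum volume count into account when provisioning, instead of repeatedly launching nodes that the pod still cannot be scheduled on.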
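
To illustrate the kind of check being asked for, here is a minimal sketch. It is not Karpenter's actual scheduling code. The InstanceType class, the max_attachable_volumes field, the instance names, and the limits 25 and 40 are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class InstanceType:
    name: str
    max_attachable_volumes: int  # hypothetical per-node volume attach limit


def can_fit_volumes(pod_volume_count, instance_type, volumes_in_use=0):
    """True if the pod's volumes fit within the node's attach limit."""
    return volumes_in_use + pod_volume_count <= instance_type.max_attachable_volumes


def feasible_instance_types(pod_volume_count, instance_types):
    """Keep only instance types that could attach all of the pod's volumes."""
    return [it for it in instance_types if can_fit_volumes(pod_volume_count, it)]


if __name__ == "__main__":
    # Illustrative limits only, not real instance data
    candidates = [InstanceType("small", 25), InstanceType("large", 40)]
    fits = feasible_instance_types(28, candidates)
    if fits:
        print("candidates:", [it.name for it in fits])
    else:
        print("no instance type can attach 28 volumes; do not launch a node")
```

With a filter like this, a pod whose volume count exceeds every candidate's limit would be reported as unschedulable up front, rather than triggering node launches until the NodePool limit is hit.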

Reproduction Steps (Please include YAML):

Versions:

  • Chart Version:
  • Kubernetes Version (kubectl version):
  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Labels

kind/bug: Categorizes issue or PR as related to a bug.
lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.
triage/needs-information: Indicates an issue needs more information in order to work on it.
