fix: prevent DaemonSet pods from circling in Failed/Pending during disruption #2729
If I understand correctly, this is the core issue, right? The issue isn't that daemonsets are being evicted; it's that some daemonset pods are being recreated and entering a pending / failed state. What's not clear to me is why they're being recreated: the daemonset controller should only create a pod for a node if it tolerates the taints. If it does tolerate the taint, Karpenter shouldn't have disrupted the pod in the first place due to the existing […]

I think there are probably use-cases where we want to drain daemonsets. Some daemonsets perform resource cleanup which may not be possible once the node is terminating. An example that comes to mind is the EBS CSI driver, which cleans up VolumeAttachment objects during termination. For this reason I don't think we'd want to exclude daemonsets from the drain process altogether, but we should identify the root cause for these pods being recreated.
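The toleration argument in the comment above can be illustrated with a small standalone sketch. It is not Karpenter or Kubernetes code: `Taint`, `Toleration`, and `ToleratesTaint` are simplified, hypothetical stand-ins for the core/v1 types and matching logic, and the taint value is assumed to be empty. Only the taint key and effect (`karpenter.sh/disrupted:NoSchedule`) come from this PR.

```go
package main

import "fmt"

// Taint is a simplified, hypothetical stand-in for the Kubernetes core/v1 Taint.
type Taint struct {
	Key    string
	Value  string
	Effect string
}

// Toleration is a simplified, hypothetical stand-in for the core/v1 Toleration.
type Toleration struct {
	Key      string // empty key with Operator "Exists" matches every key
	Operator string // "Exists" or "Equal"; empty is treated as "Equal"
	Value    string
	Effect   string // empty effect matches every effect
}

// matches reports whether one toleration covers the given taint.
func (t Toleration) matches(taint Taint) bool {
	if t.Effect != "" && t.Effect != taint.Effect {
		return false
	}
	if t.Key != "" && t.Key != taint.Key {
		return false
	}
	switch t.Operator {
	case "Exists":
		return true
	case "", "Equal":
		return t.Value == taint.Value
	default:
		return false
	}
}

// ToleratesTaint reports whether any toleration in the list covers the taint.
func ToleratesTaint(tolerations []Toleration, taint Taint) bool {
	for _, t := range tolerations {
		if t.matches(taint) {
			return true
		}
	}
	return false
}

func main() {
	// Key and effect are taken from the PR description; the empty value is an assumption.
	disrupted := Taint{Key: "karpenter.sh/disrupted", Effect: "NoSchedule"}

	// Hypothetical pod specs: one with a catch-all toleration, one with none.
	catchAll := []Toleration{{Operator: "Exists"}}
	var none []Toleration

	fmt.Println("catch-all tolerates:", ToleratesTaint(catchAll, disrupted))
	fmt.Println("no tolerations tolerates:", ToleratesTaint(none, disrupted))
}
```

Under these assumptions, a pod that tolerates the disruption taint would still be schedulable on the node, while a pod that does not tolerate it should not be recreated there by a controller that checks tolerations first. That is the gap the comment asks the author to explain.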
Fixes #2009
Description
This PR fixes an issue where DaemonSet pods (such as `aws-node` and `kube-proxy`) enter a Failed/Pending cycle during node disruption (consolidation/termination).

Root Cause:
When Karpenter marks a node for disruption, it applies the `karpenter.sh/disrupted:NoSchedule` taint to prevent new pods from scheduling. However, DaemonSet pods were being evicted during this process. When the DaemonSet controller attempts to recreate these pods, the NoSchedule taint prevents them from being scheduled back to the node, causing them to remain in Pending state. This creates a continuous cycle of pod failures and rescheduling attempts until the node is fully terminated.

Solution:
Modified the `IsEvictable()` and `IsDrainable()` functions in `pkg/utils/pod/scheduling.go` to explicitly exclude DaemonSet pods from eviction during node disruption. This approach aligns with `kubectl drain` behavior, where DaemonSet pods remain running until the node is actually terminated, allowing the DaemonSet controller to naturally manage pod recreation on other available nodes.

Changes:
- Added `!IsOwnedByDaemonSet(pod)` check to `IsEvictable()` function
- Added `!IsOwnedByDaemonSet(pod)` check to `IsDrainable()` function

How was this change tested?
Unit Tests: Added comprehensive test cases in both disruption and termination controllers:
- `should not evict daemonset pods during node disruption` - Verifies DaemonSet pods remain running when disruption taint is applied
- `should consider candidates with only daemonset pods` - Verifies nodes with only DaemonSet pods can be disrupted
- `should consider candidates that have fully blocking PDBs on daemonset pods` - Verifies PDBs don't block disruption when only DaemonSet pods are present

Test Execution: All existing tests pass with these changes.

Race Detection: Tests executed with `-race` flag to ensure no data races introduced

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
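For readers without the diff at hand, the change listed under "Changes" can be sketched roughly as follows. This is a standalone illustration, not the code in `pkg/utils/pod/scheduling.go`: the `Pod` and `OwnerReference` types are simplified stand-ins, `isTerminal` is a hypothetical placeholder for whatever other conditions the real `IsEvictable()` checks, and only the function names `IsEvictable` and `IsOwnedByDaemonSet` are taken from this description.

```go
package main

import "fmt"

// OwnerReference is a simplified stand-in for the Kubernetes metav1.OwnerReference.
type OwnerReference struct {
	Kind string
	Name string
}

// Pod is a simplified stand-in for the Kubernetes core/v1 Pod.
type Pod struct {
	Name   string
	Phase  string // e.g. "Running", "Pending", "Succeeded", "Failed"
	Owners []OwnerReference
}

// IsOwnedByDaemonSet reports whether any owner reference points at a DaemonSet.
func IsOwnedByDaemonSet(p Pod) bool {
	for _, o := range p.Owners {
		if o.Kind == "DaemonSet" {
			return true
		}
	}
	return false
}

// isTerminal is a hypothetical placeholder for the pre-existing conditions
// in the real function, which are not shown in this PR description.
func isTerminal(p Pod) bool {
	return p.Phase == "Succeeded" || p.Phase == "Failed"
}

// IsEvictable sketches the shape of the change: the existing conditions
// stay in place and DaemonSet-owned pods are additionally excluded.
func IsEvictable(p Pod) bool {
	return !isTerminal(p) && !IsOwnedByDaemonSet(p)
}

func main() {
	// Hypothetical pods for illustration only.
	pods := []Pod{
		{Name: "aws-node-abc", Phase: "Running", Owners: []OwnerReference{{Kind: "DaemonSet", Name: "aws-node"}}},
		{Name: "web-1", Phase: "Running", Owners: []OwnerReference{{Kind: "ReplicaSet", Name: "web"}}},
	}
	for _, p := range pods {
		fmt.Printf("%s evictable=%t\n", p.Name, IsEvictable(p))
	}
}
```

With a check of this shape, DaemonSet-owned pods are never selected for eviction during drain, which is also why the review comment about cleanup-oriented DaemonSets (such as the EBS CSI driver) applies to this approach.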