
Conversation

@gangavh1008

@gangavh1008 gangavh1008 commented Dec 8, 2025

feat: implement consolidationGracePeriod to prevent consolidation churn - issue 7146

Fixes #7146

Description
This PR introduces the consolidationGracePeriod feature to address excessive node churn caused by consolidation cycles. The feature makes nodes "invisible" to the consolidation process for a configurable duration after any pod event (add/remove), preventing both source and destination churn.

Problem

With the existing consolidateAfter mechanism, a problematic consolidation cycle emerges:

  • Karpenter consolidates Node_A, moving pods to Node_B
  • Node_B receives pods → lastPodEventTime updates → Node_B resets its consolidateAfter timer
  • Node_B becomes "unconsolidatable" while other stable nodes become targets
  • Cycle repeats with another stable node being consolidated
  • Result: Constant node churn, with nodes running only 5-10 minutes before disruption

The core issue is that receiving pods from consolidation makes a node unconsolidatable, creating a feedback loop where the destination of one consolidation becomes protected while source nodes become targets.

Solution
New field in the NodePool Disruption spec:

spec:
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 30s
    consolidationGracePeriod: 5m                    # NEW: Node invisibility duration

How it works:

When consolidationGracePeriod is configured:

  • Any pod event (add or remove) on a node updates lastPodEventTime
  • For the duration of consolidationGracePeriod after lastPodEventTime, the node is invisible to consolidation:
    • Cannot be a source (won't be disrupted)
    • Cannot be a destination (won't receive pods during consolidation simulation)
  • The timer resets on every pod event
  • After the grace period expires, normal consolidation rules apply
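
For illustration, the check described above could look roughly like the following Go sketch. The helper name matches the one added in this PR, but the exact signature, the NillableDuration handling, and reading lastPodEventTime from the NodeClaim status are assumptions here rather than the PR's actual code.

package disruption

import (
	"k8s.io/utils/clock"

	v1 "sigs.k8s.io/karpenter/pkg/apis/v1"
)

// IsWithinConsolidationGracePeriod reports whether a node is still "invisible"
// to consolidation, i.e. less than consolidationGracePeriod has elapsed since
// its last pod add/remove event.
func IsWithinConsolidationGracePeriod(clk clock.Clock, nodeClaim *v1.NodeClaim, nodePool *v1.NodePool) bool {
	gp := nodePool.Spec.Disruption.ConsolidationGracePeriod
	if gp == nil || gp.Duration == nil {
		return false // feature not configured: the node is always visible
	}
	lastPodEvent := nodeClaim.Status.LastPodEventTime.Time // updated on every pod event
	if lastPodEvent.IsZero() {
		return false
	}
	// The node stays invisible until lastPodEventTime + consolidationGracePeriod.
	return clk.Now().Before(lastPodEvent.Add(*gp.Duration))
}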

Changes

API:
Added ConsolidationGracePeriod field to Disruption struct in pkg/apis/v1/nodepool.go
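
For reference, a minimal sketch of what the field addition might look like; the neighboring fields are abbreviated and the type is assumed to mirror ConsolidateAfter's NillableDuration, so the actual PR may differ.

// Excerpt of the Disruption struct in pkg/apis/v1/nodepool.go (other existing
// fields such as ConsolidationPolicy and Budgets are elided).
type Disruption struct {
	// ConsolidateAfter is the existing wait after the last pod event before a
	// node may be considered consolidatable.
	ConsolidateAfter NillableDuration `json:"consolidateAfter"`

	// ConsolidationGracePeriod, when set, keeps a node invisible to
	// consolidation (as source or destination) for this duration after its
	// last pod event. Optional; omitting it preserves existing behavior.
	// +optional
	ConsolidationGracePeriod *NillableDuration `json:"consolidationGracePeriod,omitempty"`
}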

Disruption Logic:

  • pkg/controllers/disruption/helpers.go:
    • Added IsWithinConsolidationGracePeriod() helper function
    • Modified GetCandidates() to filter out nodes within the grace period (source filtering)
    • Modified SimulateScheduling() to exclude nodes within the grace period from destinations (see the sketch below)
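
As a sketch of what that filtering could look like, reusing the helper sketched above: the standalone function, the nodePoolFor lookup, and the use of samber/lo are illustrative simplifications, with the real PR wiring this logic into GetCandidates() and SimulateScheduling() instead.

package disruption

import (
	"github.com/samber/lo"
	"k8s.io/utils/clock"

	v1 "sigs.k8s.io/karpenter/pkg/apis/v1"
)

// filterOutGracePeriodNodes drops NodeClaims that are still inside their
// NodePool's consolidationGracePeriod, so they are neither offered as
// disruption sources nor used as scheduling destinations. nodePoolFor stands
// in for the nodePoolMap lookup passed through the controllers in this PR.
func filterOutGracePeriodNodes(clk clock.Clock, nodeClaims []*v1.NodeClaim, nodePoolFor func(*v1.NodeClaim) *v1.NodePool) []*v1.NodeClaim {
	return lo.Filter(nodeClaims, func(nc *v1.NodeClaim, _ int) bool {
		np := nodePoolFor(nc)
		// Unknown NodePool: leave the node visible and let existing logic decide.
		return np == nil || !IsWithinConsolidationGracePeriod(clk, nc, np)
	})
}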

Updated Controllers:

  • pkg/controllers/disruption/consolidation.go: Pass nodePoolMap and clock to simulation
  • pkg/controllers/disruption/validation.go: Pass nodePoolMap and clock to simulation
  • pkg/controllers/disruption/drift.go: Pass nodePoolMap and clock to simulation
  • pkg/controllers/disruption/controller.go: Updated NewMethods signature

CRDs:

  • Updated pkg/apis/crds/karpenter.sh_nodepools.yaml
  • Updated kwok/charts/crds/karpenter.sh_nodepools.yaml

Documentation:

  • designs/use-on-consolidation-after.md: Design document
  • designs/use-on-consolidation-after-analysis.md: Critical analysis
  • designs/consolidationGracePeriod-test-evidence.md: Test evidence with Karpenter logs

How was this change tested?

Unit Tests:

  • All existing tests pass (232 disruption tests, 47 nodeclaim disruption tests)
  • Verified compilation with go build ./...

EKS Integration Testing:

  • Deployed custom Karpenter image to EKS cluster

  • Tested with consolidationGracePeriod: 90s

  • Verified 3 scenarios:
    ✅ New node protected during grace period, visible after expiration
    ✅ Timer resets on each pod event (add/remove)
    ✅ Multiple operations with grace period protecting nodes during activity

Migration Path

  • No migration required: Feature is opt-in via new optional field

  • Existing NodePools: Continue to work exactly as before

  • New NodePools: Can opt-in by setting consolidationGracePeriod

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: gangavh1008
Once this PR has been reviewed and has the lgtm label, please assign tzneal for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Contributor

Welcome @gangavh1008!

It looks like this is your first PR to kubernetes-sigs/karpenter 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/karpenter has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Dec 8, 2025
@k8s-ci-robot
Contributor

Hi @gangavh1008. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Dec 8, 2025
        - WhenEmpty
        - WhenEmptyOrUnderutilized
        type: string
      useOnConsolidationAfter:

This name is a bit confusing, would you consider changing it?

Author

Sure, will change it. Thank you for the review.

Author

Changed the name to consolidationGracePeriod

          When replicas is set, UseOnConsolidationAfter is simply ignored
        pattern: ^(([0-9]+(s|m|h))+|Never)$
        type: string
      useOnConsolidationUtilizationThreshold:

From a prior discussion with the maintainers about my proposed way to fix this, there was a suggestion to focus on cost rather than utilization, since that's what Karpenter optimizes for, and adding a utilization gate would be at odds with the core consolidation logic.

For example, in AWS an m5.xlarge instance may be cheaper than an m8.xlarge, and I'm sure there are other cases where an instance type with more resources could end up being cheaper than a smaller one. There's also the scenario of reserved instances, or otherwise special rates on specific instance types, such that it could be preferable to run at lower utilization levels.

Author

Thank you for the insightful feedback. You raise a valid point that deserves careful consideration.
The Core Concern
You're correct that Karpenter's consolidation logic optimizes for cost, not utilization. The utilization threshold in this PR could conflict with cost optimization in scenarios like:

  • Instance type pricing inversions: m5.xlarge might be cheaper than m8.xlarge
  • Reserved/Savings Plans: Pre-purchased capacity should be used even at low utilization
  • Spot pricing variations: A larger spot instance might be cheaper than a smaller on-demand one

Proposed Solutions
Cost-Aware Protection
Instead of a utilization threshold, protect nodes that are already cost-optimal - meaning Karpenter's consolidation algorithm determined no cheaper alternative exists:

spec:
  disruption:
    useOnConsolidationAfter: 1h
    # Remove utilization threshold entirely
    # Protection applies when node is stable AND cost-optimal

I will change the feature implementation to this approach and submit a commit.

Author

I've removed the utilization threshold entirely. The feature is now simplified to:

spec:
  disruption:
    consolidateAfter: 30s
    consolidationGracePeriod: 1h  # Simple grace period, no utilization check

The protection logic is now:

  • Node becomes consolidatable (stable for consolidateAfter)
  • Consolidation evaluates the node using its normal cost-based algorithm
  • If consolidation finds a cheaper option → CONSOLIDATE ✅
  • If no cheaper option → Grace period applied (prevents re-evaluation for consolidationGracePeriod)

The feature doesn't try to be smarter than Karpenter's consolidation algorithm. It simply provides a cooldown to prevent churn from repeated re-evaluation.
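
For concreteness, a rough Go sketch of that flow; findCheaperOption and startGracePeriod are hypothetical hooks standing in for Karpenter's cost-based evaluation and for whatever bookkeeping records the cooldown, so this is not the PR's literal code.

package disruption

import (
	"time"

	"k8s.io/utils/clock"
)

// evaluateWithCooldown runs the normal cost-based consolidation check first,
// and only when no cheaper option exists does it start the grace period so the
// node is not re-evaluated until consolidationGracePeriod elapses.
func evaluateWithCooldown(
	clk clock.Clock,
	nodeName string,
	findCheaperOption func(string) (bool, error), // hypothetical: cost-based evaluation
	startGracePeriod func(string, time.Time), // hypothetical: records cooldown start
) (bool, error) {
	cheaper, err := findCheaperOption(nodeName)
	if err != nil {
		return false, err
	}
	if cheaper {
		return true, nil // cheaper replacement found: consolidate as usual
	}
	startGracePeriod(nodeName, clk.Now()) // no cheaper option: cool down re-evaluation
	return false, nil
}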


Author

@ellistarn, please review at your convenience. Thank you.

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Dec 10, 2025
@ellistarn
Contributor

Hey @gangavh1008. I'd suggest coming to alignment with the maintainers in an issue before moving on to implementation.

@gangavh1008
Author

Hey @gangavh1008. I'd suggest coming to alignment with the maintainers in an issue before moving on to implementation.

Hi @ellistarn, I agree. I went ahead with the implementation, prioritizing internal requirements.

Requesting maintainers to share feedback; happy to incorporate the changes.

@ellistarn
Contributor

For context, we're trying to think more broadly about this consolidation space, and a bunch of key stakeholders are about to head out for the holidays. We want to do better here and agree this is a problem -- I am not sure this is the right approach in the specifics.

@gangavh1008
Author

For context, we're trying to think more broadly about this consolidation space, and a bunch of key stakeholders are about to head out for the holidays. We want to do better here and agree this is a problem -- I am not sure this is the right approach in the specifics.

@ellistarn, sure. Thank you for going through the approach. As you suggested, I'll put together a Google doc in the issue to build consensus on the design with the maintainers.
