
Partitioned NodePool Multi-node Consolidation #853

Closed as not planned
@jashandeep-sohi

Description


Observed Behavior:

I have a few NodePools that I'm using in a "partitioned" manner. Each NodePool is made independent using user-defined requirements and taints, and Pods in different namespaces use different NodePools.

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: one
spec:
  # In v1beta1, taints and requirements are nested under spec.template.spec.
  template:
    spec:
      taints:
        - key: example.com/partition-key
          value: one
          effect: NoSchedule
      requirements:
        - key: example.com/partition-key
          operator: In
          values: ["one"]
---
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: two
spec:
  template:
    spec:
      taints:
        - key: example.com/partition-key
          value: two
          effect: NoSchedule
      requirements:
        - key: example.com/partition-key
          operator: In
          values: ["two"]

This works fine for the most part, but I'm observing issues with multi-node consolidation.

As far as I can tell, multi-node consolidation looks at all deprovisionable Nodes together:
https://github.com/kubernetes-sigs/karpenter/blob/cc54b340f630b46a26d19a3cbd49d90c8b3a6d45/pkg/controllers/disruption/multinodeconsolidation.go#L44C42-L44C42

I think this means that multi-node consolidation either does not happen at all or is sub-optimal at best. Shouldn't this be done on groups of compatible NodePools independently?

Another place where I think this is a problem is the scheduling simulation, which looks at all Pending Pods:
https://github.com/kubernetes-sigs/karpenter/blob/cc54b340f630b46a26d19a3cbd49d90c8b3a6d45/pkg/controllers/disruption/helpers.go#L97C35-L97C35

But if one of those Pending Pods is not compatible with the firstN candidates chosen from all Nodes, then the simulation will always complain about unschedulable Pods (highly likely as the number of nodes/partitions increases).
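
To make the concern concrete, here is a toy model of the behavior described above. It is not Karpenter's scheduler: the `Pod` and `Candidate` types, `simulate`, and `compatible_pending` are hypothetical names invented for illustration, and capacity is ignored entirely. It only shows how one Pending Pod from another partition can fail a simulation, and how filtering Pending Pods by partition first would avoid that.

```python
from dataclasses import dataclass

# Partition key taken from the NodePools above.
PARTITION_KEY = "example.com/partition-key"


@dataclass(frozen=True)
class Pod:
    name: str
    partition: str  # partition value the Pod tolerates and selects (assumed)


@dataclass(frozen=True)
class Candidate:
    name: str
    partition: str  # value of PARTITION_KEY on the candidate Node


def simulate(candidates, pending):
    """Toy simulation: return names of Pods that no candidate partition can host."""
    partitions = {c.partition for c in candidates}
    return [p.name for p in pending if p.partition not in partitions]


def compatible_pending(candidates, pending):
    """Keep only Pending Pods whose partition matches some candidate."""
    partitions = {c.partition for c in candidates}
    return [p for p in pending if p.partition in partitions]


candidates = [Candidate("node-a", "one"), Candidate("node-b", "one")]
pending = [Pod("web", "one"), Pod("batch", "two")]

# All Pending Pods considered: the Pod from partition "two" is reported.
all_result = simulate(candidates, pending)

# Pending Pods filtered by partition first: nothing is reported.
filtered_result = simulate(candidates, compatible_pending(candidates, pending))
```

In this model, `all_result` contains the `batch` Pod while `filtered_result` is empty, which mirrors the situation where an unrelated Pending Pod blocks consolidation of an otherwise valid candidate set.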

Expected Behavior:

NodePools should be consolidated in groups computed based on their requirements or based on some configurable partition key.
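
As a rough sketch of what grouping by "some configurable partition key" could look like, the snippet below buckets Nodes by the value of a label before any consolidation decision is made. The node dictionaries and the `group_by_partition` helper are hypothetical and are not part of Karpenter; the idea is only that each group would then be consolidated independently.

```python
from collections import defaultdict

# Partition key taken from the NodePools above; assumed to be configurable.
PARTITION_KEY = "example.com/partition-key"


def group_by_partition(nodes, key=PARTITION_KEY):
    """Bucket node names by the value of the partition label.

    Nodes without the label land in the "" group, so unpartitioned
    Nodes are still consolidated together.
    """
    groups = defaultdict(list)
    for node in nodes:
        groups[node["labels"].get(key, "")].append(node["name"])
    return dict(groups)


nodes = [
    {"name": "node-a", "labels": {PARTITION_KEY: "one"}},
    {"name": "node-b", "labels": {PARTITION_KEY: "two"}},
    {"name": "node-c", "labels": {PARTITION_KEY: "one"}},
    {"name": "node-d", "labels": {}},
]

groups = group_by_partition(nodes)
for partition, names in sorted(groups.items()):
    # Each group would be handed to multi-node consolidation on its own.
    print(repr(partition), names)
```

Grouping by a single label is the simplest option; computing groups from full requirement compatibility, as the first alternative above suggests, would need a more involved comparison.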

Reproduction Steps (Please include YAML): See above

Versions:

  • Chart Version: v0.33.0
  • Kubernetes Version (kubectl version): 1.27
  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Labels: consolidation, deprovisioning (Issues related to node deprovisioning), kind/feature (Categorizes issue or PR as related to a new feature), lifecycle/rotten (Denotes an issue or PR that has aged beyond stale and will be auto-closed), performance (Issues relating to performance: memory usage, CPU usage, timing), v1.x (Issues prioritized for post-1.0)
