Description
When I was debugging resource requests for #1219, I kept running into an issue where I had to flip the node selector between `base` and `beefy` depending on the amount of CPU and memory I was requesting. This is because of the `spack.io/node-pool` nodeSelector, which specifies which Karpenter NodePool a pod should be scheduled on. While this is useful for scheduling job pods based on microarchitecture, it just makes things confusing for pods that don't have that requirement (i.e. pods in the `base`, `beefy`, and `gitlab` NodePools). Its usage for those pods is an artifact from before we switched to Karpenter.
Removing this nodeSelector will not only make the infrastructure less complex, it will also likely save some money on AWS. For non-gitlab/runner pods, for example, Karpenter will have more flexibility than a binary choice between a "base" and a "beefy" node: it will be able to select an EC2 instance that minimally satisfies the pod's resource requests, without being limited by what the nodeSelector is set to.
Steps to do this:
- Ensure all affected pods in the cluster have resource requests
- Remove the `spack.io/node-pool` nodeSelector from those pods
- Remove the `node.kubernetes.io/instance-type` requirement from the `base` NodePool
  - We want to allow Karpenter to select the instance type it believes is best without limiting it
- Remove the `beefy` NodePool (and maybe rename `base` to `default` or something?)
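
The first two steps could be sketched roughly as follows. This is only an illustration operating on pod specs loaded as plain Python dicts (e.g. from parsed manifests), not the repo's actual tooling. The helper names `containers_missing_requests` and `strip_node_pool_selector` and the example spec are hypothetical.

```python
from copy import deepcopy

# The nodeSelector key discussed in this issue.
NODE_POOL_KEY = "spack.io/node-pool"


def containers_missing_requests(pod_spec):
    """Return names of containers that lack a CPU or memory request."""
    missing = []
    for container in pod_spec.get("containers", []):
        requests = (container.get("resources") or {}).get("requests") or {}
        if "cpu" not in requests or "memory" not in requests:
            missing.append(container.get("name", "<unnamed>"))
    return missing


def strip_node_pool_selector(pod_spec):
    """Return a copy of pod_spec without the spack.io/node-pool nodeSelector."""
    spec = deepcopy(pod_spec)
    selector = spec.get("nodeSelector") or {}
    selector.pop(NODE_POOL_KEY, None)
    if selector:
        # Other selectors (if any) are left untouched.
        spec["nodeSelector"] = selector
    else:
        # Drop the field entirely once it is empty.
        spec.pop("nodeSelector", None)
    return spec


# Hypothetical pod spec, for illustration only.
example_spec = {
    "nodeSelector": {"spack.io/node-pool": "beefy"},
    "containers": [
        {"name": "web", "resources": {"requests": {"cpu": "500m", "memory": "1Gi"}}},
        {"name": "sidecar"},  # no resource requests set
    ],
}

print(containers_missing_requests(example_spec))  # ['sidecar']
print("nodeSelector" in strip_node_pool_selector(example_spec))  # False
```

Checking for missing requests first matters because, once the nodeSelector is gone, the resource requests are the main signal Karpenter has for sizing a node for these pods.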