From 2c072f846f544bc4fb10ed6e33bf4b4d87061891 Mon Sep 17 00:00:00 2001 From: Bryant Biggs Date: Sat, 13 Jul 2024 09:34:52 -0400 Subject: [PATCH 01/15] feat: Get Neuron device and core count from EC2 API for `trn*` and `inf*` instance types --- designs/limits.md | 6 +- examples/workloads/neuron.yaml | 4 +- hack/code/instancetype_testdata_gen/main.go | 24 ++-- hack/codegen.sh | 2 +- pkg/apis/v1/labels.go | 4 + pkg/fake/ec2api.go | 8 +- .../zz_generated.describe_instance_types.go | 106 +++++++++------ pkg/providers/instance/instance.go | 2 + pkg/providers/instancetype/suite_test.go | 102 +++++++++----- pkg/providers/instancetype/types.go | 51 ++++--- .../integration/extended_resources_test.go | 125 ++++++++++++++++++ test/suites/scheduling/suite_test.go | 3 +- .../content/en/preview/concepts/scheduling.md | 2 + 13 files changed, 323 insertions(+), 116 deletions(-) diff --git a/designs/limits.md b/designs/limits.md index d29cf9ef19f0..8972ad5e17bc 100644 --- a/designs/limits.md +++ b/designs/limits.md @@ -12,14 +12,12 @@ The next large problem is the inability to define a hard ceiling on cluster cost We need to provide similar functionality via Karpenter as well wherein there's a hard limit a customer can configure. - ## Current State To address the runaway-scaling problem the current fix in place is to detect if the kubelet for a worker node has never reported its status to the K8s control plane. If it's been longer than 15 minutes, Karpenter assumes that there's a hard failure mode due to which this worker node will never become healthy and terminates the worker node. If the condition map of the node object in the API Server says `NodeStatusNeverUpdated` then we use that as an indicator of the node having never come up. This fix ensures that if there are other scenarios where a worker node has become unhealthy due to a network partition or power outage in a availability zone, we don't terminate those worker nodes. 
It's important we don't make the static stability of a cluster worse during such an event. On the other hand, if there is an edge case where worker nodes come online and soon go offline, it will lead to runaway scaling again. This edge case should be unlikely to happen in the near term, so this document focuses on just the ability to limit costs within Karpenter. That way even if runaway scaling does occur there's a way to bound it. A longer-term solution to handle the runaway problem will be discussed separately. - ## Proposed Solution for Limits There are two broad forms of limiting we could apply. The first is that we could introduce a limit to the number of in-flight worker node being provisioned at a point in time. A worker node that's in the `NotReady` state could be considered to be in-flight. The second form is an absolute limit of the number of resources Karpenter can provision. @@ -37,6 +35,7 @@ In the above example - `20%` indicates that if at any point in time, more than 2 The good bit about this approach is that we don't constrain how many total worker nodes can be spun up by Karpenter, while also making sure that if we keep launching worker nodes that aren't healthy, we stop the scaling and save costs. The two main problems with this approach though are - + 1. This limit while meant to just constrain the number of unhealthy worker nodes in a cluster, will also inhibit the rate at which Karpenter can respond to pods that aren't schedulable. This somewhat goes against the goal of minimizing launch times of workers. 2. While this helps ensure that costs don't increase due to runaway scaling, it won't help those who want a stricter cap on the amount of resources that's being provisioned even when nodes are otherwise healthy. 
@@ -62,11 +61,14 @@ As a cost control mechanism, this requires a little more work from our users if [CPU limits](https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/#cpu-units), memory limits and GPU limits will be defined similar to resource requests and will not be required by default. Karpenter will also will not default to any limits itself. The list of supported resource types is - + - `cpu` - `memory` - `nvidia.com/gpu` - `amd.com/gpu` - `aws.amazon.com/neuron` +- `aws.amazon.com/neuroncore` +- `aws.amazon.com/neurondevice` - `habana.ai/gaudi` Limits will be defined at the per-provisioner level. We'll rely on the `karpenter.sh/provisioner-name` node label when calculating resource usage by a specific provisioner. This is useful when multiple teams share a single cluster and use separate provisioners since each team's resource consumption will be limited separately. diff --git a/examples/workloads/neuron.yaml b/examples/workloads/neuron.yaml index e7cf74e13230..0902b2b5cdb3 100644 --- a/examples/workloads/neuron.yaml +++ b/examples/workloads/neuron.yaml @@ -21,9 +21,9 @@ spec: name: neuron resources: limits: - aws.amazon.com/neuron: "1" + aws.amazon.com/neurondevice: "1" requests: cpu: "1" memory: 256M securityContext: - allowPrivilegeEscalation: false \ No newline at end of file + allowPrivilegeEscalation: false diff --git a/hack/code/instancetype_testdata_gen/main.go b/hack/code/instancetype_testdata_gen/main.go index e0df1b16163d..9aed9c65d376 100644 --- a/hack/code/instancetype_testdata_gen/main.go +++ b/hack/code/instancetype_testdata_gen/main.go @@ -147,13 +147,14 @@ func getInstanceTypeInfo(info *ec2.InstanceTypeInfo) string { fmt.Fprintf(src, "NvmeSupport: aws.String(\"%s\"),\n", lo.FromPtr(info.EbsInfo.NvmeSupport)) fmt.Fprintf(src, "},\n") } - if info.InferenceAcceleratorInfo != nil { - fmt.Fprintf(src, "InferenceAcceleratorInfo: &ec2.InferenceAcceleratorInfo{\n") - fmt.Fprintf(src, "Accelerators: 
[]*ec2.InferenceDeviceInfo{\n") - for _, elem := range info.InferenceAcceleratorInfo.Accelerators { - fmt.Fprintf(src, getInferenceAcceleratorDeviceInfo(elem)) + if info.NeuronInfo != nil { + fmt.Fprintf(src, "NeuronInfo: &ec2.NeuronInfo{\n") + fmt.Fprintf(src, "NeuronDevices: []*ec2.NeuronDeviceInfo{\n") + for _, elem := range info.NeuronInfo.NeuronDevices { + fmt.Fprintf(src, getNeuronDeviceInfo(elem)) } fmt.Fprintf(src, "},\n") + fmt.Fprintf(src, "TotalNeuronDeviceMemoryInMiB: aws.Int64(%d),\n", lo.FromPtr(info.NeuronInfo.TotalNeuronDeviceMemoryInMiB)) fmt.Fprintf(src, "},\n") } if info.GpuInfo != nil { @@ -199,12 +200,19 @@ func getNetworkCardInfo(info *ec2.NetworkCardInfo) string { return src.String() } -func getInferenceAcceleratorDeviceInfo(info *ec2.InferenceDeviceInfo) string { +func getNeuronDeviceInfo(info *ec2.NeuronDeviceInfo) string { + src := &bytes.Buffer{} fmt.Fprintf(src, "{\n") - fmt.Fprintf(src, "Name: aws.String(\"%s\"),\n", lo.FromPtr(info.Name)) - fmt.Fprintf(src, "Manufacturer: aws.String(\"%s\"),\n", lo.FromPtr(info.Manufacturer)) fmt.Fprintf(src, "Count: aws.Int64(%d),\n", lo.FromPtr(info.Count)) + fmt.Fprintf(src, "Name: aws.String(\"%s\"),\n", lo.FromPtr(info.Name)) + fmt.Fprintf(src, "CoreInfo: &ec2.NeuronDeviceCoreInfo{\n") + fmt.Fprintf(src, "Count: aws.Int64(%d),\n", lo.FromPtr(info.CoreInfo.Count)) + fmt.Fprintf(src, "Version: aws.Int64(%d),\n", lo.FromPtr(info.CoreInfo.Version)) + fmt.Fprintf(src, "},\n") + fmt.Fprintf(src, "MemoryInfo: &ec2.NeuronDeviceMemoryInfo{\n") + fmt.Fprintf(src, "SizeInMiB: aws.Int64(%d),\n", lo.FromPtr(info.MemoryInfo.SizeInMiB)) + fmt.Fprintf(src, "},\n") fmt.Fprintf(src, "},\n") return src.String() } diff --git a/hack/codegen.sh b/hack/codegen.sh index f148e0ca1ca7..bf79b79db911 100755 --- a/hack/codegen.sh +++ b/hack/codegen.sh @@ -46,7 +46,7 @@ instanceTypeTestData() { GENERATED_FILE="pkg/fake/zz_generated.describe_instance_types.go" go run hack/code/instancetype_testdata_gen/main.go --out-file 
${GENERATED_FILE} \ - --instance-types t3.large,m5.large,m5.xlarge,p3.8xlarge,g4dn.8xlarge,c6g.large,inf1.2xlarge,inf1.6xlarge,trn1.2xlarge,m5.metal,dl1.24xlarge,m6idn.32xlarge,t4g.small,t4g.xlarge,t4g.medium,g4ad.16xlarge + --instance-types t3.large,m5.large,m5.xlarge,p3.8xlarge,g4dn.8xlarge,c6g.large,inf2.xlarge,inf2.24xlarge,trn1.2xlarge,m5.metal,dl1.24xlarge,m6idn.32xlarge,t4g.small,t4g.xlarge,t4g.medium,g4ad.16xlarge checkForUpdates "${GENERATED_FILE}" } diff --git a/pkg/apis/v1/labels.go b/pkg/apis/v1/labels.go index 561359e57f31..bf2dc3f494a5 100644 --- a/pkg/apis/v1/labels.go +++ b/pkg/apis/v1/labels.go @@ -48,6 +48,7 @@ func init() { LabelInstanceAcceleratorName, LabelInstanceAcceleratorManufacturer, LabelInstanceAcceleratorCount, + LabelInstanceAcceleratorMemory, LabelTopologyZoneID, corev1.LabelWindowsBuild, ) @@ -90,6 +91,8 @@ var ( ResourceNVIDIAGPU corev1.ResourceName = "nvidia.com/gpu" ResourceAMDGPU corev1.ResourceName = "amd.com/gpu" ResourceAWSNeuron corev1.ResourceName = "aws.amazon.com/neuron" + ResourceAWSNeuronCore corev1.ResourceName = "aws.amazon.com/neuroncore" + ResourceAWSNeuronDevice corev1.ResourceName = "aws.amazon.com/neurondevice" ResourceHabanaGaudi corev1.ResourceName = "habana.ai/gaudi" ResourceAWSPodENI corev1.ResourceName = "vpc.amazonaws.com/pod-eni" ResourcePrivateIPv4Address corev1.ResourceName = "vpc.amazonaws.com/PrivateIPv4Address" @@ -120,6 +123,7 @@ var ( LabelInstanceAcceleratorName = apis.Group + "/instance-accelerator-name" LabelInstanceAcceleratorManufacturer = apis.Group + "/instance-accelerator-manufacturer" LabelInstanceAcceleratorCount = apis.Group + "/instance-accelerator-count" + LabelInstanceAcceleratorMemory = apis.Group + "/instance-accelerator-memory" AnnotationEC2NodeClassHash = apis.Group + "/ec2nodeclass-hash" AnnotationClusterNameTaggedCompatability = apis.CompatibilityGroup + "/cluster-name-tagged" AnnotationEC2NodeClassHashVersion = apis.Group + "/ec2nodeclass-hash-version" diff --git 
a/pkg/fake/ec2api.go b/pkg/fake/ec2api.go index 060e0fb67134..c04190564b50 100644 --- a/pkg/fake/ec2api.go +++ b/pkg/fake/ec2api.go @@ -631,17 +631,21 @@ func (e *EC2API) DescribeInstanceTypeOfferingsWithContext(_ context.Context, _ * Location: aws.String("test-zone-1b"), }, { - InstanceType: aws.String("inf1.2xlarge"), + InstanceType: aws.String("inf2.xlarge"), Location: aws.String("test-zone-1a"), }, { - InstanceType: aws.String("inf1.6xlarge"), + InstanceType: aws.String("inf2.24xlarge"), Location: aws.String("test-zone-1a"), }, { InstanceType: aws.String("trn1.2xlarge"), Location: aws.String("test-zone-1a"), }, + { + InstanceType: aws.String("trn1.32xlarge"), + Location: aws.String("test-zone-1a"), + }, { InstanceType: aws.String("c6g.large"), Location: aws.String("test-zone-1a"), diff --git a/pkg/fake/zz_generated.describe_instance_types.go b/pkg/fake/zz_generated.describe_instance_types.go index da2762eee5f6..f9aa24d44b35 100644 --- a/pkg/fake/zz_generated.describe_instance_types.go +++ b/pkg/fake/zz_generated.describe_instance_types.go @@ -267,107 +267,121 @@ var defaultDescribeInstanceTypesOutput = &ec2.DescribeInstanceTypesOutput{ }, }, { - InstanceType: aws.String("inf1.2xlarge"), + InstanceType: aws.String("inf2.24xlarge"), SupportedUsageClasses: aws.StringSlice([]string{"on-demand", "spot"}), SupportedVirtualizationTypes: aws.StringSlice([]string{"hvm"}), BurstablePerformanceSupported: aws.Bool(false), BareMetal: aws.Bool(false), Hypervisor: aws.String("nitro"), ProcessorInfo: &ec2.ProcessorInfo{ - Manufacturer: aws.String("Intel"), + Manufacturer: aws.String("AMD"), SupportedArchitectures: aws.StringSlice([]string{"x86_64"}), }, VCpuInfo: &ec2.VCpuInfo{ - DefaultCores: aws.Int64(4), - DefaultVCpus: aws.Int64(8), + DefaultCores: aws.Int64(48), + DefaultVCpus: aws.Int64(96), }, MemoryInfo: &ec2.MemoryInfo{ - SizeInMiB: aws.Int64(16384), + SizeInMiB: aws.Int64(393216), }, EbsInfo: &ec2.EbsInfo{ EbsOptimizedInfo: &ec2.EbsOptimizedInfo{ - 
BaselineBandwidthInMbps: aws.Int64(1190), - BaselineIops: aws.Int64(6000), - BaselineThroughputInMBps: aws.Float64(148.75), - MaximumBandwidthInMbps: aws.Int64(4750), - MaximumIops: aws.Int64(20000), - MaximumThroughputInMBps: aws.Float64(593.75), + BaselineBandwidthInMbps: aws.Int64(30000), + BaselineIops: aws.Int64(120000), + BaselineThroughputInMBps: aws.Float64(3750.00), + MaximumBandwidthInMbps: aws.Int64(30000), + MaximumIops: aws.Int64(120000), + MaximumThroughputInMBps: aws.Float64(3750.00), }, EbsOptimizedSupport: aws.String("default"), EncryptionSupport: aws.String("supported"), NvmeSupport: aws.String("required"), }, - InferenceAcceleratorInfo: &ec2.InferenceAcceleratorInfo{ - Accelerators: []*ec2.InferenceDeviceInfo{ + NeuronInfo: &ec2.NeuronInfo{ + NeuronDevices: []*ec2.NeuronDeviceInfo{ { - Name: aws.String("Inferentia"), - Manufacturer: aws.String("AWS"), - Count: aws.Int64(1), + Count: aws.Int64(6), + Name: aws.String("Inferentia2"), + CoreInfo: &ec2.NeuronDeviceCoreInfo{ + Count: aws.Int64(2), + Version: aws.Int64(2), + }, + MemoryInfo: &ec2.NeuronDeviceMemoryInfo{ + SizeInMiB: aws.Int64(32768), + }, }, }, + TotalNeuronDeviceMemoryInMiB: aws.Int64(196608), }, NetworkInfo: &ec2.NetworkInfo{ - MaximumNetworkInterfaces: aws.Int64(4), - Ipv4AddressesPerInterface: aws.Int64(10), + MaximumNetworkInterfaces: aws.Int64(15), + Ipv4AddressesPerInterface: aws.Int64(50), EncryptionInTransitSupported: aws.Bool(true), DefaultNetworkCardIndex: aws.Int64(0), NetworkCards: []*ec2.NetworkCardInfo{ { NetworkCardIndex: aws.Int64(0), - MaximumNetworkInterfaces: aws.Int64(4), + MaximumNetworkInterfaces: aws.Int64(15), }, }, }, }, { - InstanceType: aws.String("inf1.6xlarge"), + InstanceType: aws.String("inf2.xlarge"), SupportedUsageClasses: aws.StringSlice([]string{"on-demand", "spot"}), SupportedVirtualizationTypes: aws.StringSlice([]string{"hvm"}), BurstablePerformanceSupported: aws.Bool(false), BareMetal: aws.Bool(false), Hypervisor: aws.String("nitro"), 
ProcessorInfo: &ec2.ProcessorInfo{ - Manufacturer: aws.String("Intel"), + Manufacturer: aws.String("AMD"), SupportedArchitectures: aws.StringSlice([]string{"x86_64"}), }, VCpuInfo: &ec2.VCpuInfo{ - DefaultCores: aws.Int64(12), - DefaultVCpus: aws.Int64(24), + DefaultCores: aws.Int64(2), + DefaultVCpus: aws.Int64(4), }, MemoryInfo: &ec2.MemoryInfo{ - SizeInMiB: aws.Int64(49152), + SizeInMiB: aws.Int64(16384), }, EbsInfo: &ec2.EbsInfo{ EbsOptimizedInfo: &ec2.EbsOptimizedInfo{ - BaselineBandwidthInMbps: aws.Int64(4750), - BaselineIops: aws.Int64(20000), - BaselineThroughputInMBps: aws.Float64(593.75), - MaximumBandwidthInMbps: aws.Int64(4750), - MaximumIops: aws.Int64(20000), - MaximumThroughputInMBps: aws.Float64(593.75), + BaselineBandwidthInMbps: aws.Int64(1250), + BaselineIops: aws.Int64(6000), + BaselineThroughputInMBps: aws.Float64(156.25), + MaximumBandwidthInMbps: aws.Int64(10000), + MaximumIops: aws.Int64(40000), + MaximumThroughputInMBps: aws.Float64(1250.00), }, EbsOptimizedSupport: aws.String("default"), EncryptionSupport: aws.String("supported"), NvmeSupport: aws.String("required"), }, - InferenceAcceleratorInfo: &ec2.InferenceAcceleratorInfo{ - Accelerators: []*ec2.InferenceDeviceInfo{ + NeuronInfo: &ec2.NeuronInfo{ + NeuronDevices: []*ec2.NeuronDeviceInfo{ { - Name: aws.String("Inferentia"), - Manufacturer: aws.String("AWS"), - Count: aws.Int64(4), + Count: aws.Int64(1), + Name: aws.String("Inferentia2"), + CoreInfo: &ec2.NeuronDeviceCoreInfo{ + Count: aws.Int64(2), + Version: aws.Int64(2), + }, + MemoryInfo: &ec2.NeuronDeviceMemoryInfo{ + SizeInMiB: aws.Int64(32768), + }, }, }, + TotalNeuronDeviceMemoryInMiB: aws.Int64(32768), }, NetworkInfo: &ec2.NetworkInfo{ - MaximumNetworkInterfaces: aws.Int64(8), - Ipv4AddressesPerInterface: aws.Int64(30), + MaximumNetworkInterfaces: aws.Int64(4), + Ipv4AddressesPerInterface: aws.Int64(15), EncryptionInTransitSupported: aws.Bool(true), DefaultNetworkCardIndex: aws.Int64(0), NetworkCards: []*ec2.NetworkCardInfo{ { 
NetworkCardIndex: aws.Int64(0), - MaximumNetworkInterfaces: aws.Int64(8), + MaximumNetworkInterfaces: aws.Int64(4), }, }, }, @@ -821,6 +835,22 @@ var defaultDescribeInstanceTypesOutput = &ec2.DescribeInstanceTypesOutput{ EncryptionSupport: aws.String("supported"), NvmeSupport: aws.String("required"), }, + NeuronInfo: &ec2.NeuronInfo{ + NeuronDevices: []*ec2.NeuronDeviceInfo{ + { + Count: aws.Int64(1), + Name: aws.String("Trainium"), + CoreInfo: &ec2.NeuronDeviceCoreInfo{ + Count: aws.Int64(2), + Version: aws.Int64(2), + }, + MemoryInfo: &ec2.NeuronDeviceMemoryInfo{ + SizeInMiB: aws.Int64(32768), + }, + }, + }, + TotalNeuronDeviceMemoryInMiB: aws.Int64(32768), + }, InstanceStorageInfo: &ec2.InstanceStorageInfo{NvmeSupport: aws.String("required"), TotalSizeInGB: aws.Int64(474), }, diff --git a/pkg/providers/instance/instance.go b/pkg/providers/instance/instance.go index 54799c00d123..9e5a57ee7d46 100644 --- a/pkg/providers/instance/instance.go +++ b/pkg/providers/instance/instance.go @@ -459,6 +459,8 @@ func filterExoticInstanceTypes(instanceTypes []*cloudprovider.InstanceType) []*c continue } if !resources.IsZero(it.Capacity[v1.ResourceAWSNeuron]) || + !resources.IsZero(it.Capacity[v1.ResourceAWSNeuronCore]) || + !resources.IsZero(it.Capacity[v1.ResourceAWSNeuronDevice]) || !resources.IsZero(it.Capacity[v1.ResourceAMDGPU]) || !resources.IsZero(it.Capacity[v1.ResourceNVIDIAGPU]) || !resources.IsZero(it.Capacity[v1.ResourceHabanaGaudi]) { diff --git a/pkg/providers/instancetype/suite_test.go b/pkg/providers/instancetype/suite_test.go index 2638a2fd7452..6af4002d3ef0 100644 --- a/pkg/providers/instancetype/suite_test.go +++ b/pkg/providers/instancetype/suite_test.go @@ -243,10 +243,12 @@ var _ = Describe("InstanceTypeProvider", func() { v1.LabelInstanceGPUCount: "1", v1.LabelInstanceGPUMemory: "16384", v1.LabelInstanceLocalNVME: "900", - v1.LabelInstanceAcceleratorName: "inferentia", - v1.LabelInstanceAcceleratorManufacturer: "aws", - v1.LabelInstanceAcceleratorCount: 
"1", - v1.LabelTopologyZoneID: "tstz1-1a", + // TODO - NVIDIA/GPU instances should not have Neuron/accelerator labels + v1.LabelInstanceAcceleratorName: "inferentia2", + v1.LabelInstanceAcceleratorManufacturer: "aws", + v1.LabelInstanceAcceleratorCount: "1", + v1.LabelInstanceAcceleratorMemory: "32768", + v1.LabelTopologyZoneID: "tstz1-1a", // Deprecated Labels corev1.LabelFailureDomainBetaRegion: fake.DefaultRegion, corev1.LabelFailureDomainBetaZone: "test-zone-1a", @@ -315,6 +317,7 @@ var _ = Describe("InstanceTypeProvider", func() { v1.LabelInstanceAcceleratorCount, v1.LabelInstanceAcceleratorName, v1.LabelInstanceAcceleratorManufacturer, + v1.LabelInstanceAcceleratorMemory, corev1.LabelWindowsBuild, )).UnsortedList(), lo.Keys(karpv1.NormalizedLabels)...))) @@ -330,7 +333,7 @@ var _ = Describe("InstanceTypeProvider", func() { karpv1.NodePoolLabelKey: nodePool.Name, corev1.LabelTopologyRegion: fake.DefaultRegion, corev1.LabelTopologyZone: "test-zone-1a", - corev1.LabelInstanceTypeStable: "inf1.2xlarge", + corev1.LabelInstanceTypeStable: "inf2.xlarge", corev1.LabelOSStable: "linux", corev1.LabelArchStable: "amd64", karpv1.CapacityTypeLabelKey: "on-demand", @@ -338,24 +341,25 @@ var _ = Describe("InstanceTypeProvider", func() { v1.LabelInstanceHypervisor: "nitro", v1.LabelInstanceEncryptionInTransitSupported: "true", v1.LabelInstanceCategory: "inf", - v1.LabelInstanceGeneration: "1", - v1.LabelInstanceFamily: "inf1", - v1.LabelInstanceSize: "2xlarge", - v1.LabelInstanceCPU: "8", - v1.LabelInstanceCPUManufacturer: "intel", + v1.LabelInstanceGeneration: "2", + v1.LabelInstanceFamily: "inf2", + v1.LabelInstanceSize: "xlarge", + v1.LabelInstanceCPU: "4", + v1.LabelInstanceCPUManufacturer: "amd", v1.LabelInstanceMemory: "16384", - v1.LabelInstanceEBSBandwidth: "4750", - v1.LabelInstanceNetworkBandwidth: "5000", - v1.LabelInstanceAcceleratorName: "inferentia", + v1.LabelInstanceEBSBandwidth: "10000", + v1.LabelInstanceNetworkBandwidth: "2083", + 
v1.LabelInstanceAcceleratorName: "inferentia2", v1.LabelInstanceAcceleratorManufacturer: "aws", v1.LabelInstanceAcceleratorCount: "1", + v1.LabelInstanceAcceleratorMemory: "32768", v1.LabelTopologyZoneID: "tstz1-1a", // Deprecated Labels corev1.LabelFailureDomainBetaRegion: fake.DefaultRegion, corev1.LabelFailureDomainBetaZone: "test-zone-1a", "beta.kubernetes.io/arch": "amd64", "beta.kubernetes.io/os": "linux", - corev1.LabelInstanceType: "inf1.2xlarge", + corev1.LabelInstanceType: "inf2.xlarge", "topology.ebs.csi.aws.com/zone": "test-zone-1a", } @@ -755,35 +759,35 @@ var _ = Describe("InstanceTypeProvider", func() { } Expect(nodeNames.Len()).To(Equal(1)) }) - It("should launch instances for aws.amazon.com/neuron resource requests", func() { + It("should launch instances for aws.amazon.com/neurondevice resource requests", func() { nodeNames := sets.NewString() ExpectApplied(ctx, env.Client, nodePool, nodeClass) pods := []*corev1.Pod{ coretest.UnschedulablePod(coretest.PodOptions{ ResourceRequirements: corev1.ResourceRequirements{ - Requests: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("1")}, - Limits: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("1")}, + Requests: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("2")}, + Limits: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("2")}, }, }), // Should pack onto same instance coretest.UnschedulablePod(coretest.PodOptions{ ResourceRequirements: corev1.ResourceRequirements{ - Requests: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("2")}, - Limits: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("2")}, + Requests: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("2")}, + Limits: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("2")}, }, }), // Should pack onto a separate instance coretest.UnschedulablePod(coretest.PodOptions{ ResourceRequirements: corev1.ResourceRequirements{ - Requests: 
corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("4")}, - Limits: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("4")}, + Requests: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("4")}, + Limits: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("4")}, }, }), } ExpectProvisioned(ctx, env.Client, cluster, cloudProvider, prov, pods...) for _, pod := range pods { node := ExpectScheduled(ctx, env.Client, pod) - Expect(node.Labels).To(HaveKeyWithValue(corev1.LabelInstanceTypeStable, "inf1.6xlarge")) + Expect(node.Labels).To(HaveKeyWithValue(corev1.LabelInstanceTypeStable, "inf2.24xlarge")) nodeNames.Insert(node.Name) } Expect(nodeNames.Len()).To(Equal(2)) @@ -816,6 +820,34 @@ var _ = Describe("InstanceTypeProvider", func() { } Expect(nodeNames.Len()).To(Equal(1)) }) + It("should launch inf2 instances for aws.amazon.com/neuroncore resource requests", func() { + nodeNames := sets.NewString() + nodePool.Spec.Template.Spec.Requirements = []karpv1.NodeSelectorRequirementWithMinValues{ + { + NodeSelectorRequirement: corev1.NodeSelectorRequirement{ + Key: corev1.LabelInstanceTypeStable, + Operator: corev1.NodeSelectorOpIn, + Values: []string{"inf2.xlarge"}, + }, + }, + } + ExpectApplied(ctx, env.Client, nodePool, nodeClass) + pods := []*corev1.Pod{ + coretest.UnschedulablePod(coretest.PodOptions{ + ResourceRequirements: corev1.ResourceRequirements{ + Requests: corev1.ResourceList{v1.ResourceAWSNeuronCore: resource.MustParse("2")}, + Limits: corev1.ResourceList{v1.ResourceAWSNeuronCore: resource.MustParse("2")}, + }, + }), + } + ExpectProvisioned(ctx, env.Client, cluster, cloudProvider, prov, pods...) 
+ for _, pod := range pods { + node := ExpectScheduled(ctx, env.Client, pod) + Expect(node.Labels).To(HaveKeyWithValue(corev1.LabelInstanceTypeStable, "inf2.xlarge")) + nodeNames.Insert(node.Name) + } + Expect(nodeNames.Len()).To(Equal(1)) + }) It("should launch instances for vpc.amazonaws.com/efa resource requests", func() { nodePool.Spec.Template.Spec.Requirements = []karpv1.NodeSelectorRequirementWithMinValues{ { @@ -1871,26 +1903,26 @@ var _ = Describe("InstanceTypeProvider", func() { }) Context("Insufficient Capacity Error Cache", func() { It("should launch instances of different type on second reconciliation attempt with Insufficient Capacity Error Cache fallback", func() { - awsEnv.EC2API.InsufficientCapacityPools.Set([]fake.CapacityPool{{CapacityType: karpv1.CapacityTypeOnDemand, InstanceType: "inf1.6xlarge", Zone: "test-zone-1a"}}) + awsEnv.EC2API.InsufficientCapacityPools.Set([]fake.CapacityPool{{CapacityType: karpv1.CapacityTypeOnDemand, InstanceType: "inf2.24xlarge", Zone: "test-zone-1a"}}) ExpectApplied(ctx, env.Client, nodePool, nodeClass) pods := []*corev1.Pod{ coretest.UnschedulablePod(coretest.PodOptions{ NodeSelector: map[string]string{corev1.LabelTopologyZone: "test-zone-1a"}, ResourceRequirements: corev1.ResourceRequirements{ - Requests: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("1")}, - Limits: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("1")}, + Requests: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("1")}, + Limits: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("1")}, }, }), coretest.UnschedulablePod(coretest.PodOptions{ NodeSelector: map[string]string{corev1.LabelTopologyZone: "test-zone-1a"}, ResourceRequirements: corev1.ResourceRequirements{ - Requests: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("1")}, - Limits: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("1")}, + Requests: corev1.ResourceList{v1.ResourceAWSNeuronDevice: 
resource.MustParse("1")}, + Limits: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("1")}, }, }), } ExpectProvisioned(ctx, env.Client, cluster, cloudProvider, prov, pods...) - // it should've tried to pack them on a single inf1.6xlarge then hit an insufficient capacity error + // it should've tried to pack them on a single inf2.24xlarge then hit an insufficient capacity error for _, pod := range pods { ExpectNotScheduled(ctx, env.Client, pod) } @@ -1898,7 +1930,7 @@ var _ = Describe("InstanceTypeProvider", func() { ExpectProvisioned(ctx, env.Client, cluster, cloudProvider, prov, pods...) for _, pod := range pods { node := ExpectScheduled(ctx, env.Client, pod) - Expect(node.Labels).To(HaveKeyWithValue(v1.LabelInstanceAcceleratorName, "inferentia")) + Expect(node.Labels).To(HaveKeyWithValue(v1.LabelInstanceAcceleratorName, "inferentia2")) nodeNames.Insert(node.Name) } Expect(nodeNames.Len()).To(Equal(2)) @@ -1965,23 +1997,23 @@ var _ = Describe("InstanceTypeProvider", func() { } }) It("should launch instances on later reconciliation attempt with Insufficient Capacity Error Cache expiry", func() { - awsEnv.EC2API.InsufficientCapacityPools.Set([]fake.CapacityPool{{CapacityType: karpv1.CapacityTypeOnDemand, InstanceType: "inf1.6xlarge", Zone: "test-zone-1a"}}) + awsEnv.EC2API.InsufficientCapacityPools.Set([]fake.CapacityPool{{CapacityType: karpv1.CapacityTypeOnDemand, InstanceType: "inf2.24xlarge", Zone: "test-zone-1a"}}) ExpectApplied(ctx, env.Client, nodePool, nodeClass) pod := coretest.UnschedulablePod(coretest.PodOptions{ - NodeSelector: map[string]string{corev1.LabelInstanceTypeStable: "inf1.6xlarge"}, + NodeSelector: map[string]string{corev1.LabelInstanceTypeStable: "inf2.24xlarge"}, ResourceRequirements: corev1.ResourceRequirements{ - Requests: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("2")}, - Limits: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("2")}, + Requests: 
corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("2")}, + Limits: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("2")}, }, }) ExpectProvisioned(ctx, env.Client, cluster, cloudProvider, prov, pod) ExpectNotScheduled(ctx, env.Client, pod) // capacity shortage is over - expire the item from the cache and try again awsEnv.EC2API.InsufficientCapacityPools.Set([]fake.CapacityPool{}) - awsEnv.UnavailableOfferingsCache.Delete("inf1.6xlarge", "test-zone-1a", karpv1.CapacityTypeOnDemand) + awsEnv.UnavailableOfferingsCache.Delete("inf2.24xlarge", "test-zone-1a", karpv1.CapacityTypeOnDemand) ExpectProvisioned(ctx, env.Client, cluster, cloudProvider, prov, pod) node := ExpectScheduled(ctx, env.Client, pod) - Expect(node.Labels).To(HaveKeyWithValue(corev1.LabelInstanceTypeStable, "inf1.6xlarge")) + Expect(node.Labels).To(HaveKeyWithValue(corev1.LabelInstanceTypeStable, "inf2.24xlarge")) }) It("should launch instances in a different zone on second reconciliation attempt with Insufficient Capacity Error Cache fallback (Habana)", func() { awsEnv.EC2API.InsufficientCapacityPools.Set([]fake.CapacityPool{{CapacityType: karpv1.CapacityTypeOnDemand, InstanceType: "dl1.24xlarge", Zone: "test-zone-1a"}}) diff --git a/pkg/providers/instancetype/types.go b/pkg/providers/instancetype/types.go index 90cda92587ca..3bf065b98563 100644 --- a/pkg/providers/instancetype/types.go +++ b/pkg/providers/instancetype/types.go @@ -250,25 +250,18 @@ func computeRequirements(info *ec2.InstanceTypeInfo, offerings cloudprovider.Off requirements.Get(v1.LabelInstanceGPUCount).Insert(fmt.Sprint(aws.Int64Value(gpu.Count))) requirements.Get(v1.LabelInstanceGPUMemory).Insert(fmt.Sprint(aws.Int64Value(gpu.MemoryInfo.SizeInMiB))) } - // Accelerators - if info.InferenceAcceleratorInfo != nil && len(info.InferenceAcceleratorInfo.Accelerators) == 1 { - accelerator := info.InferenceAcceleratorInfo.Accelerators[0] - 
requirements.Get(v1.LabelInstanceAcceleratorName).Insert(lowerKabobCase(aws.StringValue(accelerator.Name))) - requirements.Get(v1.LabelInstanceAcceleratorManufacturer).Insert(lowerKabobCase(aws.StringValue(accelerator.Manufacturer))) - requirements.Get(v1.LabelInstanceAcceleratorCount).Insert(fmt.Sprint(aws.Int64Value(accelerator.Count))) + // Neuron + if info.NeuronInfo != nil && len(info.NeuronInfo.NeuronDevices) == 1 { + device := info.NeuronInfo.NeuronDevices[0] + requirements.Get(v1.LabelInstanceAcceleratorName).Insert(lowerKabobCase(aws.StringValue(device.Name))) + requirements.Get(v1.LabelInstanceAcceleratorManufacturer).Insert(lowerKabobCase("aws")) + requirements.Get(v1.LabelInstanceAcceleratorCount).Insert(fmt.Sprint(aws.Int64Value(device.Count))) + requirements.Get(v1.LabelInstanceAcceleratorMemory).Insert(fmt.Sprint(aws.Int64Value(info.NeuronInfo.TotalNeuronDeviceMemoryInMiB))) } // Windows Build Version Labels if family, ok := amiFamily.(*amifamily.Windows); ok { requirements.Get(corev1.LabelWindowsBuild).Insert(family.Build) } - // Trn1 Accelerators - // TODO: remove function once DescribeInstanceTypes contains the accelerator data - // Values found from: https://aws.amazon.com/ec2/instance-types/trn1/ - if strings.HasPrefix(*info.InstanceType, "trn1") { - requirements.Get(v1.LabelInstanceAcceleratorName).Insert(lowerKabobCase("Inferentia")) - requirements.Get(v1.LabelInstanceAcceleratorManufacturer).Insert(lowerKabobCase("AWS")) - requirements.Get(v1.LabelInstanceAcceleratorCount).Insert(fmt.Sprint(awsNeurons(info))) - } // CPU Manufacturer, valid options: aws, intel, amd if info.ProcessorInfo != nil { requirements.Get(v1.LabelInstanceCPUManufacturer).Insert(lowerKabobCase(aws.StringValue(info.ProcessorInfo.Manufacturer))) @@ -311,7 +304,9 @@ func computeCapacity(ctx context.Context, info *ec2.InstanceTypeInfo, amiFamily v1.ResourceAWSPodENI: *awsPodENI(aws.StringValue(info.InstanceType)), v1.ResourceNVIDIAGPU: *nvidiaGPUs(info), v1.ResourceAMDGPU: 
*amdGPUs(info), - v1.ResourceAWSNeuron: *awsNeurons(info), + v1.ResourceAWSNeuron: *awsNeuronDevices(info), + v1.ResourceAWSNeuronCore: *awsNeuronCores(info), + v1.ResourceAWSNeuronDevice: *awsNeuronDevices(info), v1.ResourceHabanaGaudi: *habanaGaudis(info), v1.ResourceEFA: *efas(info), } @@ -406,19 +401,21 @@ func amdGPUs(info *ec2.InstanceTypeInfo) *resource.Quantity { return resources.Quantity(fmt.Sprint(count)) } -// TODO: remove trn1 hardcode values once DescribeInstanceTypes contains the accelerator data -// Values found from: https://aws.amazon.com/ec2/instance-types/trn1/ -func awsNeurons(info *ec2.InstanceTypeInfo) *resource.Quantity { +func awsNeuronCores(info *ec2.InstanceTypeInfo) *resource.Quantity { + count := int64(0) + if info.NeuronInfo != nil { + for _, device := range info.NeuronInfo.NeuronDevices { + count += *device.Count * *device.CoreInfo.Count + } + } + return resources.Quantity(fmt.Sprint(count)) +} + +func awsNeuronDevices(info *ec2.InstanceTypeInfo) *resource.Quantity { count := int64(0) - if *info.InstanceType == "trn1.2xlarge" { - count = int64(1) - } else if *info.InstanceType == "trn1.32xlarge" { - count = int64(16) - } else if *info.InstanceType == "trn1n.32xlarge" { - count = int64(16) - } else if info.InferenceAcceleratorInfo != nil { - for _, accelerator := range info.InferenceAcceleratorInfo.Accelerators { - count += *accelerator.Count + if info.NeuronInfo != nil { + for _, device := range info.NeuronInfo.NeuronDevices { + count += *device.Count } } return resources.Quantity(fmt.Sprint(count)) diff --git a/test/suites/integration/extended_resources_test.go b/test/suites/integration/extended_resources_test.go index 48c1ad0116fe..0c3deee55749 100644 --- a/test/suites/integration/extended_resources_test.go +++ b/test/suites/integration/extended_resources_test.go @@ -106,6 +106,39 @@ var _ = Describe("Extended Resources", func() { env.ExpectCreatedNodeCount("==", 1) env.EventuallyExpectInitializedNodeCount("==", 1) }) + It("should provision nodes
for a deployment that requests aws.amazon.com/neurondevice", func() { + ExpectNeuronDevicePluginCreated() + // TODO: jmdeal@ remove AL2 pin once AL2023 accelerated AMIs are available + nodeClass.Spec.AMISelectorTerms = []v1.AMISelectorTerm{{Alias: "al2@latest"}} + numPods := 1 + dep := test.Deployment(test.DeploymentOptions{ + Replicas: int32(numPods), + PodOptions: test.PodOptions{ + ObjectMeta: metav1.ObjectMeta{ + Labels: map[string]string{"app": "large-app"}, + }, + ResourceRequirements: corev1.ResourceRequirements{ + Requests: corev1.ResourceList{ + "aws.amazon.com/neurondevice": resource.MustParse("1"), + }, + Limits: corev1.ResourceList{ + "aws.amazon.com/neurondevice": resource.MustParse("1"), + }, + }, + }, + }) + selector := labels.SelectorFromSet(dep.Spec.Selector.MatchLabels) + test.ReplaceRequirements(nodePool, karpv1.NodeSelectorRequirementWithMinValues{ + NodeSelectorRequirement: corev1.NodeSelectorRequirement{ + Key: v1.LabelInstanceCategory, + Operator: corev1.NodeSelectorOpExists, + }, + }) + env.ExpectCreated(nodeClass, nodePool, dep) + env.EventuallyExpectHealthyPodCount(selector, numPods) + env.ExpectCreatedNodeCount("==", 1) + env.EventuallyExpectInitializedNodeCount("==", 1) + }) It("should provision nodes for a deployment that requests vpc.amazonaws.com/pod-eni (security groups for pods)", func() { env.ExpectPodENIEnabled() DeferCleanup(func() { @@ -329,6 +362,98 @@ func ExpectNvidiaDevicePluginCreated() { }) } +// https://github.com/aws-neuron/aws-neuron-sdk/blob/master/src/k8/k8s-neuron-device-plugin.yml +func ExpectNeuronDevicePluginCreated() { + GinkgoHelper() + env.ExpectCreated(&appsv1.DaemonSet{ + ObjectMeta: test.ObjectMeta(metav1.ObjectMeta{ + Name: "nvidia-device-plugin-daemonset", + Namespace: "kube-system", + }), + Spec: appsv1.DaemonSetSpec{ + Selector: &metav1.LabelSelector{ + MatchLabels: map[string]string{ + "name": "neuron-device-plugin-ds", + }, + }, + UpdateStrategy: appsv1.DaemonSetUpdateStrategy{ + Type: 
appsv1.RollingUpdateDaemonSetStrategyType, + }, + Template: corev1.PodTemplateSpec{ + ObjectMeta: test.ObjectMeta(metav1.ObjectMeta{ + Labels: map[string]string{ + "name": "neuron-device-plugin-ds", + }, + }), + Spec: corev1.PodSpec{ + Tolerations: []corev1.Toleration{ + { + Key: "aws.amazon.com/neuron", + Operator: corev1.TolerationOpExists, + Effect: corev1.TaintEffectNoSchedule, + }, + }, + PriorityClassName: "system-node-critical", + Containers: []corev1.Container{ + { + Name: "neuron-device-plugin", + Image: "public.ecr.aws/neuron/neuron-device-plugin:2.19.16.0", + Env: []corev1.EnvVar{ + { + Name: "KUBECONFIG", + Value: "/etc/kubernetes/kubelet.conf", + }, + { + Name: "NODE_NAME", + ValueFrom: &corev1.EnvVarSource{ + FieldRef: &corev1.ObjectFieldSelector{ + FieldPath: "spec.nodeName", + }, + }, + }, + }, + SecurityContext: &corev1.SecurityContext{ + AllowPrivilegeEscalation: lo.ToPtr(false), + Capabilities: &corev1.Capabilities{ + Drop: []corev1.Capability{"ALL"}, + }, + }, + VolumeMounts: []corev1.VolumeMount{ + { + Name: "device-plugin", + MountPath: "/var/lib/kubelet/device-plugins", + }, + { + Name: "infa-map", + MountPath: "/run", + }, + }, + }, + }, + Volumes: []corev1.Volume{ + { + Name: "device-plugin", + VolumeSource: corev1.VolumeSource{ + HostPath: &corev1.HostPathVolumeSource{ + Path: "/var/lib/kubelet/device-plugins", + }, + }, + }, + { + Name: "infa-map", + VolumeSource: corev1.VolumeSource{ + HostPath: &corev1.HostPathVolumeSource{ + Path: "/run", + }, + }, + }, + }, + }, + }, + }, + }) +} + func ExpectAMDDevicePluginCreated() { GinkgoHelper() env.ExpectCreated(&appsv1.DaemonSet{ diff --git a/test/suites/scheduling/suite_test.go b/test/suites/scheduling/suite_test.go index e0fe0d09c0f7..d4a704db1bd9 100644 --- a/test/suites/scheduling/suite_test.go +++ b/test/suites/scheduling/suite_test.go @@ -259,9 +259,10 @@ var _ = Describe("Scheduling", Ordered, ContinueOnFailure, func() { 
env.EventuallyExpectHealthyPodCount(labels.SelectorFromSet(deployment.Spec.Selector.MatchLabels), int(*deployment.Spec.Replicas)) env.ExpectCreatedNodeCount("==", 1) }) - It("should support well-known labels for an accelerator (inferentia)", func() { + It("should support well-known labels for an accelerator (inferentia2)", func() { nodeSelector := map[string]string{ v1.LabelInstanceAcceleratorName: "inferentia", + v1.LabelInstanceAcceleratorMemory: "32768", v1.LabelInstanceAcceleratorManufacturer: "aws", v1.LabelInstanceAcceleratorCount: "1", } diff --git a/website/content/en/preview/concepts/scheduling.md b/website/content/en/preview/concepts/scheduling.md index 2ef5b2c62897..baaf9496fde5 100755 --- a/website/content/en/preview/concepts/scheduling.md +++ b/website/content/en/preview/concepts/scheduling.md @@ -70,6 +70,8 @@ Accelerator (e.g., GPU) values include - `nvidia.com/gpu` - `amd.com/gpu` - `aws.amazon.com/neuron` +- `aws.amazon.com/neuroncore` +- `aws.amazon.com/neurondevice` - `habana.ai/gaudi` Karpenter supports accelerators, such as GPUs. 
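Note on the capacity helpers in this patch: `awsNeuronDevices` and `awsNeuronCores` dereference `device.Count` and `device.CoreInfo.Count` directly, so a missing field would panic. The following is a minimal, nil-safe sketch of the same calculation, not the code in this series. The `neuronInfo`, `neuronDevice` and `neuronCoreInfo` types are hypothetical stand-ins that model only the fields used here (the real shapes live in the EC2 SDK), and the helper names are illustrative. It assumes `CoreInfo.Count` is the number of cores per device, so the total is the device count multiplied by the cores per device.

```go
package main

import "fmt"

// Hypothetical stand-ins for the EC2 SDK's Neuron shapes; only the fields
// referenced in this patch are modelled.
type neuronCoreInfo struct{ Count *int64 }

type neuronDevice struct {
	Count    *int64
	CoreInfo *neuronCoreInfo
}

type neuronInfo struct{ NeuronDevices []*neuronDevice }

// int64Value returns 0 for a nil pointer instead of panicking.
func int64Value(p *int64) int64 {
	if p == nil {
		return 0
	}
	return *p
}

// totalNeuronDevices sums the device count over every device entry.
func totalNeuronDevices(info *neuronInfo) int64 {
	if info == nil {
		return 0
	}
	total := int64(0)
	for _, d := range info.NeuronDevices {
		if d != nil {
			total += int64Value(d.Count)
		}
	}
	return total
}

// totalNeuronCores multiplies each entry's device count by its per-device
// core count (assumed meaning of CoreInfo.Count) and sums the results.
func totalNeuronCores(info *neuronInfo) int64 {
	if info == nil {
		return 0
	}
	total := int64(0)
	for _, d := range info.NeuronDevices {
		if d == nil || d.CoreInfo == nil {
			continue
		}
		total += int64Value(d.Count) * int64Value(d.CoreInfo.Count)
	}
	return total
}

func main() {
	// Illustrative input: 12 devices with 2 cores each.
	devices, coresPerDevice := int64(12), int64(2)
	info := &neuronInfo{NeuronDevices: []*neuronDevice{
		{Count: &devices, CoreInfo: &neuronCoreInfo{Count: &coresPerDevice}},
	}}
	fmt.Println(totalNeuronDevices(info), totalNeuronCores(info)) // prints "12 24"
}
```

Skipping entries with missing data rather than panicking keeps capacity at zero for instance types where the API returns no Neuron information, which is the same default the patch uses when `NeuronInfo` is nil.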
From 981bd4fac5d7947573af24e58c1af9182a3a82d3 Mon Sep 17 00:00:00 2001 From: Bryant Biggs Date: Wed, 11 Sep 2024 16:56:18 -0500 Subject: [PATCH 02/15] chore: Update instance references doc --- hack/docs/instancetypes_gen/main.go | 2 +- pkg/fake/ec2api.go | 4 - .../en/preview/reference/instance-types.md | 233 +++++++++++++++++- 3 files changed, 223 insertions(+), 16 deletions(-) diff --git a/hack/docs/instancetypes_gen/main.go b/hack/docs/instancetypes_gen/main.go index 941cb6d0e27f..31cfd4dc89d0 100644 --- a/hack/docs/instancetypes_gen/main.go +++ b/hack/docs/instancetypes_gen/main.go @@ -124,7 +124,7 @@ below are the resources available with some assumptions and after the instance o resourceNameMap := sets.New[string]() // Iterate through regions and take the union of instance types we discover across both - for _, region := range []string{"us-east-1", "us-west-2"} { + for _, region := range []string{"us-east-1", "us-east-2", "us-west-2"} { sess := session.Must(session.NewSession(&aws.Config{Region: lo.ToPtr(region)})) ec2api := ec2.New(sess) subnetProvider := subnet.NewDefaultProvider(ec2api, cache.New(awscache.DefaultTTL, awscache.DefaultCleanupInterval), cache.New(awscache.AvailableIPAddressTTL, awscache.DefaultCleanupInterval), cache.New(awscache.AssociatePublicIPAddressTTL, awscache.DefaultCleanupInterval)) diff --git a/pkg/fake/ec2api.go b/pkg/fake/ec2api.go index c04190564b50..4412514d3c16 100644 --- a/pkg/fake/ec2api.go +++ b/pkg/fake/ec2api.go @@ -642,10 +642,6 @@ func (e *EC2API) DescribeInstanceTypeOfferingsWithContext(_ context.Context, _ * InstanceType: aws.String("trn1.2xlarge"), Location: aws.String("test-zone-1a"), }, - { - InstanceType: aws.String("trn1.32xlarge"), - Location: aws.String("test-zone-1a"), - }, { InstanceType: aws.String("c6g.large"), Location: aws.String("test-zone-1a"), diff --git a/website/content/en/preview/reference/instance-types.md b/website/content/en/preview/reference/instance-types.md index 40ac3f1708cf..9f424cc84757 
100644 --- a/website/content/en/preview/reference/instance-types.md +++ b/website/content/en/preview/reference/instance-types.md @@ -5491,9 +5491,6 @@ below are the resources available with some assumptions and after the instance o #### Labels | Label | Value | |--|--| - |karpenter.k8s.aws/instance-accelerator-count|8| - |karpenter.k8s.aws/instance-accelerator-manufacturer|qualcomm| - |karpenter.k8s.aws/instance-accelerator-name|qualcomm-ai100| |karpenter.k8s.aws/instance-category|dl| |karpenter.k8s.aws/instance-cpu|96| |karpenter.k8s.aws/instance-cpu-manufacturer|intel| @@ -5511,7 +5508,6 @@ below are the resources available with some assumptions and after the instance o #### Resources | Resource | Quantity | |--|--| - |aws.amazon.com/neuron|8| |cpu|95690m| |ephemeral-storage|17Gi| |memory|718987Mi| @@ -7245,6 +7241,166 @@ below are the resources available with some assumptions and after the instance o |ephemeral-storage|17Gi| |memory|237794Mi| |pods|394| +## hpc6a Family +### `hpc6a.48xlarge` +#### Labels + | Label | Value | + |--|--| + |karpenter.k8s.aws/instance-category|hpc| + |karpenter.k8s.aws/instance-cpu|96| + |karpenter.k8s.aws/instance-cpu-manufacturer|amd| + |karpenter.k8s.aws/instance-ebs-bandwidth|2085| + |karpenter.k8s.aws/instance-encryption-in-transit-supported|true| + |karpenter.k8s.aws/instance-family|hpc6a| + |karpenter.k8s.aws/instance-generation|6| + |karpenter.k8s.aws/instance-hypervisor|nitro| + |karpenter.k8s.aws/instance-memory|393216| + |karpenter.k8s.aws/instance-network-bandwidth|100000| + |karpenter.k8s.aws/instance-size|48xlarge| + |kubernetes.io/arch|amd64| + |kubernetes.io/os|linux| + |node.kubernetes.io/instance-type|hpc6a.48xlarge| +#### Resources + | Resource | Quantity | + |--|--| + |cpu|95690m| + |ephemeral-storage|17Gi| + |memory|362269Mi| + |pods|100| + |vpc.amazonaws.com/efa|1| +## hpc6id Family +### `hpc6id.32xlarge` +#### Labels + | Label | Value | + |--|--| + |karpenter.k8s.aws/instance-category|hpc| + 
|karpenter.k8s.aws/instance-cpu|64| + |karpenter.k8s.aws/instance-cpu-manufacturer|intel| + |karpenter.k8s.aws/instance-ebs-bandwidth|2085| + |karpenter.k8s.aws/instance-encryption-in-transit-supported|true| + |karpenter.k8s.aws/instance-family|hpc6id| + |karpenter.k8s.aws/instance-generation|6| + |karpenter.k8s.aws/instance-hypervisor|nitro| + |karpenter.k8s.aws/instance-local-nvme|15200| + |karpenter.k8s.aws/instance-memory|1048576| + |karpenter.k8s.aws/instance-network-bandwidth|200000| + |karpenter.k8s.aws/instance-size|32xlarge| + |kubernetes.io/arch|amd64| + |kubernetes.io/os|linux| + |node.kubernetes.io/instance-type|hpc6id.32xlarge| +#### Resources + | Resource | Quantity | + |--|--| + |cpu|63770m| + |ephemeral-storage|17Gi| + |memory|969016Mi| + |pods|51| + |vpc.amazonaws.com/efa|2| +## hpc7a Family +### `hpc7a.12xlarge` +#### Labels + | Label | Value | + |--|--| + |karpenter.k8s.aws/instance-category|hpc| + |karpenter.k8s.aws/instance-cpu|24| + |karpenter.k8s.aws/instance-cpu-manufacturer|amd| + |karpenter.k8s.aws/instance-ebs-bandwidth|2085| + |karpenter.k8s.aws/instance-encryption-in-transit-supported|true| + |karpenter.k8s.aws/instance-family|hpc7a| + |karpenter.k8s.aws/instance-generation|7| + |karpenter.k8s.aws/instance-hypervisor|nitro| + |karpenter.k8s.aws/instance-memory|786432| + |karpenter.k8s.aws/instance-network-bandwidth|300000| + |karpenter.k8s.aws/instance-size|12xlarge| + |kubernetes.io/arch|amd64| + |kubernetes.io/os|linux| + |node.kubernetes.io/instance-type|hpc7a.12xlarge| +#### Resources + | Resource | Quantity | + |--|--| + |cpu|23870m| + |ephemeral-storage|17Gi| + |memory|725994Mi| + |pods|100| + |vpc.amazonaws.com/efa|2| +### `hpc7a.24xlarge` +#### Labels + | Label | Value | + |--|--| + |karpenter.k8s.aws/instance-category|hpc| + |karpenter.k8s.aws/instance-cpu|48| + |karpenter.k8s.aws/instance-cpu-manufacturer|amd| + |karpenter.k8s.aws/instance-ebs-bandwidth|2085| + |karpenter.k8s.aws/instance-encryption-in-transit-supported|true| 
+ |karpenter.k8s.aws/instance-family|hpc7a| + |karpenter.k8s.aws/instance-generation|7| + |karpenter.k8s.aws/instance-hypervisor|nitro| + |karpenter.k8s.aws/instance-memory|786432| + |karpenter.k8s.aws/instance-network-bandwidth|300000| + |karpenter.k8s.aws/instance-size|24xlarge| + |kubernetes.io/arch|amd64| + |kubernetes.io/os|linux| + |node.kubernetes.io/instance-type|hpc7a.24xlarge| +#### Resources + | Resource | Quantity | + |--|--| + |cpu|47810m| + |ephemeral-storage|17Gi| + |memory|725994Mi| + |pods|100| + |vpc.amazonaws.com/efa|2| +### `hpc7a.48xlarge` +#### Labels + | Label | Value | + |--|--| + |karpenter.k8s.aws/instance-category|hpc| + |karpenter.k8s.aws/instance-cpu|96| + |karpenter.k8s.aws/instance-cpu-manufacturer|amd| + |karpenter.k8s.aws/instance-ebs-bandwidth|2085| + |karpenter.k8s.aws/instance-encryption-in-transit-supported|true| + |karpenter.k8s.aws/instance-family|hpc7a| + |karpenter.k8s.aws/instance-generation|7| + |karpenter.k8s.aws/instance-hypervisor|nitro| + |karpenter.k8s.aws/instance-memory|786432| + |karpenter.k8s.aws/instance-network-bandwidth|300000| + |karpenter.k8s.aws/instance-size|48xlarge| + |kubernetes.io/arch|amd64| + |kubernetes.io/os|linux| + |node.kubernetes.io/instance-type|hpc7a.48xlarge| +#### Resources + | Resource | Quantity | + |--|--| + |cpu|95690m| + |ephemeral-storage|17Gi| + |memory|725994Mi| + |pods|100| + |vpc.amazonaws.com/efa|2| +### `hpc7a.96xlarge` +#### Labels + | Label | Value | + |--|--| + |karpenter.k8s.aws/instance-category|hpc| + |karpenter.k8s.aws/instance-cpu|192| + |karpenter.k8s.aws/instance-cpu-manufacturer|amd| + |karpenter.k8s.aws/instance-ebs-bandwidth|2085| + |karpenter.k8s.aws/instance-encryption-in-transit-supported|true| + |karpenter.k8s.aws/instance-family|hpc7a| + |karpenter.k8s.aws/instance-generation|7| + |karpenter.k8s.aws/instance-hypervisor|nitro| + |karpenter.k8s.aws/instance-memory|786432| + |karpenter.k8s.aws/instance-network-bandwidth|300000| + 
|karpenter.k8s.aws/instance-size|96xlarge| + |kubernetes.io/arch|amd64| + |kubernetes.io/os|linux| + |node.kubernetes.io/instance-type|hpc7a.96xlarge| +#### Resources + | Resource | Quantity | + |--|--| + |cpu|191450m| + |ephemeral-storage|17Gi| + |memory|725994Mi| + |pods|100| + |vpc.amazonaws.com/efa|2| ## hpc7g Family ### `hpc7g.4xlarge` #### Labels @@ -8448,6 +8604,8 @@ below are the resources available with some assumptions and after the instance o | Resource | Quantity | |--|--| |aws.amazon.com/neuron|1| + |aws.amazon.com/neuroncore|4| + |aws.amazon.com/neurondevice|1| |cpu|3920m| |ephemeral-storage|17Gi| |memory|6804Mi| @@ -8478,6 +8636,8 @@ below are the resources available with some assumptions and after the instance o | Resource | Quantity | |--|--| |aws.amazon.com/neuron|1| + |aws.amazon.com/neuroncore|4| + |aws.amazon.com/neurondevice|1| |cpu|7910m| |ephemeral-storage|17Gi| |memory|14382Mi| @@ -8508,6 +8668,8 @@ below are the resources available with some assumptions and after the instance o | Resource | Quantity | |--|--| |aws.amazon.com/neuron|4| + |aws.amazon.com/neuroncore|4| + |aws.amazon.com/neurondevice|4| |cpu|23870m| |ephemeral-storage|17Gi| |memory|42536Mi| @@ -8538,6 +8700,8 @@ below are the resources available with some assumptions and after the instance o | Resource | Quantity | |--|--| |aws.amazon.com/neuron|16| + |aws.amazon.com/neuroncore|4| + |aws.amazon.com/neurondevice|16| |cpu|95690m| |ephemeral-storage|17Gi| |memory|177976Mi| @@ -8551,7 +8715,7 @@ below are the resources available with some assumptions and after the instance o |--|--| |karpenter.k8s.aws/instance-accelerator-count|1| |karpenter.k8s.aws/instance-accelerator-manufacturer|aws| - |karpenter.k8s.aws/instance-accelerator-name|inferentia| + |karpenter.k8s.aws/instance-accelerator-name|inferentia2| |karpenter.k8s.aws/instance-category|inf| |karpenter.k8s.aws/instance-cpu|4| |karpenter.k8s.aws/instance-cpu-manufacturer|amd| @@ -8570,6 +8734,8 @@ below are the resources 
available with some assumptions and after the instance o | Resource | Quantity | |--|--| |aws.amazon.com/neuron|1| + |aws.amazon.com/neuroncore|2| + |aws.amazon.com/neurondevice|1| |cpu|3920m| |ephemeral-storage|17Gi| |memory|14162Mi| @@ -8581,7 +8747,7 @@ below are the resources available with some assumptions and after the instance o |--|--| |karpenter.k8s.aws/instance-accelerator-count|1| |karpenter.k8s.aws/instance-accelerator-manufacturer|aws| - |karpenter.k8s.aws/instance-accelerator-name|inferentia| + |karpenter.k8s.aws/instance-accelerator-name|inferentia2| |karpenter.k8s.aws/instance-category|inf| |karpenter.k8s.aws/instance-cpu|32| |karpenter.k8s.aws/instance-cpu-manufacturer|amd| @@ -8600,6 +8766,8 @@ below are the resources available with some assumptions and after the instance o | Resource | Quantity | |--|--| |aws.amazon.com/neuron|1| + |aws.amazon.com/neuroncore|2| + |aws.amazon.com/neurondevice|1| |cpu|31850m| |ephemeral-storage|17Gi| |memory|118312Mi| @@ -8611,7 +8779,7 @@ below are the resources available with some assumptions and after the instance o |--|--| |karpenter.k8s.aws/instance-accelerator-count|6| |karpenter.k8s.aws/instance-accelerator-manufacturer|aws| - |karpenter.k8s.aws/instance-accelerator-name|inferentia| + |karpenter.k8s.aws/instance-accelerator-name|inferentia2| |karpenter.k8s.aws/instance-category|inf| |karpenter.k8s.aws/instance-cpu|96| |karpenter.k8s.aws/instance-cpu-manufacturer|amd| @@ -8630,6 +8798,8 @@ below are the resources available with some assumptions and after the instance o | Resource | Quantity | |--|--| |aws.amazon.com/neuron|6| + |aws.amazon.com/neuroncore|2| + |aws.amazon.com/neurondevice|6| |cpu|95690m| |ephemeral-storage|17Gi| |memory|355262Mi| @@ -8641,7 +8811,7 @@ below are the resources available with some assumptions and after the instance o |--|--| |karpenter.k8s.aws/instance-accelerator-count|12| |karpenter.k8s.aws/instance-accelerator-manufacturer|aws| - 
|karpenter.k8s.aws/instance-accelerator-name|inferentia| + |karpenter.k8s.aws/instance-accelerator-name|inferentia2| |karpenter.k8s.aws/instance-category|inf| |karpenter.k8s.aws/instance-cpu|192| |karpenter.k8s.aws/instance-cpu-manufacturer|amd| @@ -8660,6 +8830,8 @@ below are the resources available with some assumptions and after the instance o | Resource | Quantity | |--|--| |aws.amazon.com/neuron|12| + |aws.amazon.com/neuroncore|2| + |aws.amazon.com/neurondevice|12| |cpu|191450m| |ephemeral-storage|17Gi| |memory|718987Mi| @@ -14449,6 +14621,39 @@ below are the resources available with some assumptions and after the instance o |pods|100| |vpc.amazonaws.com/efa|32| |vpc.amazonaws.com/pod-eni|120| +## p5e Family +### `p5e.48xlarge` +#### Labels + | Label | Value | + |--|--| + |karpenter.k8s.aws/instance-category|p| + |karpenter.k8s.aws/instance-cpu|192| + |karpenter.k8s.aws/instance-cpu-manufacturer|amd| + |karpenter.k8s.aws/instance-ebs-bandwidth|80000| + |karpenter.k8s.aws/instance-encryption-in-transit-supported|true| + |karpenter.k8s.aws/instance-family|p5e| + |karpenter.k8s.aws/instance-generation|5| + |karpenter.k8s.aws/instance-gpu-count|8| + |karpenter.k8s.aws/instance-gpu-manufacturer|nvidia| + |karpenter.k8s.aws/instance-gpu-memory|144384| + |karpenter.k8s.aws/instance-gpu-name|h100| + |karpenter.k8s.aws/instance-hypervisor|nitro| + |karpenter.k8s.aws/instance-local-nvme|30400| + |karpenter.k8s.aws/instance-memory|2097152| + |karpenter.k8s.aws/instance-network-bandwidth|3200000| + |karpenter.k8s.aws/instance-size|48xlarge| + |kubernetes.io/arch|amd64| + |kubernetes.io/os|linux| + |node.kubernetes.io/instance-type|p5e.48xlarge| +#### Resources + | Resource | Quantity | + |--|--| + |cpu|191450m| + |ephemeral-storage|17Gi| + |memory|1938410Mi| + |nvidia.com/gpu|8| + |pods|100| + |vpc.amazonaws.com/efa|32| ## r3 Family ### `r3.large` #### Labels @@ -20568,7 +20773,7 @@ below are the resources available with some assumptions and after the instance o |--|--| 
|karpenter.k8s.aws/instance-accelerator-count|1| |karpenter.k8s.aws/instance-accelerator-manufacturer|aws| - |karpenter.k8s.aws/instance-accelerator-name|inferentia| + |karpenter.k8s.aws/instance-accelerator-name|trainium| |karpenter.k8s.aws/instance-category|trn| |karpenter.k8s.aws/instance-cpu|8| |karpenter.k8s.aws/instance-cpu-manufacturer|intel| @@ -20588,6 +20793,8 @@ below are the resources available with some assumptions and after the instance o | Resource | Quantity | |--|--| |aws.amazon.com/neuron|1| + |aws.amazon.com/neuroncore|2| + |aws.amazon.com/neurondevice|1| |cpu|7910m| |ephemeral-storage|17Gi| |memory|29317Mi| @@ -20599,7 +20806,7 @@ below are the resources available with some assumptions and after the instance o |--|--| |karpenter.k8s.aws/instance-accelerator-count|16| |karpenter.k8s.aws/instance-accelerator-manufacturer|aws| - |karpenter.k8s.aws/instance-accelerator-name|inferentia| + |karpenter.k8s.aws/instance-accelerator-name|trainium| |karpenter.k8s.aws/instance-category|trn| |karpenter.k8s.aws/instance-cpu|128| |karpenter.k8s.aws/instance-cpu-manufacturer|intel| @@ -20619,6 +20826,8 @@ below are the resources available with some assumptions and after the instance o | Resource | Quantity | |--|--| |aws.amazon.com/neuron|16| + |aws.amazon.com/neuroncore|2| + |aws.amazon.com/neurondevice|16| |cpu|127610m| |ephemeral-storage|17Gi| |memory|481894Mi| @@ -20632,7 +20841,7 @@ below are the resources available with some assumptions and after the instance o |--|--| |karpenter.k8s.aws/instance-accelerator-count|16| |karpenter.k8s.aws/instance-accelerator-manufacturer|aws| - |karpenter.k8s.aws/instance-accelerator-name|inferentia| + |karpenter.k8s.aws/instance-accelerator-name|trainium| |karpenter.k8s.aws/instance-category|trn| |karpenter.k8s.aws/instance-cpu|128| |karpenter.k8s.aws/instance-cpu-manufacturer|intel| @@ -20652,6 +20861,8 @@ below are the resources available with some assumptions and after the instance o | Resource | Quantity | |--|--| 
|aws.amazon.com/neuron|16| + |aws.amazon.com/neuroncore|2| + |aws.amazon.com/neurondevice|16| |cpu|127610m| |ephemeral-storage|17Gi| |memory|481894Mi| From 656b742b0513dc69947c947317ff45b0ff6b96ac Mon Sep 17 00:00:00 2001 From: Bryant Biggs Date: Wed, 11 Sep 2024 17:06:52 -0500 Subject: [PATCH 03/15] fix: Correct logic for calculating number of Neuron cores --- pkg/providers/instancetype/types.go | 7 ++++--- .../content/en/preview/reference/instance-types.md | 12 ++++++------ 2 files changed, 10 insertions(+), 9 deletions(-) diff --git a/pkg/providers/instancetype/types.go b/pkg/providers/instancetype/types.go index 3bf065b98563..fd0a97f12394 100644 --- a/pkg/providers/instancetype/types.go +++ b/pkg/providers/instancetype/types.go @@ -404,9 +404,10 @@ func amdGPUs(info *ec2.InstanceTypeInfo) *resource.Quantity { func awsNeuronCores(info *ec2.InstanceTypeInfo) *resource.Quantity { count := int64(0) if info.NeuronInfo != nil { - for _, device := range info.NeuronInfo.NeuronDevices { - count += *device.CoreInfo.Count - } + neuronDevice := info.NeuronInfo.NeuronDevices[0] + neuronCorePerDevice := neuronDevice.CoreInfo.Count + + count = *neuronDevice.Count * *neuronCorePerDevice } return resources.Quantity(fmt.Sprint(count)) } diff --git a/website/content/en/preview/reference/instance-types.md b/website/content/en/preview/reference/instance-types.md index 9f424cc84757..f59e2c6266fe 100644 --- a/website/content/en/preview/reference/instance-types.md +++ b/website/content/en/preview/reference/instance-types.md @@ -8668,7 +8668,7 @@ below are the resources available with some assumptions and after the instance o | Resource | Quantity | |--|--| |aws.amazon.com/neuron|4| - |aws.amazon.com/neuroncore|4| + |aws.amazon.com/neuroncore|16| |aws.amazon.com/neurondevice|4| |cpu|23870m| |ephemeral-storage|17Gi| @@ -8700,7 +8700,7 @@ below are the resources available with some assumptions and after the instance o | Resource | Quantity | |--|--| |aws.amazon.com/neuron|16| - 
|aws.amazon.com/neuroncore|4| + |aws.amazon.com/neuroncore|64| |aws.amazon.com/neurondevice|16| |cpu|95690m| |ephemeral-storage|17Gi| @@ -8798,7 +8798,7 @@ below are the resources available with some assumptions and after the instance o | Resource | Quantity | |--|--| |aws.amazon.com/neuron|6| - |aws.amazon.com/neuroncore|2| + |aws.amazon.com/neuroncore|12| |aws.amazon.com/neurondevice|6| |cpu|95690m| |ephemeral-storage|17Gi| @@ -8830,7 +8830,7 @@ below are the resources available with some assumptions and after the instance o | Resource | Quantity | |--|--| |aws.amazon.com/neuron|12| - |aws.amazon.com/neuroncore|2| + |aws.amazon.com/neuroncore|24| |aws.amazon.com/neurondevice|12| |cpu|191450m| |ephemeral-storage|17Gi| @@ -20826,7 +20826,7 @@ below are the resources available with some assumptions and after the instance o | Resource | Quantity | |--|--| |aws.amazon.com/neuron|16| - |aws.amazon.com/neuroncore|2| + |aws.amazon.com/neuroncore|32| |aws.amazon.com/neurondevice|16| |cpu|127610m| |ephemeral-storage|17Gi| @@ -20861,7 +20861,7 @@ below are the resources available with some assumptions and after the instance o | Resource | Quantity | |--|--| |aws.amazon.com/neuron|16| - |aws.amazon.com/neuroncore|2| + |aws.amazon.com/neuroncore|32| |aws.amazon.com/neurondevice|16| |cpu|127610m| |ephemeral-storage|17Gi| From 8e205a617fb61ae44009ea49dd759656ef92255c Mon Sep 17 00:00:00 2001 From: Bryant Biggs Date: Wed, 11 Sep 2024 17:21:21 -0500 Subject: [PATCH 04/15] fix: Ensure that instances with accelerators that are not Neuron are populated correctly --- pkg/providers/instancetype/types.go | 9 ++++++++- website/content/en/preview/reference/instance-types.md | 3 +++ 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/pkg/providers/instancetype/types.go b/pkg/providers/instancetype/types.go index fd0a97f12394..b8820d4a5cd2 100644 --- a/pkg/providers/instancetype/types.go +++ b/pkg/providers/instancetype/types.go @@ -250,6 +250,14 @@ func computeRequirements(info 
*ec2.InstanceTypeInfo, offerings cloudprovider.Off requirements.Get(v1.LabelInstanceGPUCount).Insert(fmt.Sprint(aws.Int64Value(gpu.Count))) requirements.Get(v1.LabelInstanceGPUMemory).Insert(fmt.Sprint(aws.Int64Value(gpu.MemoryInfo.SizeInMiB))) } + // Accelerators - excluding Neuron + if info.InferenceAcceleratorInfo != nil && len(info.InferenceAcceleratorInfo.Accelerators) == 1 && info.NeuronInfo == nil { + accelerator := info.InferenceAcceleratorInfo.Accelerators[0] + requirements.Get(v1.LabelInstanceAcceleratorName).Insert(lowerKabobCase(aws.StringValue(accelerator.Name))) + requirements.Get(v1.LabelInstanceAcceleratorManufacturer).Insert(lowerKabobCase(aws.StringValue(accelerator.Manufacturer))) + requirements.Get(v1.LabelInstanceAcceleratorCount).Insert(fmt.Sprint(aws.Int64Value(accelerator.Count))) + requirements.Get(v1.LabelInstanceAcceleratorMemory).Insert(fmt.Sprint(aws.Int64Value(info.InferenceAcceleratorInfo.TotalInferenceMemoryInMiB))) + } // Neuron if info.NeuronInfo != nil && len(info.NeuronInfo.NeuronDevices) == 1 { device := info.NeuronInfo.NeuronDevices[0] @@ -406,7 +414,6 @@ func awsNeuronCores(info *ec2.InstanceTypeInfo) *resource.Quantity { if info.NeuronInfo != nil { neuronDevice := info.NeuronInfo.NeuronDevices[0] neuronCorePerDevice := neuronDevice.CoreInfo.Count - count = *neuronDevice.Count * *neuronCorePerDevice } return resources.Quantity(fmt.Sprint(count)) diff --git a/website/content/en/preview/reference/instance-types.md b/website/content/en/preview/reference/instance-types.md index f59e2c6266fe..39ba7c96fd2f 100644 --- a/website/content/en/preview/reference/instance-types.md +++ b/website/content/en/preview/reference/instance-types.md @@ -5491,6 +5491,9 @@ below are the resources available with some assumptions and after the instance o #### Labels | Label | Value | |--|--| + |karpenter.k8s.aws/instance-accelerator-count|8| + |karpenter.k8s.aws/instance-accelerator-manufacturer|qualcomm| + 
|karpenter.k8s.aws/instance-accelerator-name|qualcomm-ai100| |karpenter.k8s.aws/instance-category|dl| |karpenter.k8s.aws/instance-cpu|96| |karpenter.k8s.aws/instance-cpu-manufacturer|intel| From 0e77584f09a3815e9fcac9b3bce2f51d2b00ca50 Mon Sep 17 00:00:00 2001 From: Bryant Biggs Date: Tue, 1 Oct 2024 12:47:46 -0500 Subject: [PATCH 05/15] fix: Remove support for `neurondevice` --- designs/limits.md | 1 - examples/workloads/neuron.yaml | 2 +- pkg/apis/v1/labels.go | 1 - pkg/providers/instance/instance.go | 1 - pkg/providers/instancetype/suite_test.go | 26 +++++++++---------- pkg/providers/instancetype/types.go | 1 - .../integration/extended_resources_test.go | 6 ++--- .../content/en/preview/concepts/scheduling.md | 1 - .../en/preview/reference/instance-types.md | 13 +--------- 9 files changed, 18 insertions(+), 34 deletions(-) diff --git a/designs/limits.md b/designs/limits.md index 8972ad5e17bc..dbc2b4376a94 100644 --- a/designs/limits.md +++ b/designs/limits.md @@ -68,7 +68,6 @@ The list of supported resource types is - - `amd.com/gpu` - `aws.amazon.com/neuron` - `aws.amazon.com/neuroncore` -- `aws.amazon.com/neurondevice` - `habana.ai/gaudi` Limits will be defined at the per-provisioner level. We'll rely on the `karpenter.sh/provisioner-name` node label when calculating resource usage by a specific provisioner. This is useful when multiple teams share a single cluster and use separate provisioners since each team's resource consumption will be limited separately. 
diff --git a/examples/workloads/neuron.yaml b/examples/workloads/neuron.yaml index 0902b2b5cdb3..9629eeaad0ba 100644 --- a/examples/workloads/neuron.yaml +++ b/examples/workloads/neuron.yaml @@ -21,7 +21,7 @@ spec: name: neuron resources: limits: - aws.amazon.com/neurondevice: "1" + aws.amazon.com/neuron: "1" requests: cpu: "1" memory: 256M diff --git a/pkg/apis/v1/labels.go b/pkg/apis/v1/labels.go index bf2dc3f494a5..fdc4c784dfc4 100644 --- a/pkg/apis/v1/labels.go +++ b/pkg/apis/v1/labels.go @@ -92,7 +92,6 @@ var ( ResourceAMDGPU corev1.ResourceName = "amd.com/gpu" ResourceAWSNeuron corev1.ResourceName = "aws.amazon.com/neuron" ResourceAWSNeuronCore corev1.ResourceName = "aws.amazon.com/neuroncore" - ResourceAWSNeuronDevice corev1.ResourceName = "aws.amazon.com/neurondevice" ResourceHabanaGaudi corev1.ResourceName = "habana.ai/gaudi" ResourceAWSPodENI corev1.ResourceName = "vpc.amazonaws.com/pod-eni" ResourcePrivateIPv4Address corev1.ResourceName = "vpc.amazonaws.com/PrivateIPv4Address" diff --git a/pkg/providers/instance/instance.go b/pkg/providers/instance/instance.go index 9e5a57ee7d46..06e7ec976d0f 100644 --- a/pkg/providers/instance/instance.go +++ b/pkg/providers/instance/instance.go @@ -460,7 +460,6 @@ func filterExoticInstanceTypes(instanceTypes []*cloudprovider.InstanceType) []*c } if !resources.IsZero(it.Capacity[v1.ResourceAWSNeuron]) || !resources.IsZero(it.Capacity[v1.ResourceAWSNeuronCore]) || - !resources.IsZero(it.Capacity[v1.ResourceAWSNeuronDevice]) || !resources.IsZero(it.Capacity[v1.ResourceAMDGPU]) || !resources.IsZero(it.Capacity[v1.ResourceNVIDIAGPU]) || !resources.IsZero(it.Capacity[v1.ResourceHabanaGaudi]) { diff --git a/pkg/providers/instancetype/suite_test.go b/pkg/providers/instancetype/suite_test.go index 6af4002d3ef0..d6057f8e8b22 100644 --- a/pkg/providers/instancetype/suite_test.go +++ b/pkg/providers/instancetype/suite_test.go @@ -759,28 +759,28 @@ var _ = Describe("InstanceTypeProvider", func() { } 
Expect(nodeNames.Len()).To(Equal(1)) }) - It("should launch instances for aws.amazon.com/neurondevice resource requests", func() { + It("should launch instances for aws.amazon.com/neuron resource requests", func() { nodeNames := sets.NewString() ExpectApplied(ctx, env.Client, nodePool, nodeClass) pods := []*corev1.Pod{ coretest.UnschedulablePod(coretest.PodOptions{ ResourceRequirements: corev1.ResourceRequirements{ - Requests: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("2")}, - Limits: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("2")}, + Requests: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("2")}, + Limits: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("2")}, }, }), // Should pack onto same instance coretest.UnschedulablePod(coretest.PodOptions{ ResourceRequirements: corev1.ResourceRequirements{ - Requests: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("2")}, - Limits: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("2")}, + Requests: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("2")}, + Limits: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("2")}, }, }), // Should pack onto a separate instance coretest.UnschedulablePod(coretest.PodOptions{ ResourceRequirements: corev1.ResourceRequirements{ - Requests: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("4")}, - Limits: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("4")}, + Requests: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("4")}, + Limits: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("4")}, }, }), } @@ -1909,15 +1909,15 @@ var _ = Describe("InstanceTypeProvider", func() { coretest.UnschedulablePod(coretest.PodOptions{ NodeSelector: map[string]string{corev1.LabelTopologyZone: "test-zone-1a"}, ResourceRequirements: corev1.ResourceRequirements{ - Requests: 
corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("1")}, - Limits: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("1")}, + Requests: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("1")}, + Limits: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("1")}, }, }), coretest.UnschedulablePod(coretest.PodOptions{ NodeSelector: map[string]string{corev1.LabelTopologyZone: "test-zone-1a"}, ResourceRequirements: corev1.ResourceRequirements{ - Requests: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("1")}, - Limits: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("1")}, + Requests: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("1")}, + Limits: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("1")}, }, }), } @@ -2002,8 +2002,8 @@ var _ = Describe("InstanceTypeProvider", func() { pod := coretest.UnschedulablePod(coretest.PodOptions{ NodeSelector: map[string]string{corev1.LabelInstanceTypeStable: "inf2.24xlarge"}, ResourceRequirements: corev1.ResourceRequirements{ - Requests: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("2")}, - Limits: corev1.ResourceList{v1.ResourceAWSNeuronDevice: resource.MustParse("2")}, + Requests: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("2")}, + Limits: corev1.ResourceList{v1.ResourceAWSNeuron: resource.MustParse("2")}, }, }) ExpectProvisioned(ctx, env.Client, cluster, cloudProvider, prov, pod) diff --git a/pkg/providers/instancetype/types.go b/pkg/providers/instancetype/types.go index b8820d4a5cd2..e2cced2f6cdf 100644 --- a/pkg/providers/instancetype/types.go +++ b/pkg/providers/instancetype/types.go @@ -314,7 +314,6 @@ func computeCapacity(ctx context.Context, info *ec2.InstanceTypeInfo, amiFamily v1.ResourceAMDGPU: *amdGPUs(info), v1.ResourceAWSNeuron: *awsNeuronDevices(info), v1.ResourceAWSNeuronCore: *awsNeuronCores(info), - v1.ResourceAWSNeuronDevice: *awsNeuronDevices(info), 
v1.ResourceHabanaGaudi: *habanaGaudis(info), v1.ResourceEFA: *efas(info), } diff --git a/test/suites/integration/extended_resources_test.go b/test/suites/integration/extended_resources_test.go index 0c3deee55749..5af6ce205d46 100644 --- a/test/suites/integration/extended_resources_test.go +++ b/test/suites/integration/extended_resources_test.go @@ -106,7 +106,7 @@ var _ = Describe("Extended Resources", func() { env.ExpectCreatedNodeCount("==", 1) env.EventuallyExpectInitializedNodeCount("==", 1) }) - It("should provision nodes for a deployment that requests aws.amazon.com/neurondevice", func() { + It("should provision nodes for a deployment that requests aws.amazon.com/neuron", func() { ExpectNeuronDevicePluginCreated() // TODO: jmdeal@ remove AL2 pin once AL2023 accelerated AMIs are available nodeClass.Spec.AMISelectorTerms = []v1.AMISelectorTerm{{Alias: "al2@latest"}} @@ -119,10 +119,10 @@ var _ = Describe("Extended Resources", func() { }, ResourceRequirements: corev1.ResourceRequirements{ Requests: corev1.ResourceList{ - "aws.amazon.com/neurondevice": resource.MustParse("1"), + "aws.amazon.com/neuron": resource.MustParse("1"), }, Limits: corev1.ResourceList{ - "aws.amazon.com/neurondevice": resource.MustParse("1"), + "aws.amazon.com/neuron": resource.MustParse("1"), }, }, }, diff --git a/website/content/en/preview/concepts/scheduling.md b/website/content/en/preview/concepts/scheduling.md index baaf9496fde5..161edc87a3b9 100755 --- a/website/content/en/preview/concepts/scheduling.md +++ b/website/content/en/preview/concepts/scheduling.md @@ -71,7 +71,6 @@ Accelerator (e.g., GPU) values include - `amd.com/gpu` - `aws.amazon.com/neuron` - `aws.amazon.com/neuroncore` -- `aws.amazon.com/neurondevice` - `habana.ai/gaudi` Karpenter supports accelerators, such as GPUs. 
diff --git a/website/content/en/preview/reference/instance-types.md b/website/content/en/preview/reference/instance-types.md index 39ba7c96fd2f..aa7fd6676342 100644 --- a/website/content/en/preview/reference/instance-types.md +++ b/website/content/en/preview/reference/instance-types.md @@ -8608,7 +8608,6 @@ below are the resources available with some assumptions and after the instance o |--|--| |aws.amazon.com/neuron|1| |aws.amazon.com/neuroncore|4| - |aws.amazon.com/neurondevice|1| |cpu|3920m| |ephemeral-storage|17Gi| |memory|6804Mi| @@ -8640,7 +8639,6 @@ below are the resources available with some assumptions and after the instance o |--|--| |aws.amazon.com/neuron|1| |aws.amazon.com/neuroncore|4| - |aws.amazon.com/neurondevice|1| |cpu|7910m| |ephemeral-storage|17Gi| |memory|14382Mi| @@ -8672,7 +8670,6 @@ below are the resources available with some assumptions and after the instance o |--|--| |aws.amazon.com/neuron|4| |aws.amazon.com/neuroncore|16| - |aws.amazon.com/neurondevice|4| |cpu|23870m| |ephemeral-storage|17Gi| |memory|42536Mi| @@ -8704,7 +8701,6 @@ below are the resources available with some assumptions and after the instance o |--|--| |aws.amazon.com/neuron|16| |aws.amazon.com/neuroncore|64| - |aws.amazon.com/neurondevice|16| |cpu|95690m| |ephemeral-storage|17Gi| |memory|177976Mi| @@ -8738,7 +8734,6 @@ below are the resources available with some assumptions and after the instance o |--|--| |aws.amazon.com/neuron|1| |aws.amazon.com/neuroncore|2| - |aws.amazon.com/neurondevice|1| |cpu|3920m| |ephemeral-storage|17Gi| |memory|14162Mi| @@ -8770,7 +8765,6 @@ below are the resources available with some assumptions and after the instance o |--|--| |aws.amazon.com/neuron|1| |aws.amazon.com/neuroncore|2| - |aws.amazon.com/neurondevice|1| |cpu|31850m| |ephemeral-storage|17Gi| |memory|118312Mi| @@ -8802,7 +8796,6 @@ below are the resources available with some assumptions and after the instance o |--|--| |aws.amazon.com/neuron|6| |aws.amazon.com/neuroncore|12| - 
|aws.amazon.com/neurondevice|6| |cpu|95690m| |ephemeral-storage|17Gi| |memory|355262Mi| @@ -8834,7 +8827,6 @@ below are the resources available with some assumptions and after the instance o |--|--| |aws.amazon.com/neuron|12| |aws.amazon.com/neuroncore|24| - |aws.amazon.com/neurondevice|12| |cpu|191450m| |ephemeral-storage|17Gi| |memory|718987Mi| @@ -14639,7 +14631,7 @@ below are the resources available with some assumptions and after the instance o |karpenter.k8s.aws/instance-gpu-count|8| |karpenter.k8s.aws/instance-gpu-manufacturer|nvidia| |karpenter.k8s.aws/instance-gpu-memory|144384| - |karpenter.k8s.aws/instance-gpu-name|h100| + |karpenter.k8s.aws/instance-gpu-name|h200| |karpenter.k8s.aws/instance-hypervisor|nitro| |karpenter.k8s.aws/instance-local-nvme|30400| |karpenter.k8s.aws/instance-memory|2097152| @@ -20797,7 +20789,6 @@ below are the resources available with some assumptions and after the instance o |--|--| |aws.amazon.com/neuron|1| |aws.amazon.com/neuroncore|2| - |aws.amazon.com/neurondevice|1| |cpu|7910m| |ephemeral-storage|17Gi| |memory|29317Mi| @@ -20830,7 +20821,6 @@ below are the resources available with some assumptions and after the instance o |--|--| |aws.amazon.com/neuron|16| |aws.amazon.com/neuroncore|32| - |aws.amazon.com/neurondevice|16| |cpu|127610m| |ephemeral-storage|17Gi| |memory|481894Mi| @@ -20865,7 +20855,6 @@ below are the resources available with some assumptions and after the instance o |--|--| |aws.amazon.com/neuron|16| |aws.amazon.com/neuroncore|32| - |aws.amazon.com/neurondevice|16| |cpu|127610m| |ephemeral-storage|17Gi| |memory|481894Mi| From 8b32323df730d98578f928a285760b48b7004c45 Mon Sep 17 00:00:00 2001 From: Bryant Biggs Date: Tue, 1 Oct 2024 13:08:31 -0500 Subject: [PATCH 06/15] test: Add test case for `neuroncore` --- .../integration/extended_resources_test.go | 33 +++++++++++++++++++ 1 file changed, 33 insertions(+) diff --git a/test/suites/integration/extended_resources_test.go 
b/test/suites/integration/extended_resources_test.go index 5af6ce205d46..ab2f04d6b1d6 100644 --- a/test/suites/integration/extended_resources_test.go +++ b/test/suites/integration/extended_resources_test.go @@ -139,6 +139,39 @@ var _ = Describe("Extended Resources", func() { env.ExpectCreatedNodeCount("==", 1) env.EventuallyExpectInitializedNodeCount("==", 1) }) + It("should provision nodes for a deployment that requests aws.amazon.com/neuroncore", func() { + ExpectNeuronDevicePluginCreated() + // TODO: jmdeal@ remove AL2 pin once AL2023 accelerated AMIs are available + nodeClass.Spec.AMISelectorTerms = []v1.AMISelectorTerm{{Alias: "al2@latest"}} + numPods := 1 + dep := test.Deployment(test.DeploymentOptions{ + Replicas: int32(numPods), + PodOptions: test.PodOptions{ + ObjectMeta: metav1.ObjectMeta{ + Labels: map[string]string{"app": "large-app"}, + }, + ResourceRequirements: corev1.ResourceRequirements{ + Requests: corev1.ResourceList{ + "aws.amazon.com/neuroncore": resource.MustParse("2"), + }, + Limits: corev1.ResourceList{ + "aws.amazon.com/neuroncore": resource.MustParse("2"), + }, + }, + }, + }) + selector := labels.SelectorFromSet(dep.Spec.Selector.MatchLabels) + test.ReplaceRequirements(nodePool, karpv1.NodeSelectorRequirementWithMinValues{ + NodeSelectorRequirement: corev1.NodeSelectorRequirement{ + Key: v1.LabelInstanceCategory, + Operator: corev1.NodeSelectorOpExists, + }, + }) + env.ExpectCreated(nodeClass, nodePool, dep) + env.EventuallyExpectHealthyPodCount(selector, numPods) + env.ExpectCreatedNodeCount("==", 1) + env.EventuallyExpectInitializedNodeCount("==", 1) + }) It("should provision nodes for a deployment that requests vpc.amazonaws.com/pod-eni (security groups for pods)", func() { env.ExpectPodENIEnabled() DeferCleanup(func() { From 9b6954603dac63fe7f9f9fffc5fbd29f2f0c8996 Mon Sep 17 00:00:00 2001 From: Bryant Biggs Date: Wed, 2 Oct 2024 11:32:54 -0500 Subject: [PATCH 07/15] fix: Revert support for `/instance-accelerator-memory` label --- 
hack/code/instancetype_testdata_gen/main.go | 1 - pkg/apis/v1/labels.go | 2 -- pkg/fake/zz_generated.describe_instance_types.go | 3 --- pkg/providers/instancetype/suite_test.go | 3 --- pkg/providers/instancetype/types.go | 2 -- test/suites/scheduling/suite_test.go | 1 - 6 files changed, 12 deletions(-) diff --git a/hack/code/instancetype_testdata_gen/main.go b/hack/code/instancetype_testdata_gen/main.go index 9aed9c65d376..0debaccd22c6 100644 --- a/hack/code/instancetype_testdata_gen/main.go +++ b/hack/code/instancetype_testdata_gen/main.go @@ -154,7 +154,6 @@ func getInstanceTypeInfo(info *ec2.InstanceTypeInfo) string { fmt.Fprintf(src, getNeuronDeviceInfo(elem)) } fmt.Fprintf(src, "},\n") - fmt.Fprintf(src, "TotalNeuronDeviceMemoryInMiB: aws.Int64(%d),\n", lo.FromPtr(info.NeuronInfo.TotalNeuronDeviceMemoryInMiB)) fmt.Fprintf(src, "},\n") } if info.GpuInfo != nil { diff --git a/pkg/apis/v1/labels.go b/pkg/apis/v1/labels.go index fdc4c784dfc4..9bf39a97054e 100644 --- a/pkg/apis/v1/labels.go +++ b/pkg/apis/v1/labels.go @@ -48,7 +48,6 @@ func init() { LabelInstanceAcceleratorName, LabelInstanceAcceleratorManufacturer, LabelInstanceAcceleratorCount, - LabelInstanceAcceleratorMemory, LabelTopologyZoneID, corev1.LabelWindowsBuild, ) @@ -122,7 +121,6 @@ var ( LabelInstanceAcceleratorName = apis.Group + "/instance-accelerator-name" LabelInstanceAcceleratorManufacturer = apis.Group + "/instance-accelerator-manufacturer" LabelInstanceAcceleratorCount = apis.Group + "/instance-accelerator-count" - LabelInstanceAcceleratorMemory = apis.Group + "/instance-accelerator-memory" AnnotationEC2NodeClassHash = apis.Group + "/ec2nodeclass-hash" AnnotationClusterNameTaggedCompatability = apis.CompatibilityGroup + "/cluster-name-tagged" AnnotationEC2NodeClassHashVersion = apis.Group + "/ec2nodeclass-hash-version" diff --git a/pkg/fake/zz_generated.describe_instance_types.go b/pkg/fake/zz_generated.describe_instance_types.go index f9aa24d44b35..d1a1dca2114a 100644 --- 
a/pkg/fake/zz_generated.describe_instance_types.go +++ b/pkg/fake/zz_generated.describe_instance_types.go @@ -311,7 +311,6 @@ var defaultDescribeInstanceTypesOutput = &ec2.DescribeInstanceTypesOutput{ }, }, }, - TotalNeuronDeviceMemoryInMiB: aws.Int64(196608), }, NetworkInfo: &ec2.NetworkInfo{ MaximumNetworkInterfaces: aws.Int64(15), @@ -371,7 +370,6 @@ var defaultDescribeInstanceTypesOutput = &ec2.DescribeInstanceTypesOutput{ }, }, }, - TotalNeuronDeviceMemoryInMiB: aws.Int64(32768), }, NetworkInfo: &ec2.NetworkInfo{ MaximumNetworkInterfaces: aws.Int64(4), @@ -849,7 +847,6 @@ var defaultDescribeInstanceTypesOutput = &ec2.DescribeInstanceTypesOutput{ }, }, }, - TotalNeuronDeviceMemoryInMiB: aws.Int64(32768), }, InstanceStorageInfo: &ec2.InstanceStorageInfo{NvmeSupport: aws.String("required"), TotalSizeInGB: aws.Int64(474), diff --git a/pkg/providers/instancetype/suite_test.go b/pkg/providers/instancetype/suite_test.go index d6057f8e8b22..d397a6794f37 100644 --- a/pkg/providers/instancetype/suite_test.go +++ b/pkg/providers/instancetype/suite_test.go @@ -247,7 +247,6 @@ var _ = Describe("InstanceTypeProvider", func() { v1.LabelInstanceAcceleratorName: "inferentia2", v1.LabelInstanceAcceleratorManufacturer: "aws", v1.LabelInstanceAcceleratorCount: "1", - v1.LabelInstanceAcceleratorMemory: "32768", v1.LabelTopologyZoneID: "tstz1-1a", // Deprecated Labels corev1.LabelFailureDomainBetaRegion: fake.DefaultRegion, @@ -317,7 +316,6 @@ var _ = Describe("InstanceTypeProvider", func() { v1.LabelInstanceAcceleratorCount, v1.LabelInstanceAcceleratorName, v1.LabelInstanceAcceleratorManufacturer, - v1.LabelInstanceAcceleratorMemory, corev1.LabelWindowsBuild, )).UnsortedList(), lo.Keys(karpv1.NormalizedLabels)...))) @@ -352,7 +350,6 @@ var _ = Describe("InstanceTypeProvider", func() { v1.LabelInstanceAcceleratorName: "inferentia2", v1.LabelInstanceAcceleratorManufacturer: "aws", v1.LabelInstanceAcceleratorCount: "1", - v1.LabelInstanceAcceleratorMemory: "32768", 
v1.LabelTopologyZoneID: "tstz1-1a", // Deprecated Labels corev1.LabelFailureDomainBetaRegion: fake.DefaultRegion, diff --git a/pkg/providers/instancetype/types.go b/pkg/providers/instancetype/types.go index e2cced2f6cdf..3d1496df8b6a 100644 --- a/pkg/providers/instancetype/types.go +++ b/pkg/providers/instancetype/types.go @@ -256,7 +256,6 @@ func computeRequirements(info *ec2.InstanceTypeInfo, offerings cloudprovider.Off requirements.Get(v1.LabelInstanceAcceleratorName).Insert(lowerKabobCase(aws.StringValue(accelerator.Name))) requirements.Get(v1.LabelInstanceAcceleratorManufacturer).Insert(lowerKabobCase(aws.StringValue(accelerator.Manufacturer))) requirements.Get(v1.LabelInstanceAcceleratorCount).Insert(fmt.Sprint(aws.Int64Value(accelerator.Count))) - requirements.Get(v1.LabelInstanceAcceleratorMemory).Insert(fmt.Sprint(aws.Int64Value(info.InferenceAcceleratorInfo.TotalInferenceMemoryInMiB))) } // Neuron if info.NeuronInfo != nil && len(info.NeuronInfo.NeuronDevices) == 1 { @@ -264,7 +263,6 @@ func computeRequirements(info *ec2.InstanceTypeInfo, offerings cloudprovider.Off requirements.Get(v1.LabelInstanceAcceleratorName).Insert(lowerKabobCase(aws.StringValue(device.Name))) requirements.Get(v1.LabelInstanceAcceleratorManufacturer).Insert(lowerKabobCase("aws")) requirements.Get(v1.LabelInstanceAcceleratorCount).Insert(fmt.Sprint(aws.Int64Value(device.Count))) - requirements.Get(v1.LabelInstanceAcceleratorMemory).Insert(fmt.Sprint(aws.Int64Value(info.NeuronInfo.TotalNeuronDeviceMemoryInMiB))) } // Windows Build Version Labels if family, ok := amiFamily.(*amifamily.Windows); ok { diff --git a/test/suites/scheduling/suite_test.go b/test/suites/scheduling/suite_test.go index d4a704db1bd9..73315a3f8c2b 100644 --- a/test/suites/scheduling/suite_test.go +++ b/test/suites/scheduling/suite_test.go @@ -262,7 +262,6 @@ var _ = Describe("Scheduling", Ordered, ContinueOnFailure, func() { It("should support well-known labels for an accelerator (inferentia2)", func() { 
nodeSelector := map[string]string{ v1.LabelInstanceAcceleratorName: "inferentia", - v1.LabelInstanceAcceleratorMemory: "32768", v1.LabelInstanceAcceleratorManufacturer: "aws", v1.LabelInstanceAcceleratorCount: "1", } From 3d294c98cfeca066593ac5855f43131a76f4c34d Mon Sep 17 00:00:00 2001 From: Bryant Biggs Date: Wed, 9 Oct 2024 15:14:14 -0500 Subject: [PATCH 08/15] fix: Add e2e test requirements replacement for inf/trn family generation --- .../integration/extended_resources_test.go | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/test/suites/integration/extended_resources_test.go b/test/suites/integration/extended_resources_test.go index ab2f04d6b1d6..92e8603ff66c 100644 --- a/test/suites/integration/extended_resources_test.go +++ b/test/suites/integration/extended_resources_test.go @@ -134,6 +134,13 @@ var _ = Describe("Extended Resources", func() { Operator: corev1.NodeSelectorOpExists, }, }) + test.ReplaceRequirements(nodePool, karpv1.NodeSelectorRequirementWithMinValues{ + NodeSelectorRequirement: corev1.NodeSelectorRequirement{ + Key: v1.LabelInstanceGeneration, + Operator: corev1.NodeSelectorOpIn, + Values: []string{"1", "2"}, + }, + }) env.ExpectCreated(nodeClass, nodePool, dep) env.EventuallyExpectHealthyPodCount(selector, numPods) env.ExpectCreatedNodeCount("==", 1) @@ -167,6 +174,13 @@ var _ = Describe("Extended Resources", func() { Operator: corev1.NodeSelectorOpExists, }, }) + test.ReplaceRequirements(nodePool, karpv1.NodeSelectorRequirementWithMinValues{ + NodeSelectorRequirement: corev1.NodeSelectorRequirement{ + Key: v1.LabelInstanceGeneration, + Operator: corev1.NodeSelectorOpIn, + Values: []string{"1", "2"}, + }, + }) env.ExpectCreated(nodeClass, nodePool, dep) env.EventuallyExpectHealthyPodCount(selector, numPods) env.ExpectCreatedNodeCount("==", 1) @@ -430,7 +444,7 @@ func ExpectNeuronDevicePluginCreated() { Containers: []corev1.Container{ { Name: "neuron-device-plugin", - Image: 
"public.ecr.aws/neuron/neuron-device-plugin:2.19.16.0", + Image: "public.ecr.aws/neuron/neuron-device-plugin:2.22.4.0", Env: []corev1.EnvVar{ { Name: "KUBECONFIG", From c5f3b5b8b81fe5336cecdf49f69cd4fb27173446 Mon Sep 17 00:00:00 2001 From: Bryant Biggs Date: Thu, 17 Oct 2024 12:41:22 -0500 Subject: [PATCH 09/15] chore: Get make commands happy --- .../templates/karpenter.k8s.aws_ec2nodeclasses.yaml | 2 +- pkg/apis/crds/karpenter.k8s.aws_ec2nodeclasses.yaml | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/charts/karpenter-crd/templates/karpenter.k8s.aws_ec2nodeclasses.yaml b/charts/karpenter-crd/templates/karpenter.k8s.aws_ec2nodeclasses.yaml index 47901f77f660..abd370251f5c 100644 --- a/charts/karpenter-crd/templates/karpenter.k8s.aws_ec2nodeclasses.yaml +++ b/charts/karpenter-crd/templates/karpenter.k8s.aws_ec2nodeclasses.yaml @@ -3,7 +3,7 @@ apiVersion: apiextensions.k8s.io/v1 kind: CustomResourceDefinition metadata: annotations: - controller-gen.kubebuilder.io/version: v0.16.3 + controller-gen.kubebuilder.io/version: v0.16.4 name: ec2nodeclasses.karpenter.k8s.aws spec: group: karpenter.k8s.aws diff --git a/pkg/apis/crds/karpenter.k8s.aws_ec2nodeclasses.yaml b/pkg/apis/crds/karpenter.k8s.aws_ec2nodeclasses.yaml index 47901f77f660..abd370251f5c 100644 --- a/pkg/apis/crds/karpenter.k8s.aws_ec2nodeclasses.yaml +++ b/pkg/apis/crds/karpenter.k8s.aws_ec2nodeclasses.yaml @@ -3,7 +3,7 @@ apiVersion: apiextensions.k8s.io/v1 kind: CustomResourceDefinition metadata: annotations: - controller-gen.kubebuilder.io/version: v0.16.3 + controller-gen.kubebuilder.io/version: v0.16.4 name: ec2nodeclasses.karpenter.k8s.aws spec: group: karpenter.k8s.aws From dc2aa6612ac0081c893b323aed30ab5a98a9f40d Mon Sep 17 00:00:00 2001 From: Bryant Biggs Date: Fri, 18 Oct 2024 16:41:49 -0500 Subject: [PATCH 10/15] fix: Add Neuron scheduler and RBAC permissions for Neuron device extended resource test --- .../integration/extended_resources_test.go | 432 +++++++++++++++++- 1 
file changed, 428 insertions(+), 4 deletions(-) diff --git a/test/suites/integration/extended_resources_test.go b/test/suites/integration/extended_resources_test.go index 92e8603ff66c..5516963279aa 100644 --- a/test/suites/integration/extended_resources_test.go +++ b/test/suites/integration/extended_resources_test.go @@ -22,9 +22,11 @@ import ( "github.com/samber/lo" appsv1 "k8s.io/api/apps/v1" corev1 "k8s.io/api/core/v1" + rbacv1 "k8s.io/api/rbac/v1" "k8s.io/apimachinery/pkg/api/resource" metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" "k8s.io/apimachinery/pkg/labels" + "k8s.io/apimachinery/pkg/util/intstr" "sigs.k8s.io/karpenter/pkg/test" @@ -412,15 +414,89 @@ func ExpectNvidiaDevicePluginCreated() { // https://github.com/aws-neuron/aws-neuron-sdk/blob/master/src/k8/k8s-neuron-device-plugin.yml func ExpectNeuronDevicePluginCreated() { GinkgoHelper() + + // When selecting more than 1 neuron/neuroncore but less than ALL of the neuron/neuroncores on the instance, + // you must use the Neuron scheduler to schedule neuron/neuroncores in a contiguous manner. 
+ // https://awsdocs-neuron.readthedocs-hosted.com/en/latest/containers/kubernetes-getting-started.html#neuron-scheduler-extension + ExpectK8sNeuronSchedulerCreated() + ExpectNeuronSchedulerExtensionCreated() + + neuronDevicePlugin := "neuron-device-plugin" + + env.ExpectCreatedOrUpdated(&rbacv1.ClusterRole{ + ObjectMeta: metav1.ObjectMeta{ + Name: neuronDevicePlugin, + }, + Rules: []rbacv1.PolicyRule{ + // Device plugin + { + APIGroups: []string{""}, + Resources: []string{"nodes"}, + Verbs: []string{"get", "list", "watch"}, + }, + { + APIGroups: []string{""}, + Resources: []string{"events"}, + Verbs: []string{"create", "patch"}, + }, + { + APIGroups: []string{""}, + Resources: []string{"pods"}, + Verbs: []string{"update", "patch", "get", "list", "watch"}, + }, + { + APIGroups: []string{""}, + Resources: []string{"nodes/status"}, + Verbs: []string{"update", "patch"}, + }, + // Scheduler + { + APIGroups: []string{""}, + Resources: []string{"configmaps"}, + Verbs: []string{"get", "list", "watch"}, + }, + { + APIGroups: []string{"coordination.k8s.io"}, + Resources: []string{"leases"}, + Verbs: []string{"create", "get", "list", "update"}, + }, + }, + }) + + env.ExpectCreatedOrUpdated(&rbacv1.ClusterRoleBinding{ + ObjectMeta: metav1.ObjectMeta{ + Name: neuronDevicePlugin, + }, + RoleRef: rbacv1.RoleRef{ + APIGroup: rbacv1.GroupName, + Kind: "ClusterRole", + Name: neuronDevicePlugin, + }, + Subjects: []rbacv1.Subject{ + { + Kind: "ServiceAccount", + Name: neuronDevicePlugin, + Namespace: "kube-system", + }, + }, + }) + + env.ExpectCreatedOrUpdated(&corev1.ServiceAccount{ + ObjectMeta: metav1.ObjectMeta{ + Name: neuronDevicePlugin, + Namespace: "kube-system", + }, + }) + env.ExpectCreated(&appsv1.DaemonSet{ ObjectMeta: test.ObjectMeta(metav1.ObjectMeta{ - Name: "nvidia-device-plugin-daemonset", + Name: neuronDevicePlugin, Namespace: "kube-system", }), Spec: appsv1.DaemonSetSpec{ Selector: &metav1.LabelSelector{ MatchLabels: map[string]string{ - "name": 
"neuron-device-plugin-ds", + "name": neuronDevicePlugin, }, }, UpdateStrategy: appsv1.DaemonSetUpdateStrategy{ @@ -429,10 +505,11 @@ func ExpectNeuronDevicePluginCreated() { Template: corev1.PodTemplateSpec{ ObjectMeta: test.ObjectMeta(metav1.ObjectMeta{ Labels: map[string]string{ - "name": "neuron-device-plugin-ds", + "name": neuronDevicePlugin, }, }), Spec: corev1.PodSpec{ + ServiceAccountName: neuronDevicePlugin, Tolerations: []corev1.Toleration{ { Key: "aws.amazon.com/neuron", @@ -443,7 +520,7 @@ func ExpectNeuronDevicePluginCreated() { PriorityClassName: "system-node-critical", Containers: []corev1.Container{ { - Name: "neuron-device-plugin", + Name: neuronDevicePlugin, Image: "public.ecr.aws/neuron/neuron-device-plugin:2.22.4.0", Env: []corev1.EnvVar{ { @@ -501,6 +578,353 @@ func ExpectNeuronDevicePluginCreated() { }) } +// https://github.com/aws-neuron/aws-neuron-sdk/blob/master/src/k8/k8s-neuron-scheduler-eks.yml +func ExpectK8sNeuronSchedulerCreated() { + GinkgoHelper() + + k8sNeuronScheduler := "k8s-neuron-scheduler" + + env.ExpectCreatedOrUpdated(&corev1.ServiceAccount{ + ObjectMeta: metav1.ObjectMeta{ + Name: k8sNeuronScheduler, + Namespace: "kube-system", + }, + }) + + env.ExpectCreatedOrUpdated(&rbacv1.ClusterRole{ + ObjectMeta: metav1.ObjectMeta{ + Name: k8sNeuronScheduler, + }, + Rules: []rbacv1.PolicyRule{ + { + APIGroups: []string{""}, + Resources: []string{"nodes"}, + Verbs: []string{"get", "list", "watch"}, + }, + { + APIGroups: []string{""}, + Resources: []string{"node/status"}, + Verbs: []string{"update", "patch", "get", "list", "watch"}, + }, + { + APIGroups: []string{""}, + Resources: []string{"events"}, + Verbs: []string{"create", "patch"}, + }, + { + APIGroups: []string{""}, + Resources: []string{"pods"}, + Verbs: []string{"update", "patch", "get", "list", "watch"}, + }, + { + APIGroups: []string{""}, + Resources: []string{"bindings", "pods/bindings"}, + Verbs: []string{"create"}, + }, + }, + }) + + 
env.ExpectCreatedOrUpdated(&rbacv1.ClusterRoleBinding{ + ObjectMeta: metav1.ObjectMeta{ + Name: k8sNeuronScheduler, + }, + RoleRef: rbacv1.RoleRef{ + APIGroup: rbacv1.GroupName, + Kind: "ClusterRole", + Name: k8sNeuronScheduler, + }, + Subjects: []rbacv1.Subject{ + { + Kind: "ServiceAccount", + Name: k8sNeuronScheduler, + Namespace: "kube-system", + }, + }, + }) + + env.ExpectCreatedOrUpdated(&corev1.Service{ + ObjectMeta: test.ObjectMeta(metav1.ObjectMeta{ + Name: k8sNeuronScheduler, + Namespace: "kube-system", + }), + Spec: corev1.ServiceSpec{ + Selector: map[string]string{ + "app": k8sNeuronScheduler, + }, + Ports: []corev1.ServicePort{ + { + Name: "http", + Port: 12345, + TargetPort: intstr.FromInt(12345), + }, + }, + }, + }) + + replicas := int32(1) + + env.ExpectCreatedOrUpdated(&appsv1.Deployment{ + ObjectMeta: test.ObjectMeta(metav1.ObjectMeta{ + Name: k8sNeuronScheduler, + Namespace: "kube-system", + }), + Spec: appsv1.DeploymentSpec{ + Replicas: &replicas, + Strategy: appsv1.DeploymentStrategy{ + Type: appsv1.RecreateDeploymentStrategyType, + }, + Selector: &metav1.LabelSelector{ + MatchLabels: map[string]string{ + "app": k8sNeuronScheduler, + }, + }, + Template: corev1.PodTemplateSpec{ + ObjectMeta: test.ObjectMeta(metav1.ObjectMeta{ + Labels: map[string]string{ + "app": k8sNeuronScheduler, + }, + Annotations: map[string]string{ + "scheduler.alpha.kubernetes.io/critical-pod": "", + }, + }), + Spec: corev1.PodSpec{ + ServiceAccountName: k8sNeuronScheduler, + PriorityClassName: "system-node-critical", + SchedulerName: k8sNeuronScheduler, + Tolerations: []corev1.Toleration{ + { + Key: "CriticalAddonsOnly", + Operator: corev1.TolerationOpExists, + Effect: corev1.TaintEffectNoSchedule, + }, + }, + Containers: []corev1.Container{ + { + Name: k8sNeuronScheduler, + Image: "public.ecr.aws/neuron/neuron-scheduler:2.22.4.0", + Ports: []corev1.ContainerPort{ + { + Name: "http", + ContainerPort: 12345, + }, + }, + Env: []corev1.EnvVar{ + { + Name: "PORT", + Value: 
"12345", + }, + }, + }, + }, + }, + }, + }, + }) +} + +// https://github.com/aws-neuron/aws-neuron-sdk/blob/master/src/k8/my-scheduler.yml +func ExpectNeuronSchedulerExtensionCreated() { + GinkgoHelper() + + neuronSchedulerExtension := "neuron-scheduler-ext" + + env.ExpectCreatedOrUpdated(&corev1.ServiceAccount{ + ObjectMeta: metav1.ObjectMeta{ + Name: neuronSchedulerExtension, + Namespace: "kube-system", + }, + }) + + env.ExpectCreatedOrUpdated(&rbacv1.ClusterRole{ + ObjectMeta: metav1.ObjectMeta{ + Name: neuronSchedulerExtension, + }, + Rules: []rbacv1.PolicyRule{ + { + APIGroups: []string{""}, + Resources: []string{"configmaps"}, + Verbs: []string{"get", "list", "watch"}, + }, + { + APIGroups: []string{"coordination.k8s.io"}, + Resources: []string{"leases"}, + Verbs: []string{"create", "get", "list", "update"}, + }, + }, + }) + + env.ExpectCreatedOrUpdated(&rbacv1.ClusterRoleBinding{ + ObjectMeta: metav1.ObjectMeta{ + Name: fmt.Sprintf("%s-kube-scheduler", neuronSchedulerExtension), + }, + Subjects: []rbacv1.Subject{ + { + Kind: "ServiceAccount", + Name: neuronSchedulerExtension, + Namespace: "kube-system", + }, + }, + RoleRef: rbacv1.RoleRef{ + APIGroup: rbacv1.GroupName, + Kind: "ClusterRole", + Name: "system:kube-scheduler", + }, + }) + env.ExpectCreatedOrUpdated(&rbacv1.ClusterRoleBinding{ + ObjectMeta: metav1.ObjectMeta{ + Name: fmt.Sprintf("%s-volume-scheduler", neuronSchedulerExtension), + }, + Subjects: []rbacv1.Subject{ + { + Kind: "ServiceAccount", + Name: neuronSchedulerExtension, + Namespace: "kube-system", + }, + }, + RoleRef: rbacv1.RoleRef{ + APIGroup: rbacv1.GroupName, + Kind: "ClusterRole", + Name: "system:volume-scheduler", + }, + }) + env.ExpectCreatedOrUpdated(&rbacv1.ClusterRoleBinding{ + ObjectMeta: metav1.ObjectMeta{ + Name: neuronSchedulerExtension, + }, + Subjects: []rbacv1.Subject{ + { + Kind: "ServiceAccount", + Name: neuronSchedulerExtension, + Namespace: "kube-system", + }, + }, + RoleRef: rbacv1.RoleRef{ + APIGroup: 
rbacv1.GroupName, + Kind: "ClusterRole", + Name: neuronSchedulerExtension, + }, + }) + + env.ExpectCreatedOrUpdated(&corev1.ConfigMap{ + ObjectMeta: test.ObjectMeta(metav1.ObjectMeta{ + Name: fmt.Sprintf("%s-config", neuronSchedulerExtension), + Namespace: "kube-system", + }), + Data: map[string]string{ + fmt.Sprintf("%s-config.yaml", neuronSchedulerExtension): fmt.Sprintf(`apiVersion: kubescheduler.config.k8s.io/v1 +kind: KubeSchedulerConfiguration +profiles: + - schedulerName: %[1]v +extenders: + - urlPrefix: 'http://k8s-neuron-scheduler.kube-system.svc.cluster.local:12345' + filterVerb: filter + bindVerb: bind + enableHTTPS: false + nodeCacheCapable: true + managedResources: + - name: 'aws.amazon.com/neuron' + ignoredByScheduler: false + - name: 'aws.amazon.com/neuroncore' + ignoredByScheduler: false + - name: 'aws.amazon.com/neurondevice' + ignoredByScheduler: false + ignorable: false +leaderElection: + leaderElect: true + resourceNamespace: kube-system + resourceName: %[1]v`, neuronSchedulerExtension), + }, + }) + + replicas := int32(1) + + env.ExpectCreatedOrUpdated(&appsv1.Deployment{ + ObjectMeta: test.ObjectMeta(metav1.ObjectMeta{ + Name: neuronSchedulerExtension, + Namespace: "kube-system", + Labels: map[string]string{ + "tier": "control-plane", + }, + }), + Spec: appsv1.DeploymentSpec{ + Replicas: &replicas, + Selector: &metav1.LabelSelector{ + MatchLabels: map[string]string{ + "tier": "control-plane", + }, + }, + Template: corev1.PodTemplateSpec{ + ObjectMeta: test.ObjectMeta(metav1.ObjectMeta{ + Labels: map[string]string{ + "tier": "control-plane", + }, + }), + Spec: corev1.PodSpec{ + ServiceAccountName: neuronSchedulerExtension, + Tolerations: []corev1.Toleration{ + { + Key: "CriticalAddonsOnly", + Operator: corev1.TolerationOpExists, + Effect: corev1.TaintEffectNoSchedule, + }, + }, + Containers: []corev1.Container{ + { + Name: neuronSchedulerExtension, + Args: []string{fmt.Sprintf("--config=/etc/kubernetes/%[1]v/%[1]v-config.yaml", 
neuronSchedulerExtension), "--leader-elect=true", "--v=2"}, + Command: []string{"/usr/local/bin/kube-scheduler"}, + Image: fmt.Sprintf("public.ecr.aws/eks-distro/kubernetes/kube-scheduler:v1.%[1]v.0-eks-1-%[1]v-latest", env.K8sMinorVersion()), + LivenessProbe: &corev1.Probe{ + InitialDelaySeconds: 15, + ProbeHandler: corev1.ProbeHandler{ + HTTPGet: &corev1.HTTPGetAction{ + Path: "/healthz", + Port: intstr.FromInt(10259), + Scheme: corev1.URISchemeHTTPS, + }, + }, + }, + ReadinessProbe: &corev1.Probe{ + ProbeHandler: corev1.ProbeHandler{ + HTTPGet: &corev1.HTTPGetAction{ + Path: "/healthz", + Port: intstr.FromInt(10259), + Scheme: corev1.URISchemeHTTPS, + }, + }, + }, + SecurityContext: &corev1.SecurityContext{ + Privileged: lo.ToPtr(false), + }, + VolumeMounts: []corev1.VolumeMount{ + { + Name: "config-volume", + MountPath: fmt.Sprintf("/etc/kubernetes/%s", neuronSchedulerExtension), + ReadOnly: true, + }, + }, + }, + }, + HostNetwork: false, + HostPID: false, + Volumes: []corev1.Volume{ + { + Name: "config-volume", + VolumeSource: corev1.VolumeSource{ + ConfigMap: &corev1.ConfigMapVolumeSource{ + LocalObjectReference: corev1.LocalObjectReference{ + Name: fmt.Sprintf("%s-config", neuronSchedulerExtension), + }, + }, + }, + }, + }, + }, + }, + }, + }) +} + func ExpectAMDDevicePluginCreated() { GinkgoHelper() env.ExpectCreated(&appsv1.DaemonSet{ From 5b8789fe4fe3bf8ca6354ed1a774cf3f25e78e8d Mon Sep 17 00:00:00 2001 From: Bryant Biggs Date: Fri, 18 Oct 2024 19:28:46 -0500 Subject: [PATCH 11/15] fix: Only request x1 `neuron*` to avoid use of Neuron Scheduler --- test/suites/integration/extended_resources_test.go | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/test/suites/integration/extended_resources_test.go b/test/suites/integration/extended_resources_test.go index 5516963279aa..eb196ff50f8b 100644 --- a/test/suites/integration/extended_resources_test.go +++ b/test/suites/integration/extended_resources_test.go @@ -121,6 +121,8 @@ var _ = 
Describe("Extended Resources", func() { }, ResourceRequirements: corev1.ResourceRequirements{ Requests: corev1.ResourceList{ + // Only 1 is requested to avoid the use of the Neuron scheduler + // TODO: bryantbiggs@ add the ability to specify the scheduler name to test.PodOptions in order to use the Neuron scheduler "aws.amazon.com/neuron": resource.MustParse("1"), }, Limits: corev1.ResourceList{ @@ -161,10 +163,12 @@ var _ = Describe("Extended Resources", func() { }, ResourceRequirements: corev1.ResourceRequirements{ Requests: corev1.ResourceList{ - "aws.amazon.com/neuroncore": resource.MustParse("2"), + // Only 1 is requested to avoid the use of the Neuron scheduler + // TODO: bryantbiggs@ add the ability to specify the scheduler name to test.PodOptions in order to use the Neuron scheduler + "aws.amazon.com/neuroncore": resource.MustParse("1"), }, Limits: corev1.ResourceList{ - "aws.amazon.com/neuroncore": resource.MustParse("2"), + "aws.amazon.com/neuroncore": resource.MustParse("1"), }, }, }, From 3ad300d8880e798babf67e47b6e6eed4b55b2503 Mon Sep 17 00:00:00 2001 From: Bryant Biggs Date: Fri, 18 Oct 2024 19:32:14 -0500 Subject: [PATCH 12/15] chore: Add autoformatted changes from `make presubmit` and/or `make verify` --- .../karpenter.k8s.aws_ec2nodeclasses.yaml | 1421 +++++++++-------- pkg/apis/crds/karpenter.sh_nodeclaims.yaml | 2 - pkg/apis/crds/karpenter.sh_nodepools.yaml | 4 - 3 files changed, 740 insertions(+), 687 deletions(-) diff --git a/pkg/apis/crds/karpenter.k8s.aws_ec2nodeclasses.yaml b/pkg/apis/crds/karpenter.k8s.aws_ec2nodeclasses.yaml index abd370251f5c..857e89c65326 100644 --- a/pkg/apis/crds/karpenter.k8s.aws_ec2nodeclasses.yaml +++ b/pkg/apis/crds/karpenter.k8s.aws_ec2nodeclasses.yaml @@ -9,731 +9,790 @@ spec: group: karpenter.k8s.aws names: categories: - - karpenter + - karpenter kind: EC2NodeClass listKind: EC2NodeClassList plural: ec2nodeclasses shortNames: - - ec2nc - - ec2ncs + - ec2nc + - ec2ncs singular: ec2nodeclass scope: Cluster 
versions: - - additionalPrinterColumns: - - jsonPath: .status.conditions[?(@.type=="Ready")].status - name: Ready - type: string - - jsonPath: .metadata.creationTimestamp - name: Age - type: date - - jsonPath: .spec.role - name: Role - priority: 1 - type: string - name: v1 - schema: - openAPIV3Schema: - description: EC2NodeClass is the Schema for the EC2NodeClass API - properties: - apiVersion: - description: |- - APIVersion defines the versioned schema of this representation of an object. - Servers should convert recognized schemas to the latest internal value, and - may reject unrecognized values. - More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources - type: string - kind: - description: |- - Kind is a string value representing the REST resource this object represents. - Servers may infer this from the endpoint the client submits requests to. - Cannot be updated. - In CamelCase. - More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds - type: string - metadata: - type: object - spec: - description: |- - EC2NodeClassSpec is the top level specification for the AWS Karpenter Provider. - This will contain configuration necessary to launch instances in AWS. - properties: - amiFamily: + - additionalPrinterColumns: + - jsonPath: .status.conditions[?(@.type=="Ready")].status + name: Ready + type: string + - jsonPath: .metadata.creationTimestamp + name: Age + type: date + - jsonPath: .spec.role + name: Role + priority: 1 + type: string + name: v1 + schema: + openAPIV3Schema: + description: EC2NodeClass is the Schema for the EC2NodeClass API + properties: + apiVersion: + description: |- + APIVersion defines the versioned schema of this representation of an object. + Servers should convert recognized schemas to the latest internal value, and + may reject unrecognized values. 
+ More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources + type: string + kind: + description: |- + Kind is a string value representing the REST resource this object represents. + Servers may infer this from the endpoint the client submits requests to. + Cannot be updated. + In CamelCase. + More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds + type: string + metadata: + type: object + spec: + description: |- + EC2NodeClassSpec is the top level specification for the AWS Karpenter Provider. + This will contain configuration necessary to launch instances in AWS. + properties: + amiFamily: + description: |- + AMIFamily dictates the UserData format and default BlockDeviceMappings used when generating launch templates. + This field is optional when using an alias amiSelectorTerm, and the value will be inferred from the alias' + family. When an alias is specified, this field may only be set to its corresponding family or 'Custom'. If no + alias is specified, this field is required. + NOTE: We ignore the AMIFamily for hashing here because we hash the AMIFamily dynamically by using the alias using + the AMIFamily() helper function + enum: + - AL2 + - AL2023 + - Bottlerocket + - Custom + - Windows2019 + - Windows2022 + type: string + amiSelectorTerms: + description: AMISelectorTerms is a list of or ami selector terms. + The terms are ORed. + items: description: |- - AMIFamily dictates the UserData format and default BlockDeviceMappings used when generating launch templates. - This field is optional when using an alias amiSelectorTerm, and the value will be inferred from the alias' - family. When an alias is specified, this field may only be set to its corresponding family or 'Custom'. If no - alias is specified, this field is required. 
- NOTE: We ignore the AMIFamily for hashing here because we hash the AMIFamily dynamically by using the alias using - the AMIFamily() helper function - enum: - - AL2 - - AL2023 - - Bottlerocket - - Custom - - Windows2019 - - Windows2022 - type: string - amiSelectorTerms: - description: AMISelectorTerms is a list of or ami selector terms. The terms are ORed. - items: - description: |- - AMISelectorTerm defines selection logic for an ami used by Karpenter to launch nodes. - If multiple fields are used for selection, the requirements are ANDed. - properties: - alias: - description: |- - Alias specifies which EKS optimized AMI to select. - Each alias consists of a family and an AMI version, specified as "family@version". - Valid families include: al2, al2023, bottlerocket, windows2019, and windows2022. - The version can either be pinned to a specific AMI release, with that AMIs version format (ex: "al2023@v20240625" or "bottlerocket@v1.10.0"). - The version can also be set to "latest" for any family. Setting the version to latest will result in drift when a new AMI is released. This is **not** recommended for production environments. - Note: The Windows families do **not** support version pinning, and only latest may be used. - maxLength: 30 - type: string - x-kubernetes-validations: - - message: '''alias'' is improperly formatted, must match the format ''family@version''' - rule: self.matches('^[a-zA-Z0-9]+@.+$') - - message: 'family is not supported, must be one of the following: ''al2'', ''al2023'', ''bottlerocket'', ''windows2019'', ''windows2022''' - rule: self.split('@')[0] in ['al2','al2023','bottlerocket','windows2019','windows2022'] - - message: windows families may only specify version 'latest' - rule: 'self.split(''@'')[0] in [''windows2019'',''windows2022''] ? self.split(''@'')[1] == ''latest'' : true' - id: - description: ID is the ami id in EC2 - pattern: ami-[0-9a-z]+ - type: string - name: - description: |- - Name is the ami name in EC2. 
- This value is the name field, which is different from the name tag. - type: string - owner: - description: |- - Owner is the owner for the ami. - You can specify a combination of AWS account IDs, "self", "amazon", and "aws-marketplace" - type: string - tags: - additionalProperties: - type: string - description: |- - Tags is a map of key/value tags used to select subnets - Specifying '*' for a value selects all values for a given tag key. - maxProperties: 20 - type: object - x-kubernetes-validations: - - message: empty tag keys or values aren't supported - rule: self.all(k, k != '' && self[k] != '') - type: object - maxItems: 30 - minItems: 1 - type: array - x-kubernetes-validations: - - message: expected at least one, got none, ['tags', 'id', 'name', 'alias'] - rule: self.all(x, has(x.tags) || has(x.id) || has(x.name) || has(x.alias)) - - message: '''id'' is mutually exclusive, cannot be set with a combination of other fields in amiSelectorTerms' - rule: '!self.exists(x, has(x.id) && (has(x.alias) || has(x.tags) || has(x.name) || has(x.owner)))' - - message: '''alias'' is mutually exclusive, cannot be set with a combination of other fields in amiSelectorTerms' - rule: '!self.exists(x, has(x.alias) && (has(x.id) || has(x.tags) || has(x.name) || has(x.owner)))' - - message: '''alias'' is mutually exclusive, cannot be set with a combination of other amiSelectorTerms' - rule: '!(self.exists(x, has(x.alias)) && self.size() != 1)' - associatePublicIPAddress: - description: AssociatePublicIPAddress controls if public IP addresses are assigned to instances that are launched with the nodeclass. - type: boolean - blockDeviceMappings: - description: BlockDeviceMappings to be applied to provisioned nodes. - items: - properties: - deviceName: - description: The device name (for example, /dev/sdh or xvdh). + AMISelectorTerm defines selection logic for an ami used by Karpenter to launch nodes. + If multiple fields are used for selection, the requirements are ANDed. 
+ properties: + alias: + description: |- + Alias specifies which EKS optimized AMI to select. + Each alias consists of a family and an AMI version, specified as "family@version". + Valid families include: al2, al2023, bottlerocket, windows2019, and windows2022. + The version can either be pinned to a specific AMI release, with that AMIs version format (ex: "al2023@v20240625" or "bottlerocket@v1.10.0"). + The version can also be set to "latest" for any family. Setting the version to latest will result in drift when a new AMI is released. This is **not** recommended for production environments. + Note: The Windows families do **not** support version pinning, and only latest may be used. + maxLength: 30 + type: string + x-kubernetes-validations: + - message: '''alias'' is improperly formatted, must match the + format ''family@version''' + rule: self.matches('^[a-zA-Z0-9]+@.+$') + - message: 'family is not supported, must be one of the following: + ''al2'', ''al2023'', ''bottlerocket'', ''windows2019'', + ''windows2022''' + rule: self.split('@')[0] in ['al2','al2023','bottlerocket','windows2019','windows2022'] + - message: windows families may only specify version 'latest' + rule: 'self.split(''@'')[0] in [''windows2019'',''windows2022''] + ? self.split(''@'')[1] == ''latest'' : true' + id: + description: ID is the ami id in EC2 + pattern: ami-[0-9a-z]+ + type: string + name: + description: |- + Name is the ami name in EC2. + This value is the name field, which is different from the name tag. + type: string + owner: + description: |- + Owner is the owner for the ami. + You can specify a combination of AWS account IDs, "self", "amazon", and "aws-marketplace" + type: string + tags: + additionalProperties: type: string - ebs: - description: EBS contains parameters used to automatically set up EBS volumes when an instance is launched. - properties: - deleteOnTermination: - description: DeleteOnTermination indicates whether the EBS volume is deleted on instance termination. 
- type: boolean - encrypted: - description: |- - Encrypted indicates whether the EBS volume is encrypted. Encrypted volumes can only - be attached to instances that support Amazon EBS encryption. If you are creating - a volume from a snapshot, you can't specify an encryption value. - type: boolean - iops: - description: |- - IOPS is the number of I/O operations per second (IOPS). For gp3, io1, and io2 volumes, - this represents the number of IOPS that are provisioned for the volume. For - gp2 volumes, this represents the baseline performance of the volume and the - rate at which the volume accumulates I/O credits for bursting. + description: |- + Tags is a map of key/value tags used to select subnets + Specifying '*' for a value selects all values for a given tag key. + maxProperties: 20 + type: object + x-kubernetes-validations: + - message: empty tag keys or values aren't supported + rule: self.all(k, k != '' && self[k] != '') + type: object + maxItems: 30 + minItems: 1 + type: array + x-kubernetes-validations: + - message: expected at least one, got none, ['tags', 'id', 'name', + 'alias'] + rule: self.all(x, has(x.tags) || has(x.id) || has(x.name) || has(x.alias)) + - message: '''id'' is mutually exclusive, cannot be set with a combination + of other fields in amiSelectorTerms' + rule: '!self.exists(x, has(x.id) && (has(x.alias) || has(x.tags) + || has(x.name) || has(x.owner)))' + - message: '''alias'' is mutually exclusive, cannot be set with a + combination of other fields in amiSelectorTerms' + rule: '!self.exists(x, has(x.alias) && (has(x.id) || has(x.tags) + || has(x.name) || has(x.owner)))' + - message: '''alias'' is mutually exclusive, cannot be set with a + combination of other amiSelectorTerms' + rule: '!(self.exists(x, has(x.alias)) && self.size() != 1)' + associatePublicIPAddress: + description: AssociatePublicIPAddress controls if public IP addresses + are assigned to instances that are launched with the nodeclass. 
+ type: boolean + blockDeviceMappings: + description: BlockDeviceMappings to be applied to provisioned nodes. + items: + properties: + deviceName: + description: The device name (for example, /dev/sdh or xvdh). + type: string + ebs: + description: EBS contains parameters used to automatically set + up EBS volumes when an instance is launched. + properties: + deleteOnTermination: + description: DeleteOnTermination indicates whether the EBS + volume is deleted on instance termination. + type: boolean + encrypted: + description: |- + Encrypted indicates whether the EBS volume is encrypted. Encrypted volumes can only + be attached to instances that support Amazon EBS encryption. If you are creating + a volume from a snapshot, you can't specify an encryption value. + type: boolean + iops: + description: |- + IOPS is the number of I/O operations per second (IOPS). For gp3, io1, and io2 volumes, + this represents the number of IOPS that are provisioned for the volume. For + gp2 volumes, this represents the baseline performance of the volume and the + rate at which the volume accumulates I/O credits for bursting. - The following are the supported values for each volume type: + The following are the supported values for each volume type: - * gp3: 3,000-16,000 IOPS + * gp3: 3,000-16,000 IOPS - * io1: 100-64,000 IOPS + * io1: 100-64,000 IOPS - * io2: 100-64,000 IOPS + * io2: 100-64,000 IOPS - For io1 and io2 volumes, we guarantee 64,000 IOPS only for Instances built - on the Nitro System (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-types.html#ec2-nitro-instances). - Other instance families guarantee performance up to 32,000 IOPS. + For io1 and io2 volumes, we guarantee 64,000 IOPS only for Instances built + on the Nitro System (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-types.html#ec2-nitro-instances). + Other instance families guarantee performance up to 32,000 IOPS. - This parameter is supported for io1, io2, and gp3 volumes only. 
This parameter - is not supported for gp2, st1, sc1, or standard volumes. - format: int64 - type: integer - kmsKeyID: - description: KMSKeyID (ARN) of the symmetric Key Management Service (KMS) CMK used for encryption. - type: string - snapshotID: - description: SnapshotID is the ID of an EBS snapshot - type: string - throughput: - description: |- - Throughput to provision for a gp3 volume, with a maximum of 1,000 MiB/s. - Valid Range: Minimum value of 125. Maximum value of 1000. - format: int64 - type: integer - volumeSize: - description: |- - VolumeSize in `Gi`, `G`, `Ti`, or `T`. You must specify either a snapshot ID or - a volume size. The following are the supported volumes sizes for each volume - type: + This parameter is supported for io1, io2, and gp3 volumes only. This parameter + is not supported for gp2, st1, sc1, or standard volumes. + format: int64 + type: integer + kmsKeyID: + description: KMSKeyID (ARN) of the symmetric Key Management + Service (KMS) CMK used for encryption. + type: string + snapshotID: + description: SnapshotID is the ID of an EBS snapshot + type: string + throughput: + description: |- + Throughput to provision for a gp3 volume, with a maximum of 1,000 MiB/s. + Valid Range: Minimum value of 125. Maximum value of 1000. + format: int64 + type: integer + volumeSize: + description: |- + VolumeSize in `Gi`, `G`, `Ti`, or `T`. You must specify either a snapshot ID or + a volume size. The following are the supported volumes sizes for each volume + type: - * gp2 and gp3: 1-16,384 + * gp2 and gp3: 1-16,384 - * io1 and io2: 4-16,384 + * io1 and io2: 4-16,384 - * st1 and sc1: 125-16,384 + * st1 and sc1: 125-16,384 - * standard: 1-1,024 - pattern: ^((?:[1-9][0-9]{0,3}|[1-4][0-9]{4}|[5][0-8][0-9]{3}|59000)Gi|(?:[1-9][0-9]{0,3}|[1-5][0-9]{4}|[6][0-3][0-9]{3}|64000)G|([1-9]||[1-5][0-7]|58)Ti|([1-9]||[1-5][0-9]|6[0-3]|64)T)$ - type: string - volumeType: - description: |- - VolumeType of the block device. 
- For more information, see Amazon EBS volume types (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html) - in the Amazon Elastic Compute Cloud User Guide. - enum: - - standard - - io1 - - io2 - - gp2 - - sc1 - - st1 - - gp3 - type: string - type: object - x-kubernetes-validations: - - message: snapshotID or volumeSize must be defined - rule: has(self.snapshotID) || has(self.volumeSize) - rootVolume: - description: |- - RootVolume is a flag indicating if this device is mounted as kubelet root dir. You can - configure at most one root volume in BlockDeviceMappings. - type: boolean + * standard: 1-1,024 + pattern: ^((?:[1-9][0-9]{0,3}|[1-4][0-9]{4}|[5][0-8][0-9]{3}|59000)Gi|(?:[1-9][0-9]{0,3}|[1-5][0-9]{4}|[6][0-3][0-9]{3}|64000)G|([1-9]||[1-5][0-7]|58)Ti|([1-9]||[1-5][0-9]|6[0-3]|64)T)$ + type: string + volumeType: + description: |- + VolumeType of the block device. + For more information, see Amazon EBS volume types (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html) + in the Amazon Elastic Compute Cloud User Guide. + enum: + - standard + - io1 + - io2 + - gp2 + - sc1 + - st1 + - gp3 + type: string + type: object + x-kubernetes-validations: + - message: snapshotID or volumeSize must be defined + rule: has(self.snapshotID) || has(self.volumeSize) + rootVolume: + description: |- + RootVolume is a flag indicating if this device is mounted as kubelet root dir. You can + configure at most one root volume in BlockDeviceMappings. 
+ type: boolean + type: object + maxItems: 50 + type: array + x-kubernetes-validations: + - message: must have only one blockDeviceMappings with rootVolume + rule: self.filter(x, has(x.rootVolume)?x.rootVolume==true:false).size() + <= 1 + context: + description: |- + Context is a Reserved field in EC2 APIs + https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_CreateFleet.html + type: string + detailedMonitoring: + description: DetailedMonitoring controls if detailed monitoring is + enabled for instances that are launched + type: boolean + instanceProfile: + description: |- + InstanceProfile is the AWS entity that instances use. + This field is mutually exclusive from role. + The instance profile should already have a role assigned to it that Karpenter + has PassRole permission on for instance launch using this instanceProfile to succeed. + type: string + x-kubernetes-validations: + - message: instanceProfile cannot be empty + rule: self != '' + instanceStorePolicy: + description: InstanceStorePolicy specifies how to handle instance-store + disks. + enum: + - RAID0 + type: string + kubelet: + description: |- + Kubelet defines args to be used when configuring kubelet on provisioned nodes. + They are a subset of the upstream types, recognizing not all options may be supported. + Wherever possible, the types and names should reflect the upstream kubelet types. + properties: + clusterDNS: + description: |- + clusterDNS is a list of IP addresses for the cluster DNS server. + Note that not all providers may use all addresses. + items: + type: string + type: array + cpuCFSQuota: + description: CPUCFSQuota enables CPU CFS quota enforcement for + containers that specify CPU limits. 
+ type: boolean + evictionHard: + additionalProperties: + type: string + description: EvictionHard is the map of signal names to quantities + that define hard eviction thresholds type: object - maxItems: 50 - type: array - x-kubernetes-validations: - - message: must have only one blockDeviceMappings with rootVolume - rule: self.filter(x, has(x.rootVolume)?x.rootVolume==true:false).size() <= 1 - context: - description: |- - Context is a Reserved field in EC2 APIs - https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_CreateFleet.html - type: string - detailedMonitoring: - description: DetailedMonitoring controls if detailed monitoring is enabled for instances that are launched - type: boolean - instanceProfile: - description: |- - InstanceProfile is the AWS entity that instances use. - This field is mutually exclusive from role. - The instance profile should already have a role assigned to it that Karpenter - has PassRole permission on for instance launch using this instanceProfile to succeed. - type: string - x-kubernetes-validations: - - message: instanceProfile cannot be empty - rule: self != '' - instanceStorePolicy: - description: InstanceStorePolicy specifies how to handle instance-store disks. - enum: - - RAID0 - type: string - kubelet: + x-kubernetes-validations: + - message: valid keys for evictionHard are ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available'] + rule: self.all(x, x in ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available']) + evictionMaxPodGracePeriod: + description: |- + EvictionMaxPodGracePeriod is the maximum allowed grace period (in seconds) to use when terminating pods in + response to soft eviction thresholds being met. 
+ format: int32 + type: integer + evictionSoft: + additionalProperties: + type: string + description: EvictionSoft is the map of signal names to quantities + that define soft eviction thresholds + type: object + x-kubernetes-validations: + - message: valid keys for evictionSoft are ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available'] + rule: self.all(x, x in ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available']) + evictionSoftGracePeriod: + additionalProperties: + type: string + description: EvictionSoftGracePeriod is the map of signal names + to quantities that define grace periods for each eviction signal + type: object + x-kubernetes-validations: + - message: valid keys for evictionSoftGracePeriod are ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available'] + rule: self.all(x, x in ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available']) + imageGCHighThresholdPercent: + description: |- + ImageGCHighThresholdPercent is the percent of disk usage after which image + garbage collection is always run. The percent is calculated by dividing this + field value by 100, so this field must be between 0 and 100, inclusive. + When specified, the value must be greater than ImageGCLowThresholdPercent. + format: int32 + maximum: 100 + minimum: 0 + type: integer + imageGCLowThresholdPercent: + description: |- + ImageGCLowThresholdPercent is the percent of disk usage before which image + garbage collection is never run. Lowest disk usage to garbage collect to. + The percent is calculated by dividing this field value by 100, + so the field value must be between 0 and 100, inclusive. 
+ When specified, the value must be less than imageGCHighThresholdPercent + format: int32 + maximum: 100 + minimum: 0 + type: integer + kubeReserved: + additionalProperties: + type: string + description: KubeReserved contains resources reserved for Kubernetes + system components. + type: object + x-kubernetes-validations: + - message: valid keys for kubeReserved are ['cpu','memory','ephemeral-storage','pid'] + rule: self.all(x, x=='cpu' || x=='memory' || x=='ephemeral-storage' + || x=='pid') + - message: kubeReserved value cannot be a negative resource quantity + rule: self.all(x, !self[x].startsWith('-')) + maxPods: + description: |- + MaxPods is an override for the maximum number of pods that can run on + a worker node instance. + format: int32 + minimum: 0 + type: integer + podsPerCore: + description: |- + PodsPerCore is an override for the number of pods that can run on a worker node + instance based on the number of cpu cores. This value cannot exceed MaxPods, so, if + MaxPods is a lower value, that value will be used. + format: int32 + minimum: 0 + type: integer + systemReserved: + additionalProperties: + type: string + description: SystemReserved contains resources reserved for OS + system daemons and kernel memory. + type: object + x-kubernetes-validations: + - message: valid keys for systemReserved are ['cpu','memory','ephemeral-storage','pid'] + rule: self.all(x, x=='cpu' || x=='memory' || x=='ephemeral-storage' + || x=='pid') + - message: systemReserved value cannot be a negative resource + quantity + rule: self.all(x, !self[x].startsWith('-')) + type: object + x-kubernetes-validations: + - message: imageGCHighThresholdPercent must be greater than imageGCLowThresholdPercent + rule: 'has(self.imageGCHighThresholdPercent) && has(self.imageGCLowThresholdPercent) + ? self.imageGCHighThresholdPercent > self.imageGCLowThresholdPercent : + true' + - message: evictionSoft OwnerKey does not have a matching evictionSoftGracePeriod + rule: has(self.evictionSoft) ? 
self.evictionSoft.all(e, (e in self.evictionSoftGracePeriod)):true + - message: evictionSoftGracePeriod OwnerKey does not have a matching + evictionSoft + rule: has(self.evictionSoftGracePeriod) ? self.evictionSoftGracePeriod.all(e, + (e in self.evictionSoft)):true + metadataOptions: + default: + httpEndpoint: enabled + httpProtocolIPv6: disabled + httpPutResponseHopLimit: 1 + httpTokens: required + description: |- + MetadataOptions for the generated launch template of provisioned nodes. + + This specifies the exposure of the Instance Metadata Service to + provisioned EC2 nodes. For more information, + see Instance Metadata and User Data + (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html) + in the Amazon Elastic Compute Cloud User Guide. + + Refer to recommended, security best practices + (https://aws.github.io/aws-eks-best-practices/security/docs/iam/#restrict-access-to-the-instance-profile-assigned-to-the-worker-node) + for limiting exposure of Instance Metadata and User Data to pods. + If omitted, defaults to httpEndpoint enabled, with httpProtocolIPv6 + disabled, with httpPutResponseLimit of 1, and with httpTokens + required. + properties: + httpEndpoint: + default: enabled + description: |- + HTTPEndpoint enables or disables the HTTP metadata endpoint on provisioned + nodes. If metadata options is non-nil, but this parameter is not specified, + the default state is "enabled". + + If you specify a value of "disabled", instance metadata will not be accessible + on the node. + enum: + - enabled + - disabled + type: string + httpProtocolIPv6: + default: disabled + description: |- + HTTPProtocolIPv6 enables or disables the IPv6 endpoint for the instance metadata + service on provisioned nodes. If metadata options is non-nil, but this parameter + is not specified, the default state is "disabled". 
+ enum: + - enabled + - disabled + type: string + httpPutResponseHopLimit: + default: 1 + description: |- + HTTPPutResponseHopLimit is the desired HTTP PUT response hop limit for + instance metadata requests. The larger the number, the further instance + metadata requests can travel. Possible values are integers from 1 to 64. + If metadata options is non-nil, but this parameter is not specified, the + default value is 1. + format: int64 + maximum: 64 + minimum: 1 + type: integer + httpTokens: + default: required + description: |- + HTTPTokens determines the state of token usage for instance metadata + requests. If metadata options is non-nil, but this parameter is not + specified, the default state is "required". + + If the state is optional, one can choose to retrieve instance metadata with + or without a signed token header on the request. If one retrieves the IAM + role credentials without a token, the version 1.0 role credentials are + returned. If one retrieves the IAM role credentials using a valid signed + token, the version 2.0 role credentials are returned. + + If the state is "required", one must send a signed token header with any + instance metadata retrieval requests. In this state, retrieving the IAM + role credentials always returns the version 2.0 credentials; the version + 1.0 credentials are not available. + enum: + - required + - optional + type: string + type: object + role: + description: |- + Role is the AWS identity that nodes use. This field is immutable. + This field is mutually exclusive from instanceProfile. + Marking this field as immutable avoids concerns around terminating managed instance profiles from running instances. + This field may be made mutable in the future, assuming the correct garbage collection and drift handling is implemented + for the old instance profiles on an update. 
+ type: string + x-kubernetes-validations: + - message: role cannot be empty + rule: self != '' + - message: immutable field changed + rule: self == oldSelf + securityGroupSelectorTerms: + description: SecurityGroupSelectorTerms is a list of or security group + selector terms. The terms are ORed. + items: description: |- - Kubelet defines args to be used when configuring kubelet on provisioned nodes. - They are a subset of the upstream types, recognizing not all options may be supported. - Wherever possible, the types and names should reflect the upstream kubelet types. + SecurityGroupSelectorTerm defines selection logic for a security group used by Karpenter to launch nodes. + If multiple fields are used for selection, the requirements are ANDed. properties: - clusterDNS: + id: + description: ID is the security group id in EC2 + pattern: sg-[0-9a-z]+ + type: string + name: description: |- - clusterDNS is a list of IP addresses for the cluster DNS server. - Note that not all providers may use all addresses. - items: - type: string - type: array - cpuCFSQuota: - description: CPUCFSQuota enables CPU CFS quota enforcement for containers that specify CPU limits. - type: boolean - evictionHard: + Name is the security group name in EC2. + This value is the name field, which is different from the name tag. 
+ type: string + tags: additionalProperties: type: string - pattern: ^((\d{1,2}(\.\d{1,2})?|100(\.0{1,2})?)%||(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?)$ - description: EvictionHard is the map of signal names to quantities that define hard eviction thresholds - type: object - x-kubernetes-validations: - - message: valid keys for evictionHard are ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available'] - rule: self.all(x, x in ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available']) - evictionMaxPodGracePeriod: description: |- - EvictionMaxPodGracePeriod is the maximum allowed grace period (in seconds) to use when terminating pods in - response to soft eviction thresholds being met. - format: int32 - type: integer - evictionSoft: - additionalProperties: - type: string - pattern: ^((\d{1,2}(\.\d{1,2})?|100(\.0{1,2})?)%||(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?)$ - description: EvictionSoft is the map of signal names to quantities that define soft eviction thresholds - type: object - x-kubernetes-validations: - - message: valid keys for evictionSoft are ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available'] - rule: self.all(x, x in ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available']) - evictionSoftGracePeriod: - additionalProperties: - type: string - description: EvictionSoftGracePeriod is the map of signal names to quantities that define grace periods for each eviction signal + Tags is a map of key/value tags used to select subnets + Specifying '*' for a value selects all values for a given tag key. 
+ maxProperties: 20 type: object x-kubernetes-validations: - - message: valid keys for evictionSoftGracePeriod are ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available'] - rule: self.all(x, x in ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available']) - imageGCHighThresholdPercent: - description: |- - ImageGCHighThresholdPercent is the percent of disk usage after which image - garbage collection is always run. The percent is calculated by dividing this - field value by 100, so this field must be between 0 and 100, inclusive. - When specified, the value must be greater than ImageGCLowThresholdPercent. - format: int32 - maximum: 100 - minimum: 0 - type: integer - imageGCLowThresholdPercent: - description: |- - ImageGCLowThresholdPercent is the percent of disk usage before which image - garbage collection is never run. Lowest disk usage to garbage collect to. - The percent is calculated by dividing this field value by 100, - so the field value must be between 0 and 100, inclusive. 
- When specified, the value must be less than imageGCHighThresholdPercent - format: int32 - maximum: 100 - minimum: 0 - type: integer - kubeReserved: + - message: empty tag keys or values aren't supported + rule: self.all(k, k != '' && self[k] != '') + type: object + maxItems: 30 + type: array + x-kubernetes-validations: + - message: securityGroupSelectorTerms cannot be empty + rule: self.size() != 0 + - message: expected at least one, got none, ['tags', 'id', 'name'] + rule: self.all(x, has(x.tags) || has(x.id) || has(x.name)) + - message: '''id'' is mutually exclusive, cannot be set with a combination + of other fields in securityGroupSelectorTerms' + rule: '!self.all(x, has(x.id) && (has(x.tags) || has(x.name)))' + - message: '''name'' is mutually exclusive, cannot be set with a combination + of other fields in securityGroupSelectorTerms' + rule: '!self.all(x, has(x.name) && (has(x.tags) || has(x.id)))' + subnetSelectorTerms: + description: SubnetSelectorTerms is a list of or subnet selector terms. + The terms are ORed. + items: + description: |- + SubnetSelectorTerm defines selection logic for a subnet used by Karpenter to launch nodes. + If multiple fields are used for selection, the requirements are ANDed. + properties: + id: + description: ID is the subnet id in EC2 + pattern: subnet-[0-9a-z]+ + type: string + tags: additionalProperties: type: string - pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ - description: KubeReserved contains resources reserved for Kubernetes system components. 
- type: object - x-kubernetes-validations: - - message: valid keys for kubeReserved are ['cpu','memory','ephemeral-storage','pid'] - rule: self.all(x, x=='cpu' || x=='memory' || x=='ephemeral-storage' || x=='pid') - - message: kubeReserved value cannot be a negative resource quantity - rule: self.all(x, !self[x].startsWith('-')) - maxPods: - description: |- - MaxPods is an override for the maximum number of pods that can run on - a worker node instance. - format: int32 - minimum: 0 - type: integer - podsPerCore: description: |- - PodsPerCore is an override for the number of pods that can run on a worker node - instance based on the number of cpu cores. This value cannot exceed MaxPods, so, if - MaxPods is a lower value, that value will be used. - format: int32 - minimum: 0 - type: integer - systemReserved: - additionalProperties: - type: string - pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ - description: SystemReserved contains resources reserved for OS system daemons and kernel memory. + Tags is a map of key/value tags used to select subnets + Specifying '*' for a value selects all values for a given tag key. + maxProperties: 20 type: object x-kubernetes-validations: - - message: valid keys for systemReserved are ['cpu','memory','ephemeral-storage','pid'] - rule: self.all(x, x=='cpu' || x=='memory' || x=='ephemeral-storage' || x=='pid') - - message: systemReserved value cannot be a negative resource quantity - rule: self.all(x, !self[x].startsWith('-')) + - message: empty tag keys or values aren't supported + rule: self.all(k, k != '' && self[k] != '') type: object - x-kubernetes-validations: - - message: imageGCHighThresholdPercent must be greater than imageGCLowThresholdPercent - rule: 'has(self.imageGCHighThresholdPercent) && has(self.imageGCLowThresholdPercent) ? 
self.imageGCHighThresholdPercent > self.imageGCLowThresholdPercent : true' - - message: evictionSoft OwnerKey does not have a matching evictionSoftGracePeriod - rule: has(self.evictionSoft) ? self.evictionSoft.all(e, (e in self.evictionSoftGracePeriod)):true - - message: evictionSoftGracePeriod OwnerKey does not have a matching evictionSoft - rule: has(self.evictionSoftGracePeriod) ? self.evictionSoftGracePeriod.all(e, (e in self.evictionSoft)):true - metadataOptions: - default: - httpEndpoint: enabled - httpProtocolIPv6: disabled - httpPutResponseHopLimit: 1 - httpTokens: required - description: |- - MetadataOptions for the generated launch template of provisioned nodes. - - This specifies the exposure of the Instance Metadata Service to - provisioned EC2 nodes. For more information, - see Instance Metadata and User Data - (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html) - in the Amazon Elastic Compute Cloud User Guide. - - Refer to recommended, security best practices - (https://aws.github.io/aws-eks-best-practices/security/docs/iam/#restrict-access-to-the-instance-profile-assigned-to-the-worker-node) - for limiting exposure of Instance Metadata and User Data to pods. - If omitted, defaults to httpEndpoint enabled, with httpProtocolIPv6 - disabled, with httpPutResponseLimit of 1, and with httpTokens - required. + maxItems: 30 + type: array + x-kubernetes-validations: + - message: subnetSelectorTerms cannot be empty + rule: self.size() != 0 + - message: expected at least one, got none, ['tags', 'id'] + rule: self.all(x, has(x.tags) || has(x.id)) + - message: '''id'' is mutually exclusive, cannot be set with a combination + of other fields in subnetSelectorTerms' + rule: '!self.all(x, has(x.id) && has(x.tags))' + tags: + additionalProperties: + type: string + description: Tags to be applied on ec2 resources like instances and + launch templates. 
+ type: object + x-kubernetes-validations: + - message: empty tag keys aren't supported + rule: self.all(k, k != '') + - message: tag contains a restricted tag matching eks:eks-cluster-name + rule: self.all(k, k !='eks:eks-cluster-name') + - message: tag contains a restricted tag matching kubernetes.io/cluster/ + rule: self.all(k, !k.startsWith('kubernetes.io/cluster') ) + - message: tag contains a restricted tag matching karpenter.sh/nodepool + rule: self.all(k, k != 'karpenter.sh/nodepool') + - message: tag contains a restricted tag matching karpenter.sh/nodeclaim + rule: self.all(k, k !='karpenter.sh/nodeclaim') + - message: tag contains a restricted tag matching karpenter.k8s.aws/ec2nodeclass + rule: self.all(k, k !='karpenter.k8s.aws/ec2nodeclass') + userData: + description: |- + UserData to be applied to the provisioned nodes. + It must be in the appropriate format based on the AMIFamily in use. Karpenter will merge certain fields into + this UserData to ensure nodes are being provisioned with the correct configuration. + type: string + required: + - amiSelectorTerms + - securityGroupSelectorTerms + - subnetSelectorTerms + type: object + x-kubernetes-validations: + - message: must specify exactly one of ['role', 'instanceProfile'] + rule: (has(self.role) && !has(self.instanceProfile)) || (!has(self.role) + && has(self.instanceProfile)) + - message: changing from 'instanceProfile' to 'role' is not supported. + You must delete and recreate this node class if you want to change + this. + rule: (has(oldSelf.role) && has(self.role)) || (has(oldSelf.instanceProfile) + && has(self.instanceProfile)) + - message: if set, amiFamily must be 'AL2' or 'Custom' when using an AL2 + alias + rule: '!has(self.amiFamily) || (self.amiSelectorTerms.exists(x, has(x.alias) + && x.alias.find(''^[^@]+'') == ''al2'') ? 
(self.amiFamily == ''Custom'' + || self.amiFamily == ''AL2'') : true)' + - message: if set, amiFamily must be 'AL2023' or 'Custom' when using an + AL2023 alias + rule: '!has(self.amiFamily) || (self.amiSelectorTerms.exists(x, has(x.alias) + && x.alias.find(''^[^@]+'') == ''al2023'') ? (self.amiFamily == ''Custom'' + || self.amiFamily == ''AL2023'') : true)' + - message: if set, amiFamily must be 'Bottlerocket' or 'Custom' when using + a Bottlerocket alias + rule: '!has(self.amiFamily) || (self.amiSelectorTerms.exists(x, has(x.alias) + && x.alias.find(''^[^@]+'') == ''bottlerocket'') ? (self.amiFamily + == ''Custom'' || self.amiFamily == ''Bottlerocket'') : true)' + - message: if set, amiFamily must be 'Windows2019' or 'Custom' when using + a Windows2019 alias + rule: '!has(self.amiFamily) || (self.amiSelectorTerms.exists(x, has(x.alias) + && x.alias.find(''^[^@]+'') == ''windows2019'') ? (self.amiFamily + == ''Custom'' || self.amiFamily == ''Windows2019'') : true)' + - message: if set, amiFamily must be 'Windows2022' or 'Custom' when using + a Windows2022 alias + rule: '!has(self.amiFamily) || (self.amiSelectorTerms.exists(x, has(x.alias) + && x.alias.find(''^[^@]+'') == ''windows2022'') ? (self.amiFamily + == ''Custom'' || self.amiFamily == ''Windows2022'') : true)' + - message: must specify amiFamily if amiSelectorTerms does not contain + an alias + rule: 'self.amiSelectorTerms.exists(x, has(x.alias)) ? true : has(self.amiFamily)' + status: + description: EC2NodeClassStatus contains the resolved state of the EC2NodeClass + properties: + amis: + description: |- + AMI contains the current AMI values that are available to the + cluster under the AMI selectors. 
+ items: + description: AMI contains resolved AMI selector values utilized + for node launch properties: - httpEndpoint: - default: enabled + id: + description: ID of the AMI + type: string + name: + description: Name of the AMI + type: string + requirements: + description: Requirements of the AMI to be utilized on an instance + type + items: + description: |- + A node selector requirement is a selector that contains values, a key, and an operator + that relates the key and values. + properties: + key: + description: The label key that the selector applies to. + type: string + operator: + description: |- + Represents a key's relationship to a set of values. + Valid operators are In, NotIn, Exists, DoesNotExist. Gt, and Lt. + type: string + values: + description: |- + An array of string values. If the operator is In or NotIn, + the values array must be non-empty. If the operator is Exists or DoesNotExist, + the values array must be empty. If the operator is Gt or Lt, the values + array must have a single element, which will be interpreted as an integer. + This array is replaced during a strategic merge patch. + items: + type: string + type: array + x-kubernetes-list-type: atomic + required: + - key + - operator + type: object + type: array + required: + - id + - requirements + type: object + type: array + conditions: + description: Conditions contains signals for health and readiness + items: + description: Condition aliases the upstream type and adds additional + helper methods + properties: + lastTransitionTime: description: |- - HTTPEndpoint enables or disables the HTTP metadata endpoint on provisioned - nodes. If metadata options is non-nil, but this parameter is not specified, - the default state is "enabled". - - If you specify a value of "disabled", instance metadata will not be accessible - on the node. - enum: - - enabled - - disabled + lastTransitionTime is the last time the condition transitioned from one status to another. 
+ This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable. + format: date-time type: string - httpProtocolIPv6: - default: disabled + message: description: |- - HTTPProtocolIPv6 enables or disables the IPv6 endpoint for the instance metadata - service on provisioned nodes. If metadata options is non-nil, but this parameter - is not specified, the default state is "disabled". - enum: - - enabled - - disabled + message is a human readable message indicating details about the transition. + This may be an empty string. + maxLength: 32768 type: string - httpPutResponseHopLimit: - default: 1 + observedGeneration: description: |- - HTTPPutResponseHopLimit is the desired HTTP PUT response hop limit for - instance metadata requests. The larger the number, the further instance - metadata requests can travel. Possible values are integers from 1 to 64. - If metadata options is non-nil, but this parameter is not specified, the - default value is 1. + observedGeneration represents the .metadata.generation that the condition was set based upon. + For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date + with respect to the current state of the instance. format: int64 - maximum: 64 - minimum: 1 + minimum: 0 type: integer - httpTokens: - default: required + reason: description: |- - HTTPTokens determines the state of token usage for instance metadata - requests. If metadata options is non-nil, but this parameter is not - specified, the default state is "required". - - If the state is optional, one can choose to retrieve instance metadata with - or without a signed token header on the request. If one retrieves the IAM - role credentials without a token, the version 1.0 role credentials are - returned. If one retrieves the IAM role credentials using a valid signed - token, the version 2.0 role credentials are returned. 
- - If the state is "required", one must send a signed token header with any - instance metadata retrieval requests. In this state, retrieving the IAM - role credentials always returns the version 2.0 credentials; the version - 1.0 credentials are not available. + reason contains a programmatic identifier indicating the reason for the condition's last transition. + Producers of specific condition types may define expected values and meanings for this field, + and whether the values are considered a guaranteed API. + The value should be a CamelCase string. + This field may not be empty. + maxLength: 1024 + minLength: 1 + pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$ + type: string + status: + description: status of the condition, one of True, False, Unknown. enum: - - required - - optional + - "True" + - "False" + - Unknown + type: string + type: + description: type of condition in CamelCase or in foo.example.com/CamelCase. + maxLength: 316 + pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$ type: string + required: + - lastTransitionTime + - message + - reason + - status + - type type: object - role: - description: |- - Role is the AWS identity that nodes use. This field is immutable. - This field is mutually exclusive from instanceProfile. - Marking this field as immutable avoids concerns around terminating managed instance profiles from running instances. - This field may be made mutable in the future, assuming the correct garbage collection and drift handling is implemented - for the old instance profiles on an update. - type: string - x-kubernetes-validations: - - message: role cannot be empty - rule: self != '' - - message: immutable field changed - rule: self == oldSelf - securityGroupSelectorTerms: - description: SecurityGroupSelectorTerms is a list of or security group selector terms. The terms are ORed. 
- items: - description: |- - SecurityGroupSelectorTerm defines selection logic for a security group used by Karpenter to launch nodes. - If multiple fields are used for selection, the requirements are ANDed. - properties: - id: - description: ID is the security group id in EC2 - pattern: sg-[0-9a-z]+ - type: string - name: - description: |- - Name is the security group name in EC2. - This value is the name field, which is different from the name tag. - type: string - tags: - additionalProperties: - type: string - description: |- - Tags is a map of key/value tags used to select subnets - Specifying '*' for a value selects all values for a given tag key. - maxProperties: 20 - type: object - x-kubernetes-validations: - - message: empty tag keys or values aren't supported - rule: self.all(k, k != '' && self[k] != '') - type: object - maxItems: 30 - type: array - x-kubernetes-validations: - - message: securityGroupSelectorTerms cannot be empty - rule: self.size() != 0 - - message: expected at least one, got none, ['tags', 'id', 'name'] - rule: self.all(x, has(x.tags) || has(x.id) || has(x.name)) - - message: '''id'' is mutually exclusive, cannot be set with a combination of other fields in securityGroupSelectorTerms' - rule: '!self.all(x, has(x.id) && (has(x.tags) || has(x.name)))' - - message: '''name'' is mutually exclusive, cannot be set with a combination of other fields in securityGroupSelectorTerms' - rule: '!self.all(x, has(x.name) && (has(x.tags) || has(x.id)))' - subnetSelectorTerms: - description: SubnetSelectorTerms is a list of or subnet selector terms. The terms are ORed. - items: - description: |- - SubnetSelectorTerm defines selection logic for a subnet used by Karpenter to launch nodes. - If multiple fields are used for selection, the requirements are ANDed. 
- properties: - id: - description: ID is the subnet id in EC2 - pattern: subnet-[0-9a-z]+ - type: string - tags: - additionalProperties: - type: string - description: |- - Tags is a map of key/value tags used to select subnets - Specifying '*' for a value selects all values for a given tag key. - maxProperties: 20 - type: object - x-kubernetes-validations: - - message: empty tag keys or values aren't supported - rule: self.all(k, k != '' && self[k] != '') - type: object - maxItems: 30 - type: array - x-kubernetes-validations: - - message: subnetSelectorTerms cannot be empty - rule: self.size() != 0 - - message: expected at least one, got none, ['tags', 'id'] - rule: self.all(x, has(x.tags) || has(x.id)) - - message: '''id'' is mutually exclusive, cannot be set with a combination of other fields in subnetSelectorTerms' - rule: '!self.all(x, has(x.id) && has(x.tags))' - tags: - additionalProperties: - type: string - description: Tags to be applied on ec2 resources like instances and launch templates. + type: array + instanceProfile: + description: InstanceProfile contains the resolved instance profile + for the role + type: string + securityGroups: + description: |- + SecurityGroups contains the current Security Groups values that are available to the + cluster under the SecurityGroups selectors. 
+ items: + description: SecurityGroup contains resolved SecurityGroup selector + values utilized for node launch + properties: + id: + description: ID of the security group + type: string + name: + description: Name of the security group + type: string + required: + - id type: object - x-kubernetes-validations: - - message: empty tag keys aren't supported - rule: self.all(k, k != '') - - message: tag contains a restricted tag matching eks:eks-cluster-name - rule: self.all(k, k !='eks:eks-cluster-name') - - message: tag contains a restricted tag matching kubernetes.io/cluster/ - rule: self.all(k, !k.startsWith('kubernetes.io/cluster') ) - - message: tag contains a restricted tag matching karpenter.sh/nodepool - rule: self.all(k, k != 'karpenter.sh/nodepool') - - message: tag contains a restricted tag matching karpenter.sh/nodeclaim - rule: self.all(k, k !='karpenter.sh/nodeclaim') - - message: tag contains a restricted tag matching karpenter.k8s.aws/ec2nodeclass - rule: self.all(k, k !='karpenter.k8s.aws/ec2nodeclass') - userData: - description: |- - UserData to be applied to the provisioned nodes. - It must be in the appropriate format based on the AMIFamily in use. Karpenter will merge certain fields into - this UserData to ensure nodes are being provisioned with the correct configuration. - type: string - required: - - amiSelectorTerms - - securityGroupSelectorTerms - - subnetSelectorTerms - type: object - x-kubernetes-validations: - - message: must specify exactly one of ['role', 'instanceProfile'] - rule: (has(self.role) && !has(self.instanceProfile)) || (!has(self.role) && has(self.instanceProfile)) - - message: changing from 'instanceProfile' to 'role' is not supported. You must delete and recreate this node class if you want to change this. 
- rule: (has(oldSelf.role) && has(self.role)) || (has(oldSelf.instanceProfile) && has(self.instanceProfile)) - - message: if set, amiFamily must be 'AL2' or 'Custom' when using an AL2 alias - rule: '!has(self.amiFamily) || (self.amiSelectorTerms.exists(x, has(x.alias) && x.alias.find(''^[^@]+'') == ''al2'') ? (self.amiFamily == ''Custom'' || self.amiFamily == ''AL2'') : true)' - - message: if set, amiFamily must be 'AL2023' or 'Custom' when using an AL2023 alias - rule: '!has(self.amiFamily) || (self.amiSelectorTerms.exists(x, has(x.alias) && x.alias.find(''^[^@]+'') == ''al2023'') ? (self.amiFamily == ''Custom'' || self.amiFamily == ''AL2023'') : true)' - - message: if set, amiFamily must be 'Bottlerocket' or 'Custom' when using a Bottlerocket alias - rule: '!has(self.amiFamily) || (self.amiSelectorTerms.exists(x, has(x.alias) && x.alias.find(''^[^@]+'') == ''bottlerocket'') ? (self.amiFamily == ''Custom'' || self.amiFamily == ''Bottlerocket'') : true)' - - message: if set, amiFamily must be 'Windows2019' or 'Custom' when using a Windows2019 alias - rule: '!has(self.amiFamily) || (self.amiSelectorTerms.exists(x, has(x.alias) && x.alias.find(''^[^@]+'') == ''windows2019'') ? (self.amiFamily == ''Custom'' || self.amiFamily == ''Windows2019'') : true)' - - message: if set, amiFamily must be 'Windows2022' or 'Custom' when using a Windows2022 alias - rule: '!has(self.amiFamily) || (self.amiSelectorTerms.exists(x, has(x.alias) && x.alias.find(''^[^@]+'') == ''windows2022'') ? (self.amiFamily == ''Custom'' || self.amiFamily == ''Windows2022'') : true)' - - message: must specify amiFamily if amiSelectorTerms does not contain an alias - rule: 'self.amiSelectorTerms.exists(x, has(x.alias)) ? true : has(self.amiFamily)' - status: - description: EC2NodeClassStatus contains the resolved state of the EC2NodeClass - properties: - amis: - description: |- - AMI contains the current AMI values that are available to the - cluster under the AMI selectors. 
- items: - description: AMI contains resolved AMI selector values utilized for node launch - properties: - id: - description: ID of the AMI - type: string - name: - description: Name of the AMI - type: string - requirements: - description: Requirements of the AMI to be utilized on an instance type - items: - description: |- - A node selector requirement is a selector that contains values, a key, and an operator - that relates the key and values. - properties: - key: - description: The label key that the selector applies to. - type: string - operator: - description: |- - Represents a key's relationship to a set of values. - Valid operators are In, NotIn, Exists, DoesNotExist. Gt, and Lt. - type: string - values: - description: |- - An array of string values. If the operator is In or NotIn, - the values array must be non-empty. If the operator is Exists or DoesNotExist, - the values array must be empty. If the operator is Gt or Lt, the values - array must have a single element, which will be interpreted as an integer. - This array is replaced during a strategic merge patch. - items: - type: string - type: array - x-kubernetes-list-type: atomic - required: - - key - - operator - type: object - type: array - required: - - id - - requirements - type: object - type: array - conditions: - description: Conditions contains signals for health and readiness - items: - description: Condition aliases the upstream type and adds additional helper methods - properties: - lastTransitionTime: - description: |- - lastTransitionTime is the last time the condition transitioned from one status to another. - This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable. - format: date-time - type: string - message: - description: |- - message is a human readable message indicating details about the transition. - This may be an empty string. 
- maxLength: 32768 - type: string - observedGeneration: - description: |- - observedGeneration represents the .metadata.generation that the condition was set based upon. - For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date - with respect to the current state of the instance. - format: int64 - minimum: 0 - type: integer - reason: - description: |- - reason contains a programmatic identifier indicating the reason for the condition's last transition. - Producers of specific condition types may define expected values and meanings for this field, - and whether the values are considered a guaranteed API. - The value should be a CamelCase string. - This field may not be empty. - maxLength: 1024 - minLength: 1 - pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$ - type: string - status: - description: status of the condition, one of True, False, Unknown. - enum: - - "True" - - "False" - - Unknown - type: string - type: - description: type of condition in CamelCase or in foo.example.com/CamelCase. - maxLength: 316 - pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$ - type: string - required: - - lastTransitionTime - - message - - reason - - status - - type - type: object - type: array - instanceProfile: - description: InstanceProfile contains the resolved instance profile for the role - type: string - securityGroups: - description: |- - SecurityGroups contains the current Security Groups values that are available to the - cluster under the SecurityGroups selectors. 
- items: - description: SecurityGroup contains resolved SecurityGroup selector values utilized for node launch - properties: - id: - description: ID of the security group - type: string - name: - description: Name of the security group - type: string - required: - - id - type: object - type: array - subnets: - description: |- - Subnets contains the current Subnet values that are available to the - cluster under the subnet selectors. - items: - description: Subnet contains resolved Subnet selector values utilized for node launch - properties: - id: - description: ID of the subnet - type: string - zone: - description: The associated availability zone - type: string - zoneID: - description: The associated availability zone ID - type: string - required: - - id - - zone - type: object - type: array - type: object - type: object - served: true - storage: true - subresources: - status: {} + type: array + subnets: + description: |- + Subnets contains the current Subnet values that are available to the + cluster under the subnet selectors. 
+ items: + description: Subnet contains resolved Subnet selector values utilized + for node launch + properties: + id: + description: ID of the subnet + type: string + zone: + description: The associated availability zone + type: string + zoneID: + description: The associated availability zone ID + type: string + required: + - id + - zone + type: object + type: array + type: object + type: object + served: true + storage: true + subresources: + status: {} diff --git a/pkg/apis/crds/karpenter.sh_nodeclaims.yaml b/pkg/apis/crds/karpenter.sh_nodeclaims.yaml index 02fa4861acf5..e70b6e2af752 100644 --- a/pkg/apis/crds/karpenter.sh_nodeclaims.yaml +++ b/pkg/apis/crds/karpenter.sh_nodeclaims.yaml @@ -120,8 +120,6 @@ spec: rule: self in ["karpenter.sh/capacity-type", "karpenter.sh/nodepool"] || !self.find("^([^/]+)").endsWith("karpenter.sh") - message: label "kubernetes.io/hostname" is restricted rule: self != "kubernetes.io/hostname" - - message: label domain "karpenter.k8s.aws" is restricted - rule: self in ["karpenter.k8s.aws/instance-encryption-in-transit-supported", "karpenter.k8s.aws/instance-category", "karpenter.k8s.aws/instance-hypervisor", "karpenter.k8s.aws/instance-family", "karpenter.k8s.aws/instance-generation", "karpenter.k8s.aws/instance-local-nvme", "karpenter.k8s.aws/instance-size", "karpenter.k8s.aws/instance-cpu","karpenter.k8s.aws/instance-cpu-manufacturer","karpenter.k8s.aws/instance-memory", "karpenter.k8s.aws/instance-ebs-bandwidth", "karpenter.k8s.aws/instance-network-bandwidth", "karpenter.k8s.aws/instance-gpu-name", "karpenter.k8s.aws/instance-gpu-manufacturer", "karpenter.k8s.aws/instance-gpu-count", "karpenter.k8s.aws/instance-gpu-memory", "karpenter.k8s.aws/instance-accelerator-name", "karpenter.k8s.aws/instance-accelerator-manufacturer", "karpenter.k8s.aws/instance-accelerator-count"] || !self.find("^([^/]+)").endsWith("karpenter.k8s.aws") minValues: description: |- This field is ALPHA and can be dropped or replaced at any time diff --git 
a/pkg/apis/crds/karpenter.sh_nodepools.yaml b/pkg/apis/crds/karpenter.sh_nodepools.yaml index 0894d11feecb..a22d8befeb52 100644 --- a/pkg/apis/crds/karpenter.sh_nodepools.yaml +++ b/pkg/apis/crds/karpenter.sh_nodepools.yaml @@ -208,8 +208,6 @@ spec: rule: self.all(x, x != "karpenter.sh/nodepool") - message: label "kubernetes.io/hostname" is restricted rule: self.all(x, x != "kubernetes.io/hostname") - - message: label domain "karpenter.k8s.aws" is restricted - rule: self.all(x, x in ["karpenter.k8s.aws/instance-encryption-in-transit-supported", "karpenter.k8s.aws/instance-category", "karpenter.k8s.aws/instance-hypervisor", "karpenter.k8s.aws/instance-family", "karpenter.k8s.aws/instance-generation", "karpenter.k8s.aws/instance-local-nvme", "karpenter.k8s.aws/instance-size", "karpenter.k8s.aws/instance-cpu","karpenter.k8s.aws/instance-cpu-manufacturer","karpenter.k8s.aws/instance-memory", "karpenter.k8s.aws/instance-ebs-bandwidth", "karpenter.k8s.aws/instance-network-bandwidth", "karpenter.k8s.aws/instance-gpu-name", "karpenter.k8s.aws/instance-gpu-manufacturer", "karpenter.k8s.aws/instance-gpu-count", "karpenter.k8s.aws/instance-gpu-memory", "karpenter.k8s.aws/instance-accelerator-name", "karpenter.k8s.aws/instance-accelerator-manufacturer", "karpenter.k8s.aws/instance-accelerator-count"] || !x.find("^([^/]+)").endsWith("karpenter.k8s.aws")) type: object spec: description: |- @@ -267,8 +265,6 @@ spec: rule: self != "karpenter.sh/nodepool" - message: label "kubernetes.io/hostname" is restricted rule: self != "kubernetes.io/hostname" - - message: label domain "karpenter.k8s.aws" is restricted - rule: self in ["karpenter.k8s.aws/instance-encryption-in-transit-supported", "karpenter.k8s.aws/instance-category", "karpenter.k8s.aws/instance-hypervisor", "karpenter.k8s.aws/instance-family", "karpenter.k8s.aws/instance-generation", "karpenter.k8s.aws/instance-local-nvme", "karpenter.k8s.aws/instance-size", 
"karpenter.k8s.aws/instance-cpu","karpenter.k8s.aws/instance-cpu-manufacturer","karpenter.k8s.aws/instance-memory", "karpenter.k8s.aws/instance-ebs-bandwidth", "karpenter.k8s.aws/instance-network-bandwidth", "karpenter.k8s.aws/instance-gpu-name", "karpenter.k8s.aws/instance-gpu-manufacturer", "karpenter.k8s.aws/instance-gpu-count", "karpenter.k8s.aws/instance-gpu-memory", "karpenter.k8s.aws/instance-accelerator-name", "karpenter.k8s.aws/instance-accelerator-manufacturer", "karpenter.k8s.aws/instance-accelerator-count"] || !self.find("^([^/]+)").endsWith("karpenter.k8s.aws") minValues: description: |- This field is ALPHA and can be dropped or replaced at any time From 45c3e77b5dba6fb9fd30249bf0ff0894e0074a82 Mon Sep 17 00:00:00 2001 From: Bryant Biggs Date: Fri, 18 Oct 2024 19:37:36 -0500 Subject: [PATCH 13/15] chore: Remove auto changes made by `make` - no clue what this stuff is or doing --- .../karpenter.k8s.aws_ec2nodeclasses.yaml | 2 +- .../karpenter.k8s.aws_ec2nodeclasses.yaml | 1423 ++++++++--------- pkg/apis/crds/karpenter.sh_nodeclaims.yaml | 2 + pkg/apis/crds/karpenter.sh_nodepools.yaml | 4 + 4 files changed, 689 insertions(+), 742 deletions(-) diff --git a/charts/karpenter-crd/templates/karpenter.k8s.aws_ec2nodeclasses.yaml b/charts/karpenter-crd/templates/karpenter.k8s.aws_ec2nodeclasses.yaml index abd370251f5c..47901f77f660 100644 --- a/charts/karpenter-crd/templates/karpenter.k8s.aws_ec2nodeclasses.yaml +++ b/charts/karpenter-crd/templates/karpenter.k8s.aws_ec2nodeclasses.yaml @@ -3,7 +3,7 @@ apiVersion: apiextensions.k8s.io/v1 kind: CustomResourceDefinition metadata: annotations: - controller-gen.kubebuilder.io/version: v0.16.4 + controller-gen.kubebuilder.io/version: v0.16.3 name: ec2nodeclasses.karpenter.k8s.aws spec: group: karpenter.k8s.aws diff --git a/pkg/apis/crds/karpenter.k8s.aws_ec2nodeclasses.yaml b/pkg/apis/crds/karpenter.k8s.aws_ec2nodeclasses.yaml index 857e89c65326..47901f77f660 100644 --- 
a/pkg/apis/crds/karpenter.k8s.aws_ec2nodeclasses.yaml +++ b/pkg/apis/crds/karpenter.k8s.aws_ec2nodeclasses.yaml @@ -3,796 +3,737 @@ apiVersion: apiextensions.k8s.io/v1 kind: CustomResourceDefinition metadata: annotations: - controller-gen.kubebuilder.io/version: v0.16.4 + controller-gen.kubebuilder.io/version: v0.16.3 name: ec2nodeclasses.karpenter.k8s.aws spec: group: karpenter.k8s.aws names: categories: - - karpenter + - karpenter kind: EC2NodeClass listKind: EC2NodeClassList plural: ec2nodeclasses shortNames: - - ec2nc - - ec2ncs + - ec2nc + - ec2ncs singular: ec2nodeclass scope: Cluster versions: - - additionalPrinterColumns: - - jsonPath: .status.conditions[?(@.type=="Ready")].status - name: Ready - type: string - - jsonPath: .metadata.creationTimestamp - name: Age - type: date - - jsonPath: .spec.role - name: Role - priority: 1 - type: string - name: v1 - schema: - openAPIV3Schema: - description: EC2NodeClass is the Schema for the EC2NodeClass API - properties: - apiVersion: - description: |- - APIVersion defines the versioned schema of this representation of an object. - Servers should convert recognized schemas to the latest internal value, and - may reject unrecognized values. - More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources - type: string - kind: - description: |- - Kind is a string value representing the REST resource this object represents. - Servers may infer this from the endpoint the client submits requests to. - Cannot be updated. - In CamelCase. - More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds - type: string - metadata: - type: object - spec: - description: |- - EC2NodeClassSpec is the top level specification for the AWS Karpenter Provider. - This will contain configuration necessary to launch instances in AWS. 
- properties: - amiFamily: - description: |- - AMIFamily dictates the UserData format and default BlockDeviceMappings used when generating launch templates. - This field is optional when using an alias amiSelectorTerm, and the value will be inferred from the alias' - family. When an alias is specified, this field may only be set to its corresponding family or 'Custom'. If no - alias is specified, this field is required. - NOTE: We ignore the AMIFamily for hashing here because we hash the AMIFamily dynamically by using the alias using - the AMIFamily() helper function - enum: - - AL2 - - AL2023 - - Bottlerocket - - Custom - - Windows2019 - - Windows2022 - type: string - amiSelectorTerms: - description: AMISelectorTerms is a list of or ami selector terms. - The terms are ORed. - items: + - additionalPrinterColumns: + - jsonPath: .status.conditions[?(@.type=="Ready")].status + name: Ready + type: string + - jsonPath: .metadata.creationTimestamp + name: Age + type: date + - jsonPath: .spec.role + name: Role + priority: 1 + type: string + name: v1 + schema: + openAPIV3Schema: + description: EC2NodeClass is the Schema for the EC2NodeClass API + properties: + apiVersion: + description: |- + APIVersion defines the versioned schema of this representation of an object. + Servers should convert recognized schemas to the latest internal value, and + may reject unrecognized values. + More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources + type: string + kind: + description: |- + Kind is a string value representing the REST resource this object represents. + Servers may infer this from the endpoint the client submits requests to. + Cannot be updated. + In CamelCase. + More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds + type: string + metadata: + type: object + spec: + description: |- + EC2NodeClassSpec is the top level specification for the AWS Karpenter Provider. 
+ This will contain configuration necessary to launch instances in AWS. + properties: + amiFamily: description: |- - AMISelectorTerm defines selection logic for an ami used by Karpenter to launch nodes. - If multiple fields are used for selection, the requirements are ANDed. - properties: - alias: - description: |- - Alias specifies which EKS optimized AMI to select. - Each alias consists of a family and an AMI version, specified as "family@version". - Valid families include: al2, al2023, bottlerocket, windows2019, and windows2022. - The version can either be pinned to a specific AMI release, with that AMIs version format (ex: "al2023@v20240625" or "bottlerocket@v1.10.0"). - The version can also be set to "latest" for any family. Setting the version to latest will result in drift when a new AMI is released. This is **not** recommended for production environments. - Note: The Windows families do **not** support version pinning, and only latest may be used. - maxLength: 30 - type: string - x-kubernetes-validations: - - message: '''alias'' is improperly formatted, must match the - format ''family@version''' - rule: self.matches('^[a-zA-Z0-9]+@.+$') - - message: 'family is not supported, must be one of the following: - ''al2'', ''al2023'', ''bottlerocket'', ''windows2019'', - ''windows2022''' - rule: self.split('@')[0] in ['al2','al2023','bottlerocket','windows2019','windows2022'] - - message: windows families may only specify version 'latest' - rule: 'self.split(''@'')[0] in [''windows2019'',''windows2022''] - ? self.split(''@'')[1] == ''latest'' : true' - id: - description: ID is the ami id in EC2 - pattern: ami-[0-9a-z]+ - type: string - name: - description: |- - Name is the ami name in EC2. - This value is the name field, which is different from the name tag. - type: string - owner: - description: |- - Owner is the owner for the ami. 
- You can specify a combination of AWS account IDs, "self", "amazon", and "aws-marketplace" - type: string - tags: - additionalProperties: + AMIFamily dictates the UserData format and default BlockDeviceMappings used when generating launch templates. + This field is optional when using an alias amiSelectorTerm, and the value will be inferred from the alias' + family. When an alias is specified, this field may only be set to its corresponding family or 'Custom'. If no + alias is specified, this field is required. + NOTE: We ignore the AMIFamily for hashing here because we hash the AMIFamily dynamically by using the alias using + the AMIFamily() helper function + enum: + - AL2 + - AL2023 + - Bottlerocket + - Custom + - Windows2019 + - Windows2022 + type: string + amiSelectorTerms: + description: AMISelectorTerms is a list of or ami selector terms. The terms are ORed. + items: + description: |- + AMISelectorTerm defines selection logic for an ami used by Karpenter to launch nodes. + If multiple fields are used for selection, the requirements are ANDed. + properties: + alias: + description: |- + Alias specifies which EKS optimized AMI to select. + Each alias consists of a family and an AMI version, specified as "family@version". + Valid families include: al2, al2023, bottlerocket, windows2019, and windows2022. + The version can either be pinned to a specific AMI release, with that AMIs version format (ex: "al2023@v20240625" or "bottlerocket@v1.10.0"). + The version can also be set to "latest" for any family. Setting the version to latest will result in drift when a new AMI is released. This is **not** recommended for production environments. + Note: The Windows families do **not** support version pinning, and only latest may be used. + maxLength: 30 type: string - description: |- - Tags is a map of key/value tags used to select subnets - Specifying '*' for a value selects all values for a given tag key. 
- maxProperties: 20 - type: object - x-kubernetes-validations: - - message: empty tag keys or values aren't supported - rule: self.all(k, k != '' && self[k] != '') - type: object - maxItems: 30 - minItems: 1 - type: array - x-kubernetes-validations: - - message: expected at least one, got none, ['tags', 'id', 'name', - 'alias'] - rule: self.all(x, has(x.tags) || has(x.id) || has(x.name) || has(x.alias)) - - message: '''id'' is mutually exclusive, cannot be set with a combination - of other fields in amiSelectorTerms' - rule: '!self.exists(x, has(x.id) && (has(x.alias) || has(x.tags) - || has(x.name) || has(x.owner)))' - - message: '''alias'' is mutually exclusive, cannot be set with a - combination of other fields in amiSelectorTerms' - rule: '!self.exists(x, has(x.alias) && (has(x.id) || has(x.tags) - || has(x.name) || has(x.owner)))' - - message: '''alias'' is mutually exclusive, cannot be set with a - combination of other amiSelectorTerms' - rule: '!(self.exists(x, has(x.alias)) && self.size() != 1)' - associatePublicIPAddress: - description: AssociatePublicIPAddress controls if public IP addresses - are assigned to instances that are launched with the nodeclass. - type: boolean - blockDeviceMappings: - description: BlockDeviceMappings to be applied to provisioned nodes. - items: - properties: - deviceName: - description: The device name (for example, /dev/sdh or xvdh). - type: string - ebs: - description: EBS contains parameters used to automatically set - up EBS volumes when an instance is launched. - properties: - deleteOnTermination: - description: DeleteOnTermination indicates whether the EBS - volume is deleted on instance termination. - type: boolean - encrypted: - description: |- - Encrypted indicates whether the EBS volume is encrypted. Encrypted volumes can only - be attached to instances that support Amazon EBS encryption. If you are creating - a volume from a snapshot, you can't specify an encryption value. 
- type: boolean - iops: - description: |- - IOPS is the number of I/O operations per second (IOPS). For gp3, io1, and io2 volumes, - this represents the number of IOPS that are provisioned for the volume. For - gp2 volumes, this represents the baseline performance of the volume and the - rate at which the volume accumulates I/O credits for bursting. + x-kubernetes-validations: + - message: '''alias'' is improperly formatted, must match the format ''family@version''' + rule: self.matches('^[a-zA-Z0-9]+@.+$') + - message: 'family is not supported, must be one of the following: ''al2'', ''al2023'', ''bottlerocket'', ''windows2019'', ''windows2022''' + rule: self.split('@')[0] in ['al2','al2023','bottlerocket','windows2019','windows2022'] + - message: windows families may only specify version 'latest' + rule: 'self.split(''@'')[0] in [''windows2019'',''windows2022''] ? self.split(''@'')[1] == ''latest'' : true' + id: + description: ID is the ami id in EC2 + pattern: ami-[0-9a-z]+ + type: string + name: + description: |- + Name is the ami name in EC2. + This value is the name field, which is different from the name tag. + type: string + owner: + description: |- + Owner is the owner for the ami. + You can specify a combination of AWS account IDs, "self", "amazon", and "aws-marketplace" + type: string + tags: + additionalProperties: + type: string + description: |- + Tags is a map of key/value tags used to select subnets + Specifying '*' for a value selects all values for a given tag key. 
+ maxProperties: 20 + type: object + x-kubernetes-validations: + - message: empty tag keys or values aren't supported + rule: self.all(k, k != '' && self[k] != '') + type: object + maxItems: 30 + minItems: 1 + type: array + x-kubernetes-validations: + - message: expected at least one, got none, ['tags', 'id', 'name', 'alias'] + rule: self.all(x, has(x.tags) || has(x.id) || has(x.name) || has(x.alias)) + - message: '''id'' is mutually exclusive, cannot be set with a combination of other fields in amiSelectorTerms' + rule: '!self.exists(x, has(x.id) && (has(x.alias) || has(x.tags) || has(x.name) || has(x.owner)))' + - message: '''alias'' is mutually exclusive, cannot be set with a combination of other fields in amiSelectorTerms' + rule: '!self.exists(x, has(x.alias) && (has(x.id) || has(x.tags) || has(x.name) || has(x.owner)))' + - message: '''alias'' is mutually exclusive, cannot be set with a combination of other amiSelectorTerms' + rule: '!(self.exists(x, has(x.alias)) && self.size() != 1)' + associatePublicIPAddress: + description: AssociatePublicIPAddress controls if public IP addresses are assigned to instances that are launched with the nodeclass. + type: boolean + blockDeviceMappings: + description: BlockDeviceMappings to be applied to provisioned nodes. + items: + properties: + deviceName: + description: The device name (for example, /dev/sdh or xvdh). + type: string + ebs: + description: EBS contains parameters used to automatically set up EBS volumes when an instance is launched. + properties: + deleteOnTermination: + description: DeleteOnTermination indicates whether the EBS volume is deleted on instance termination. + type: boolean + encrypted: + description: |- + Encrypted indicates whether the EBS volume is encrypted. Encrypted volumes can only + be attached to instances that support Amazon EBS encryption. If you are creating + a volume from a snapshot, you can't specify an encryption value. 
+ type: boolean + iops: + description: |- + IOPS is the number of I/O operations per second (IOPS). For gp3, io1, and io2 volumes, + this represents the number of IOPS that are provisioned for the volume. For + gp2 volumes, this represents the baseline performance of the volume and the + rate at which the volume accumulates I/O credits for bursting. - The following are the supported values for each volume type: + The following are the supported values for each volume type: - * gp3: 3,000-16,000 IOPS + * gp3: 3,000-16,000 IOPS - * io1: 100-64,000 IOPS + * io1: 100-64,000 IOPS - * io2: 100-64,000 IOPS + * io2: 100-64,000 IOPS - For io1 and io2 volumes, we guarantee 64,000 IOPS only for Instances built - on the Nitro System (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-types.html#ec2-nitro-instances). - Other instance families guarantee performance up to 32,000 IOPS. + For io1 and io2 volumes, we guarantee 64,000 IOPS only for Instances built + on the Nitro System (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-types.html#ec2-nitro-instances). + Other instance families guarantee performance up to 32,000 IOPS. - This parameter is supported for io1, io2, and gp3 volumes only. This parameter - is not supported for gp2, st1, sc1, or standard volumes. - format: int64 - type: integer - kmsKeyID: - description: KMSKeyID (ARN) of the symmetric Key Management - Service (KMS) CMK used for encryption. - type: string - snapshotID: - description: SnapshotID is the ID of an EBS snapshot - type: string - throughput: - description: |- - Throughput to provision for a gp3 volume, with a maximum of 1,000 MiB/s. - Valid Range: Minimum value of 125. Maximum value of 1000. - format: int64 - type: integer - volumeSize: - description: |- - VolumeSize in `Gi`, `G`, `Ti`, or `T`. You must specify either a snapshot ID or - a volume size. 
The following are the supported volumes sizes for each volume - type: + This parameter is supported for io1, io2, and gp3 volumes only. This parameter + is not supported for gp2, st1, sc1, or standard volumes. + format: int64 + type: integer + kmsKeyID: + description: KMSKeyID (ARN) of the symmetric Key Management Service (KMS) CMK used for encryption. + type: string + snapshotID: + description: SnapshotID is the ID of an EBS snapshot + type: string + throughput: + description: |- + Throughput to provision for a gp3 volume, with a maximum of 1,000 MiB/s. + Valid Range: Minimum value of 125. Maximum value of 1000. + format: int64 + type: integer + volumeSize: + description: |- + VolumeSize in `Gi`, `G`, `Ti`, or `T`. You must specify either a snapshot ID or + a volume size. The following are the supported volumes sizes for each volume + type: - * gp2 and gp3: 1-16,384 + * gp2 and gp3: 1-16,384 - * io1 and io2: 4-16,384 + * io1 and io2: 4-16,384 - * st1 and sc1: 125-16,384 + * st1 and sc1: 125-16,384 - * standard: 1-1,024 - pattern: ^((?:[1-9][0-9]{0,3}|[1-4][0-9]{4}|[5][0-8][0-9]{3}|59000)Gi|(?:[1-9][0-9]{0,3}|[1-5][0-9]{4}|[6][0-3][0-9]{3}|64000)G|([1-9]||[1-5][0-7]|58)Ti|([1-9]||[1-5][0-9]|6[0-3]|64)T)$ - type: string - volumeType: - description: |- - VolumeType of the block device. - For more information, see Amazon EBS volume types (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html) - in the Amazon Elastic Compute Cloud User Guide. - enum: - - standard - - io1 - - io2 - - gp2 - - sc1 - - st1 - - gp3 - type: string - type: object - x-kubernetes-validations: - - message: snapshotID or volumeSize must be defined - rule: has(self.snapshotID) || has(self.volumeSize) - rootVolume: - description: |- - RootVolume is a flag indicating if this device is mounted as kubelet root dir. You can - configure at most one root volume in BlockDeviceMappings. 
- type: boolean - type: object - maxItems: 50 - type: array - x-kubernetes-validations: - - message: must have only one blockDeviceMappings with rootVolume - rule: self.filter(x, has(x.rootVolume)?x.rootVolume==true:false).size() - <= 1 - context: - description: |- - Context is a Reserved field in EC2 APIs - https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_CreateFleet.html - type: string - detailedMonitoring: - description: DetailedMonitoring controls if detailed monitoring is - enabled for instances that are launched - type: boolean - instanceProfile: - description: |- - InstanceProfile is the AWS entity that instances use. - This field is mutually exclusive from role. - The instance profile should already have a role assigned to it that Karpenter - has PassRole permission on for instance launch using this instanceProfile to succeed. - type: string - x-kubernetes-validations: - - message: instanceProfile cannot be empty - rule: self != '' - instanceStorePolicy: - description: InstanceStorePolicy specifies how to handle instance-store - disks. - enum: - - RAID0 - type: string - kubelet: - description: |- - Kubelet defines args to be used when configuring kubelet on provisioned nodes. - They are a subset of the upstream types, recognizing not all options may be supported. - Wherever possible, the types and names should reflect the upstream kubelet types. - properties: - clusterDNS: - description: |- - clusterDNS is a list of IP addresses for the cluster DNS server. - Note that not all providers may use all addresses. - items: - type: string - type: array - cpuCFSQuota: - description: CPUCFSQuota enables CPU CFS quota enforcement for - containers that specify CPU limits. 
- type: boolean - evictionHard: - additionalProperties: - type: string - description: EvictionHard is the map of signal names to quantities - that define hard eviction thresholds - type: object - x-kubernetes-validations: - - message: valid keys for evictionHard are ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available'] - rule: self.all(x, x in ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available']) - evictionMaxPodGracePeriod: - description: |- - EvictionMaxPodGracePeriod is the maximum allowed grace period (in seconds) to use when terminating pods in - response to soft eviction thresholds being met. - format: int32 - type: integer - evictionSoft: - additionalProperties: - type: string - description: EvictionSoft is the map of signal names to quantities - that define soft eviction thresholds - type: object - x-kubernetes-validations: - - message: valid keys for evictionSoft are ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available'] - rule: self.all(x, x in ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available']) - evictionSoftGracePeriod: - additionalProperties: - type: string - description: EvictionSoftGracePeriod is the map of signal names - to quantities that define grace periods for each eviction signal - type: object - x-kubernetes-validations: - - message: valid keys for evictionSoftGracePeriod are ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available'] - rule: self.all(x, x in ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available']) - imageGCHighThresholdPercent: - description: |- - ImageGCHighThresholdPercent is the percent of disk usage after which image - garbage collection is always run. 
The percent is calculated by dividing this - field value by 100, so this field must be between 0 and 100, inclusive. - When specified, the value must be greater than ImageGCLowThresholdPercent. - format: int32 - maximum: 100 - minimum: 0 - type: integer - imageGCLowThresholdPercent: - description: |- - ImageGCLowThresholdPercent is the percent of disk usage before which image - garbage collection is never run. Lowest disk usage to garbage collect to. - The percent is calculated by dividing this field value by 100, - so the field value must be between 0 and 100, inclusive. - When specified, the value must be less than imageGCHighThresholdPercent - format: int32 - maximum: 100 - minimum: 0 - type: integer - kubeReserved: - additionalProperties: - type: string - description: KubeReserved contains resources reserved for Kubernetes - system components. - type: object - x-kubernetes-validations: - - message: valid keys for kubeReserved are ['cpu','memory','ephemeral-storage','pid'] - rule: self.all(x, x=='cpu' || x=='memory' || x=='ephemeral-storage' - || x=='pid') - - message: kubeReserved value cannot be a negative resource quantity - rule: self.all(x, !self[x].startsWith('-')) - maxPods: - description: |- - MaxPods is an override for the maximum number of pods that can run on - a worker node instance. - format: int32 - minimum: 0 - type: integer - podsPerCore: - description: |- - PodsPerCore is an override for the number of pods that can run on a worker node - instance based on the number of cpu cores. This value cannot exceed MaxPods, so, if - MaxPods is a lower value, that value will be used. - format: int32 - minimum: 0 - type: integer - systemReserved: - additionalProperties: - type: string - description: SystemReserved contains resources reserved for OS - system daemons and kernel memory. 
+ * standard: 1-1,024 + pattern: ^((?:[1-9][0-9]{0,3}|[1-4][0-9]{4}|[5][0-8][0-9]{3}|59000)Gi|(?:[1-9][0-9]{0,3}|[1-5][0-9]{4}|[6][0-3][0-9]{3}|64000)G|([1-9]||[1-5][0-7]|58)Ti|([1-9]||[1-5][0-9]|6[0-3]|64)T)$ + type: string + volumeType: + description: |- + VolumeType of the block device. + For more information, see Amazon EBS volume types (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html) + in the Amazon Elastic Compute Cloud User Guide. + enum: + - standard + - io1 + - io2 + - gp2 + - sc1 + - st1 + - gp3 + type: string + type: object + x-kubernetes-validations: + - message: snapshotID or volumeSize must be defined + rule: has(self.snapshotID) || has(self.volumeSize) + rootVolume: + description: |- + RootVolume is a flag indicating if this device is mounted as kubelet root dir. You can + configure at most one root volume in BlockDeviceMappings. + type: boolean type: object - x-kubernetes-validations: - - message: valid keys for systemReserved are ['cpu','memory','ephemeral-storage','pid'] - rule: self.all(x, x=='cpu' || x=='memory' || x=='ephemeral-storage' - || x=='pid') - - message: systemReserved value cannot be a negative resource - quantity - rule: self.all(x, !self[x].startsWith('-')) - type: object - x-kubernetes-validations: - - message: imageGCHighThresholdPercent must be greater than imageGCLowThresholdPercent - rule: 'has(self.imageGCHighThresholdPercent) && has(self.imageGCLowThresholdPercent) - ? self.imageGCHighThresholdPercent > self.imageGCLowThresholdPercent : - true' - - message: evictionSoft OwnerKey does not have a matching evictionSoftGracePeriod - rule: has(self.evictionSoft) ? self.evictionSoft.all(e, (e in self.evictionSoftGracePeriod)):true - - message: evictionSoftGracePeriod OwnerKey does not have a matching - evictionSoft - rule: has(self.evictionSoftGracePeriod) ? 
self.evictionSoftGracePeriod.all(e, - (e in self.evictionSoft)):true - metadataOptions: - default: - httpEndpoint: enabled - httpProtocolIPv6: disabled - httpPutResponseHopLimit: 1 - httpTokens: required - description: |- - MetadataOptions for the generated launch template of provisioned nodes. - - This specifies the exposure of the Instance Metadata Service to - provisioned EC2 nodes. For more information, - see Instance Metadata and User Data - (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html) - in the Amazon Elastic Compute Cloud User Guide. - - Refer to recommended, security best practices - (https://aws.github.io/aws-eks-best-practices/security/docs/iam/#restrict-access-to-the-instance-profile-assigned-to-the-worker-node) - for limiting exposure of Instance Metadata and User Data to pods. - If omitted, defaults to httpEndpoint enabled, with httpProtocolIPv6 - disabled, with httpPutResponseLimit of 1, and with httpTokens - required. - properties: - httpEndpoint: - default: enabled - description: |- - HTTPEndpoint enables or disables the HTTP metadata endpoint on provisioned - nodes. If metadata options is non-nil, but this parameter is not specified, - the default state is "enabled". - - If you specify a value of "disabled", instance metadata will not be accessible - on the node. - enum: - - enabled - - disabled - type: string - httpProtocolIPv6: - default: disabled - description: |- - HTTPProtocolIPv6 enables or disables the IPv6 endpoint for the instance metadata - service on provisioned nodes. If metadata options is non-nil, but this parameter - is not specified, the default state is "disabled". - enum: - - enabled - - disabled - type: string - httpPutResponseHopLimit: - default: 1 - description: |- - HTTPPutResponseHopLimit is the desired HTTP PUT response hop limit for - instance metadata requests. The larger the number, the further instance - metadata requests can travel. Possible values are integers from 1 to 64. 
- If metadata options is non-nil, but this parameter is not specified, the - default value is 1. - format: int64 - maximum: 64 - minimum: 1 - type: integer - httpTokens: - default: required - description: |- - HTTPTokens determines the state of token usage for instance metadata - requests. If metadata options is non-nil, but this parameter is not - specified, the default state is "required". - - If the state is optional, one can choose to retrieve instance metadata with - or without a signed token header on the request. If one retrieves the IAM - role credentials without a token, the version 1.0 role credentials are - returned. If one retrieves the IAM role credentials using a valid signed - token, the version 2.0 role credentials are returned. - - If the state is "required", one must send a signed token header with any - instance metadata retrieval requests. In this state, retrieving the IAM - role credentials always returns the version 2.0 credentials; the version - 1.0 credentials are not available. - enum: - - required - - optional - type: string - type: object - role: - description: |- - Role is the AWS identity that nodes use. This field is immutable. - This field is mutually exclusive from instanceProfile. - Marking this field as immutable avoids concerns around terminating managed instance profiles from running instances. - This field may be made mutable in the future, assuming the correct garbage collection and drift handling is implemented - for the old instance profiles on an update. - type: string - x-kubernetes-validations: - - message: role cannot be empty - rule: self != '' - - message: immutable field changed - rule: self == oldSelf - securityGroupSelectorTerms: - description: SecurityGroupSelectorTerms is a list of or security group - selector terms. The terms are ORed. 
- items: + maxItems: 50 + type: array + x-kubernetes-validations: + - message: must have only one blockDeviceMappings with rootVolume + rule: self.filter(x, has(x.rootVolume)?x.rootVolume==true:false).size() <= 1 + context: description: |- - SecurityGroupSelectorTerm defines selection logic for a security group used by Karpenter to launch nodes. - If multiple fields are used for selection, the requirements are ANDed. + Context is a Reserved field in EC2 APIs + https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_CreateFleet.html + type: string + detailedMonitoring: + description: DetailedMonitoring controls if detailed monitoring is enabled for instances that are launched + type: boolean + instanceProfile: + description: |- + InstanceProfile is the AWS entity that instances use. + This field is mutually exclusive from role. + The instance profile should already have a role assigned to it that Karpenter + has PassRole permission on for instance launch using this instanceProfile to succeed. + type: string + x-kubernetes-validations: + - message: instanceProfile cannot be empty + rule: self != '' + instanceStorePolicy: + description: InstanceStorePolicy specifies how to handle instance-store disks. + enum: + - RAID0 + type: string + kubelet: + description: |- + Kubelet defines args to be used when configuring kubelet on provisioned nodes. + They are a subset of the upstream types, recognizing not all options may be supported. + Wherever possible, the types and names should reflect the upstream kubelet types. properties: - id: - description: ID is the security group id in EC2 - pattern: sg-[0-9a-z]+ - type: string - name: + clusterDNS: description: |- - Name is the security group name in EC2. - This value is the name field, which is different from the name tag. - type: string - tags: + clusterDNS is a list of IP addresses for the cluster DNS server. + Note that not all providers may use all addresses. 
+ items: + type: string + type: array + cpuCFSQuota: + description: CPUCFSQuota enables CPU CFS quota enforcement for containers that specify CPU limits. + type: boolean + evictionHard: additionalProperties: type: string + pattern: ^((\d{1,2}(\.\d{1,2})?|100(\.0{1,2})?)%||(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?)$ + description: EvictionHard is the map of signal names to quantities that define hard eviction thresholds + type: object + x-kubernetes-validations: + - message: valid keys for evictionHard are ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available'] + rule: self.all(x, x in ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available']) + evictionMaxPodGracePeriod: description: |- - Tags is a map of key/value tags used to select subnets - Specifying '*' for a value selects all values for a given tag key. - maxProperties: 20 + EvictionMaxPodGracePeriod is the maximum allowed grace period (in seconds) to use when terminating pods in + response to soft eviction thresholds being met. 
+ format: int32 + type: integer + evictionSoft: + additionalProperties: + type: string + pattern: ^((\d{1,2}(\.\d{1,2})?|100(\.0{1,2})?)%||(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?)$ + description: EvictionSoft is the map of signal names to quantities that define soft eviction thresholds type: object x-kubernetes-validations: - - message: empty tag keys or values aren't supported - rule: self.all(k, k != '' && self[k] != '') - type: object - maxItems: 30 - type: array - x-kubernetes-validations: - - message: securityGroupSelectorTerms cannot be empty - rule: self.size() != 0 - - message: expected at least one, got none, ['tags', 'id', 'name'] - rule: self.all(x, has(x.tags) || has(x.id) || has(x.name)) - - message: '''id'' is mutually exclusive, cannot be set with a combination - of other fields in securityGroupSelectorTerms' - rule: '!self.all(x, has(x.id) && (has(x.tags) || has(x.name)))' - - message: '''name'' is mutually exclusive, cannot be set with a combination - of other fields in securityGroupSelectorTerms' - rule: '!self.all(x, has(x.name) && (has(x.tags) || has(x.id)))' - subnetSelectorTerms: - description: SubnetSelectorTerms is a list of or subnet selector terms. - The terms are ORed. - items: - description: |- - SubnetSelectorTerm defines selection logic for a subnet used by Karpenter to launch nodes. - If multiple fields are used for selection, the requirements are ANDed. 
- properties: - id: - description: ID is the subnet id in EC2 - pattern: subnet-[0-9a-z]+ - type: string - tags: + - message: valid keys for evictionSoft are ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available'] + rule: self.all(x, x in ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available']) + evictionSoftGracePeriod: additionalProperties: type: string + description: EvictionSoftGracePeriod is the map of signal names to quantities that define grace periods for each eviction signal + type: object + x-kubernetes-validations: + - message: valid keys for evictionSoftGracePeriod are ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available'] + rule: self.all(x, x in ['memory.available','nodefs.available','nodefs.inodesFree','imagefs.available','imagefs.inodesFree','pid.available']) + imageGCHighThresholdPercent: + description: |- + ImageGCHighThresholdPercent is the percent of disk usage after which image + garbage collection is always run. The percent is calculated by dividing this + field value by 100, so this field must be between 0 and 100, inclusive. + When specified, the value must be greater than ImageGCLowThresholdPercent. + format: int32 + maximum: 100 + minimum: 0 + type: integer + imageGCLowThresholdPercent: description: |- - Tags is a map of key/value tags used to select subnets - Specifying '*' for a value selects all values for a given tag key. - maxProperties: 20 + ImageGCLowThresholdPercent is the percent of disk usage before which image + garbage collection is never run. Lowest disk usage to garbage collect to. + The percent is calculated by dividing this field value by 100, + so the field value must be between 0 and 100, inclusive. 
+ When specified, the value must be less than imageGCHighThresholdPercent + format: int32 + maximum: 100 + minimum: 0 + type: integer + kubeReserved: + additionalProperties: + type: string + pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ + description: KubeReserved contains resources reserved for Kubernetes system components. type: object x-kubernetes-validations: - - message: empty tag keys or values aren't supported - rule: self.all(k, k != '' && self[k] != '') - type: object - maxItems: 30 - type: array - x-kubernetes-validations: - - message: subnetSelectorTerms cannot be empty - rule: self.size() != 0 - - message: expected at least one, got none, ['tags', 'id'] - rule: self.all(x, has(x.tags) || has(x.id)) - - message: '''id'' is mutually exclusive, cannot be set with a combination - of other fields in subnetSelectorTerms' - rule: '!self.all(x, has(x.id) && has(x.tags))' - tags: - additionalProperties: - type: string - description: Tags to be applied on ec2 resources like instances and - launch templates. - type: object - x-kubernetes-validations: - - message: empty tag keys aren't supported - rule: self.all(k, k != '') - - message: tag contains a restricted tag matching eks:eks-cluster-name - rule: self.all(k, k !='eks:eks-cluster-name') - - message: tag contains a restricted tag matching kubernetes.io/cluster/ - rule: self.all(k, !k.startsWith('kubernetes.io/cluster') ) - - message: tag contains a restricted tag matching karpenter.sh/nodepool - rule: self.all(k, k != 'karpenter.sh/nodepool') - - message: tag contains a restricted tag matching karpenter.sh/nodeclaim - rule: self.all(k, k !='karpenter.sh/nodeclaim') - - message: tag contains a restricted tag matching karpenter.k8s.aws/ec2nodeclass - rule: self.all(k, k !='karpenter.k8s.aws/ec2nodeclass') - userData: - description: |- - UserData to be applied to the provisioned nodes. 
- It must be in the appropriate format based on the AMIFamily in use. Karpenter will merge certain fields into - this UserData to ensure nodes are being provisioned with the correct configuration. - type: string - required: - - amiSelectorTerms - - securityGroupSelectorTerms - - subnetSelectorTerms - type: object - x-kubernetes-validations: - - message: must specify exactly one of ['role', 'instanceProfile'] - rule: (has(self.role) && !has(self.instanceProfile)) || (!has(self.role) - && has(self.instanceProfile)) - - message: changing from 'instanceProfile' to 'role' is not supported. - You must delete and recreate this node class if you want to change - this. - rule: (has(oldSelf.role) && has(self.role)) || (has(oldSelf.instanceProfile) - && has(self.instanceProfile)) - - message: if set, amiFamily must be 'AL2' or 'Custom' when using an AL2 - alias - rule: '!has(self.amiFamily) || (self.amiSelectorTerms.exists(x, has(x.alias) - && x.alias.find(''^[^@]+'') == ''al2'') ? (self.amiFamily == ''Custom'' - || self.amiFamily == ''AL2'') : true)' - - message: if set, amiFamily must be 'AL2023' or 'Custom' when using an - AL2023 alias - rule: '!has(self.amiFamily) || (self.amiSelectorTerms.exists(x, has(x.alias) - && x.alias.find(''^[^@]+'') == ''al2023'') ? (self.amiFamily == ''Custom'' - || self.amiFamily == ''AL2023'') : true)' - - message: if set, amiFamily must be 'Bottlerocket' or 'Custom' when using - a Bottlerocket alias - rule: '!has(self.amiFamily) || (self.amiSelectorTerms.exists(x, has(x.alias) - && x.alias.find(''^[^@]+'') == ''bottlerocket'') ? (self.amiFamily - == ''Custom'' || self.amiFamily == ''Bottlerocket'') : true)' - - message: if set, amiFamily must be 'Windows2019' or 'Custom' when using - a Windows2019 alias - rule: '!has(self.amiFamily) || (self.amiSelectorTerms.exists(x, has(x.alias) - && x.alias.find(''^[^@]+'') == ''windows2019'') ? 
(self.amiFamily - == ''Custom'' || self.amiFamily == ''Windows2019'') : true)' - - message: if set, amiFamily must be 'Windows2022' or 'Custom' when using - a Windows2022 alias - rule: '!has(self.amiFamily) || (self.amiSelectorTerms.exists(x, has(x.alias) - && x.alias.find(''^[^@]+'') == ''windows2022'') ? (self.amiFamily - == ''Custom'' || self.amiFamily == ''Windows2022'') : true)' - - message: must specify amiFamily if amiSelectorTerms does not contain - an alias - rule: 'self.amiSelectorTerms.exists(x, has(x.alias)) ? true : has(self.amiFamily)' - status: - description: EC2NodeClassStatus contains the resolved state of the EC2NodeClass - properties: - amis: - description: |- - AMI contains the current AMI values that are available to the - cluster under the AMI selectors. - items: - description: AMI contains resolved AMI selector values utilized - for node launch - properties: - id: - description: ID of the AMI - type: string - name: - description: Name of the AMI - type: string - requirements: - description: Requirements of the AMI to be utilized on an instance - type - items: - description: |- - A node selector requirement is a selector that contains values, a key, and an operator - that relates the key and values. - properties: - key: - description: The label key that the selector applies to. - type: string - operator: - description: |- - Represents a key's relationship to a set of values. - Valid operators are In, NotIn, Exists, DoesNotExist. Gt, and Lt. - type: string - values: - description: |- - An array of string values. If the operator is In or NotIn, - the values array must be non-empty. If the operator is Exists or DoesNotExist, - the values array must be empty. If the operator is Gt or Lt, the values - array must have a single element, which will be interpreted as an integer. - This array is replaced during a strategic merge patch. 
- items: - type: string - type: array - x-kubernetes-list-type: atomic - required: - - key - - operator - type: object - type: array - required: - - id - - requirements + - message: valid keys for kubeReserved are ['cpu','memory','ephemeral-storage','pid'] + rule: self.all(x, x=='cpu' || x=='memory' || x=='ephemeral-storage' || x=='pid') + - message: kubeReserved value cannot be a negative resource quantity + rule: self.all(x, !self[x].startsWith('-')) + maxPods: + description: |- + MaxPods is an override for the maximum number of pods that can run on + a worker node instance. + format: int32 + minimum: 0 + type: integer + podsPerCore: + description: |- + PodsPerCore is an override for the number of pods that can run on a worker node + instance based on the number of cpu cores. This value cannot exceed MaxPods, so, if + MaxPods is a lower value, that value will be used. + format: int32 + minimum: 0 + type: integer + systemReserved: + additionalProperties: + type: string + pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ + description: SystemReserved contains resources reserved for OS system daemons and kernel memory. + type: object + x-kubernetes-validations: + - message: valid keys for systemReserved are ['cpu','memory','ephemeral-storage','pid'] + rule: self.all(x, x=='cpu' || x=='memory' || x=='ephemeral-storage' || x=='pid') + - message: systemReserved value cannot be a negative resource quantity + rule: self.all(x, !self[x].startsWith('-')) type: object - type: array - conditions: - description: Conditions contains signals for health and readiness - items: - description: Condition aliases the upstream type and adds additional - helper methods + x-kubernetes-validations: + - message: imageGCHighThresholdPercent must be greater than imageGCLowThresholdPercent + rule: 'has(self.imageGCHighThresholdPercent) && has(self.imageGCLowThresholdPercent) ? 
self.imageGCHighThresholdPercent > self.imageGCLowThresholdPercent : true' + - message: evictionSoft OwnerKey does not have a matching evictionSoftGracePeriod + rule: has(self.evictionSoft) ? self.evictionSoft.all(e, (e in self.evictionSoftGracePeriod)):true + - message: evictionSoftGracePeriod OwnerKey does not have a matching evictionSoft + rule: has(self.evictionSoftGracePeriod) ? self.evictionSoftGracePeriod.all(e, (e in self.evictionSoft)):true + metadataOptions: + default: + httpEndpoint: enabled + httpProtocolIPv6: disabled + httpPutResponseHopLimit: 1 + httpTokens: required + description: |- + MetadataOptions for the generated launch template of provisioned nodes. + + This specifies the exposure of the Instance Metadata Service to + provisioned EC2 nodes. For more information, + see Instance Metadata and User Data + (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html) + in the Amazon Elastic Compute Cloud User Guide. + + Refer to recommended, security best practices + (https://aws.github.io/aws-eks-best-practices/security/docs/iam/#restrict-access-to-the-instance-profile-assigned-to-the-worker-node) + for limiting exposure of Instance Metadata and User Data to pods. + If omitted, defaults to httpEndpoint enabled, with httpProtocolIPv6 + disabled, with httpPutResponseLimit of 1, and with httpTokens + required. properties: - lastTransitionTime: + httpEndpoint: + default: enabled description: |- - lastTransitionTime is the last time the condition transitioned from one status to another. - This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable. - format: date-time + HTTPEndpoint enables or disables the HTTP metadata endpoint on provisioned + nodes. If metadata options is non-nil, but this parameter is not specified, + the default state is "enabled". + + If you specify a value of "disabled", instance metadata will not be accessible + on the node. 
+ enum: + - enabled + - disabled type: string - message: + httpProtocolIPv6: + default: disabled description: |- - message is a human readable message indicating details about the transition. - This may be an empty string. - maxLength: 32768 + HTTPProtocolIPv6 enables or disables the IPv6 endpoint for the instance metadata + service on provisioned nodes. If metadata options is non-nil, but this parameter + is not specified, the default state is "disabled". + enum: + - enabled + - disabled type: string - observedGeneration: + httpPutResponseHopLimit: + default: 1 description: |- - observedGeneration represents the .metadata.generation that the condition was set based upon. - For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date - with respect to the current state of the instance. + HTTPPutResponseHopLimit is the desired HTTP PUT response hop limit for + instance metadata requests. The larger the number, the further instance + metadata requests can travel. Possible values are integers from 1 to 64. + If metadata options is non-nil, but this parameter is not specified, the + default value is 1. format: int64 - minimum: 0 + maximum: 64 + minimum: 1 type: integer - reason: + httpTokens: + default: required description: |- - reason contains a programmatic identifier indicating the reason for the condition's last transition. - Producers of specific condition types may define expected values and meanings for this field, - and whether the values are considered a guaranteed API. - The value should be a CamelCase string. - This field may not be empty. - maxLength: 1024 - minLength: 1 - pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$ - type: string - status: - description: status of the condition, one of True, False, Unknown. + HTTPTokens determines the state of token usage for instance metadata + requests. 
If metadata options is non-nil, but this parameter is not + specified, the default state is "required". + + If the state is optional, one can choose to retrieve instance metadata with + or without a signed token header on the request. If one retrieves the IAM + role credentials without a token, the version 1.0 role credentials are + returned. If one retrieves the IAM role credentials using a valid signed + token, the version 2.0 role credentials are returned. + + If the state is "required", one must send a signed token header with any + instance metadata retrieval requests. In this state, retrieving the IAM + role credentials always returns the version 2.0 credentials; the version + 1.0 credentials are not available. enum: - - "True" - - "False" - - Unknown - type: string - type: - description: type of condition in CamelCase or in foo.example.com/CamelCase. - maxLength: 316 - pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$ - type: string - required: - - lastTransitionTime - - message - - reason - - status - - type - type: object - type: array - instanceProfile: - description: InstanceProfile contains the resolved instance profile - for the role - type: string - securityGroups: - description: |- - SecurityGroups contains the current Security Groups values that are available to the - cluster under the SecurityGroups selectors. - items: - description: SecurityGroup contains resolved SecurityGroup selector - values utilized for node launch - properties: - id: - description: ID of the security group - type: string - name: - description: Name of the security group + - required + - optional type: string - required: - - id type: object - type: array - subnets: - description: |- - Subnets contains the current Subnet values that are available to the - cluster under the subnet selectors. 
- items: - description: Subnet contains resolved Subnet selector values utilized - for node launch - properties: - id: - description: ID of the subnet - type: string - zone: - description: The associated availability zone - type: string - zoneID: - description: The associated availability zone ID - type: string - required: - - id - - zone + role: + description: |- + Role is the AWS identity that nodes use. This field is immutable. + This field is mutually exclusive from instanceProfile. + Marking this field as immutable avoids concerns around terminating managed instance profiles from running instances. + This field may be made mutable in the future, assuming the correct garbage collection and drift handling is implemented + for the old instance profiles on an update. + type: string + x-kubernetes-validations: + - message: role cannot be empty + rule: self != '' + - message: immutable field changed + rule: self == oldSelf + securityGroupSelectorTerms: + description: SecurityGroupSelectorTerms is a list of or security group selector terms. The terms are ORed. + items: + description: |- + SecurityGroupSelectorTerm defines selection logic for a security group used by Karpenter to launch nodes. + If multiple fields are used for selection, the requirements are ANDed. + properties: + id: + description: ID is the security group id in EC2 + pattern: sg-[0-9a-z]+ + type: string + name: + description: |- + Name is the security group name in EC2. + This value is the name field, which is different from the name tag. + type: string + tags: + additionalProperties: + type: string + description: |- + Tags is a map of key/value tags used to select subnets + Specifying '*' for a value selects all values for a given tag key. 
+ maxProperties: 20 + type: object + x-kubernetes-validations: + - message: empty tag keys or values aren't supported + rule: self.all(k, k != '' && self[k] != '') + type: object + maxItems: 30 + type: array + x-kubernetes-validations: + - message: securityGroupSelectorTerms cannot be empty + rule: self.size() != 0 + - message: expected at least one, got none, ['tags', 'id', 'name'] + rule: self.all(x, has(x.tags) || has(x.id) || has(x.name)) + - message: '''id'' is mutually exclusive, cannot be set with a combination of other fields in securityGroupSelectorTerms' + rule: '!self.all(x, has(x.id) && (has(x.tags) || has(x.name)))' + - message: '''name'' is mutually exclusive, cannot be set with a combination of other fields in securityGroupSelectorTerms' + rule: '!self.all(x, has(x.name) && (has(x.tags) || has(x.id)))' + subnetSelectorTerms: + description: SubnetSelectorTerms is a list of or subnet selector terms. The terms are ORed. + items: + description: |- + SubnetSelectorTerm defines selection logic for a subnet used by Karpenter to launch nodes. + If multiple fields are used for selection, the requirements are ANDed. + properties: + id: + description: ID is the subnet id in EC2 + pattern: subnet-[0-9a-z]+ + type: string + tags: + additionalProperties: + type: string + description: |- + Tags is a map of key/value tags used to select subnets + Specifying '*' for a value selects all values for a given tag key. 
+ maxProperties: 20 + type: object + x-kubernetes-validations: + - message: empty tag keys or values aren't supported + rule: self.all(k, k != '' && self[k] != '') + type: object + maxItems: 30 + type: array + x-kubernetes-validations: + - message: subnetSelectorTerms cannot be empty + rule: self.size() != 0 + - message: expected at least one, got none, ['tags', 'id'] + rule: self.all(x, has(x.tags) || has(x.id)) + - message: '''id'' is mutually exclusive, cannot be set with a combination of other fields in subnetSelectorTerms' + rule: '!self.all(x, has(x.id) && has(x.tags))' + tags: + additionalProperties: + type: string + description: Tags to be applied on ec2 resources like instances and launch templates. type: object - type: array - type: object - type: object - served: true - storage: true - subresources: - status: {} + x-kubernetes-validations: + - message: empty tag keys aren't supported + rule: self.all(k, k != '') + - message: tag contains a restricted tag matching eks:eks-cluster-name + rule: self.all(k, k !='eks:eks-cluster-name') + - message: tag contains a restricted tag matching kubernetes.io/cluster/ + rule: self.all(k, !k.startsWith('kubernetes.io/cluster') ) + - message: tag contains a restricted tag matching karpenter.sh/nodepool + rule: self.all(k, k != 'karpenter.sh/nodepool') + - message: tag contains a restricted tag matching karpenter.sh/nodeclaim + rule: self.all(k, k !='karpenter.sh/nodeclaim') + - message: tag contains a restricted tag matching karpenter.k8s.aws/ec2nodeclass + rule: self.all(k, k !='karpenter.k8s.aws/ec2nodeclass') + userData: + description: |- + UserData to be applied to the provisioned nodes. + It must be in the appropriate format based on the AMIFamily in use. Karpenter will merge certain fields into + this UserData to ensure nodes are being provisioned with the correct configuration. 
+ type: string + required: + - amiSelectorTerms + - securityGroupSelectorTerms + - subnetSelectorTerms + type: object + x-kubernetes-validations: + - message: must specify exactly one of ['role', 'instanceProfile'] + rule: (has(self.role) && !has(self.instanceProfile)) || (!has(self.role) && has(self.instanceProfile)) + - message: changing from 'instanceProfile' to 'role' is not supported. You must delete and recreate this node class if you want to change this. + rule: (has(oldSelf.role) && has(self.role)) || (has(oldSelf.instanceProfile) && has(self.instanceProfile)) + - message: if set, amiFamily must be 'AL2' or 'Custom' when using an AL2 alias + rule: '!has(self.amiFamily) || (self.amiSelectorTerms.exists(x, has(x.alias) && x.alias.find(''^[^@]+'') == ''al2'') ? (self.amiFamily == ''Custom'' || self.amiFamily == ''AL2'') : true)' + - message: if set, amiFamily must be 'AL2023' or 'Custom' when using an AL2023 alias + rule: '!has(self.amiFamily) || (self.amiSelectorTerms.exists(x, has(x.alias) && x.alias.find(''^[^@]+'') == ''al2023'') ? (self.amiFamily == ''Custom'' || self.amiFamily == ''AL2023'') : true)' + - message: if set, amiFamily must be 'Bottlerocket' or 'Custom' when using a Bottlerocket alias + rule: '!has(self.amiFamily) || (self.amiSelectorTerms.exists(x, has(x.alias) && x.alias.find(''^[^@]+'') == ''bottlerocket'') ? (self.amiFamily == ''Custom'' || self.amiFamily == ''Bottlerocket'') : true)' + - message: if set, amiFamily must be 'Windows2019' or 'Custom' when using a Windows2019 alias + rule: '!has(self.amiFamily) || (self.amiSelectorTerms.exists(x, has(x.alias) && x.alias.find(''^[^@]+'') == ''windows2019'') ? (self.amiFamily == ''Custom'' || self.amiFamily == ''Windows2019'') : true)' + - message: if set, amiFamily must be 'Windows2022' or 'Custom' when using a Windows2022 alias + rule: '!has(self.amiFamily) || (self.amiSelectorTerms.exists(x, has(x.alias) && x.alias.find(''^[^@]+'') == ''windows2022'') ? 
(self.amiFamily == ''Custom'' || self.amiFamily == ''Windows2022'') : true)' + - message: must specify amiFamily if amiSelectorTerms does not contain an alias + rule: 'self.amiSelectorTerms.exists(x, has(x.alias)) ? true : has(self.amiFamily)' + status: + description: EC2NodeClassStatus contains the resolved state of the EC2NodeClass + properties: + amis: + description: |- + AMI contains the current AMI values that are available to the + cluster under the AMI selectors. + items: + description: AMI contains resolved AMI selector values utilized for node launch + properties: + id: + description: ID of the AMI + type: string + name: + description: Name of the AMI + type: string + requirements: + description: Requirements of the AMI to be utilized on an instance type + items: + description: |- + A node selector requirement is a selector that contains values, a key, and an operator + that relates the key and values. + properties: + key: + description: The label key that the selector applies to. + type: string + operator: + description: |- + Represents a key's relationship to a set of values. + Valid operators are In, NotIn, Exists, DoesNotExist. Gt, and Lt. + type: string + values: + description: |- + An array of string values. If the operator is In or NotIn, + the values array must be non-empty. If the operator is Exists or DoesNotExist, + the values array must be empty. If the operator is Gt or Lt, the values + array must have a single element, which will be interpreted as an integer. + This array is replaced during a strategic merge patch. 
+ items: + type: string + type: array + x-kubernetes-list-type: atomic + required: + - key + - operator + type: object + type: array + required: + - id + - requirements + type: object + type: array + conditions: + description: Conditions contains signals for health and readiness + items: + description: Condition aliases the upstream type and adds additional helper methods + properties: + lastTransitionTime: + description: |- + lastTransitionTime is the last time the condition transitioned from one status to another. + This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable. + format: date-time + type: string + message: + description: |- + message is a human readable message indicating details about the transition. + This may be an empty string. + maxLength: 32768 + type: string + observedGeneration: + description: |- + observedGeneration represents the .metadata.generation that the condition was set based upon. + For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date + with respect to the current state of the instance. + format: int64 + minimum: 0 + type: integer + reason: + description: |- + reason contains a programmatic identifier indicating the reason for the condition's last transition. + Producers of specific condition types may define expected values and meanings for this field, + and whether the values are considered a guaranteed API. + The value should be a CamelCase string. + This field may not be empty. + maxLength: 1024 + minLength: 1 + pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$ + type: string + status: + description: status of the condition, one of True, False, Unknown. + enum: + - "True" + - "False" + - Unknown + type: string + type: + description: type of condition in CamelCase or in foo.example.com/CamelCase. 
+ maxLength: 316 + pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$ + type: string + required: + - lastTransitionTime + - message + - reason + - status + - type + type: object + type: array + instanceProfile: + description: InstanceProfile contains the resolved instance profile for the role + type: string + securityGroups: + description: |- + SecurityGroups contains the current Security Groups values that are available to the + cluster under the SecurityGroups selectors. + items: + description: SecurityGroup contains resolved SecurityGroup selector values utilized for node launch + properties: + id: + description: ID of the security group + type: string + name: + description: Name of the security group + type: string + required: + - id + type: object + type: array + subnets: + description: |- + Subnets contains the current Subnet values that are available to the + cluster under the subnet selectors. + items: + description: Subnet contains resolved Subnet selector values utilized for node launch + properties: + id: + description: ID of the subnet + type: string + zone: + description: The associated availability zone + type: string + zoneID: + description: The associated availability zone ID + type: string + required: + - id + - zone + type: object + type: array + type: object + type: object + served: true + storage: true + subresources: + status: {} diff --git a/pkg/apis/crds/karpenter.sh_nodeclaims.yaml b/pkg/apis/crds/karpenter.sh_nodeclaims.yaml index e70b6e2af752..02fa4861acf5 100644 --- a/pkg/apis/crds/karpenter.sh_nodeclaims.yaml +++ b/pkg/apis/crds/karpenter.sh_nodeclaims.yaml @@ -120,6 +120,8 @@ spec: rule: self in ["karpenter.sh/capacity-type", "karpenter.sh/nodepool"] || !self.find("^([^/]+)").endsWith("karpenter.sh") - message: label "kubernetes.io/hostname" is restricted rule: self != "kubernetes.io/hostname" + - message: label domain "karpenter.k8s.aws" is restricted + rule: self in 
["karpenter.k8s.aws/instance-encryption-in-transit-supported", "karpenter.k8s.aws/instance-category", "karpenter.k8s.aws/instance-hypervisor", "karpenter.k8s.aws/instance-family", "karpenter.k8s.aws/instance-generation", "karpenter.k8s.aws/instance-local-nvme", "karpenter.k8s.aws/instance-size", "karpenter.k8s.aws/instance-cpu","karpenter.k8s.aws/instance-cpu-manufacturer","karpenter.k8s.aws/instance-memory", "karpenter.k8s.aws/instance-ebs-bandwidth", "karpenter.k8s.aws/instance-network-bandwidth", "karpenter.k8s.aws/instance-gpu-name", "karpenter.k8s.aws/instance-gpu-manufacturer", "karpenter.k8s.aws/instance-gpu-count", "karpenter.k8s.aws/instance-gpu-memory", "karpenter.k8s.aws/instance-accelerator-name", "karpenter.k8s.aws/instance-accelerator-manufacturer", "karpenter.k8s.aws/instance-accelerator-count"] || !self.find("^([^/]+)").endsWith("karpenter.k8s.aws") minValues: description: |- This field is ALPHA and can be dropped or replaced at any time diff --git a/pkg/apis/crds/karpenter.sh_nodepools.yaml b/pkg/apis/crds/karpenter.sh_nodepools.yaml index a22d8befeb52..0894d11feecb 100644 --- a/pkg/apis/crds/karpenter.sh_nodepools.yaml +++ b/pkg/apis/crds/karpenter.sh_nodepools.yaml @@ -208,6 +208,8 @@ spec: rule: self.all(x, x != "karpenter.sh/nodepool") - message: label "kubernetes.io/hostname" is restricted rule: self.all(x, x != "kubernetes.io/hostname") + - message: label domain "karpenter.k8s.aws" is restricted + rule: self.all(x, x in ["karpenter.k8s.aws/instance-encryption-in-transit-supported", "karpenter.k8s.aws/instance-category", "karpenter.k8s.aws/instance-hypervisor", "karpenter.k8s.aws/instance-family", "karpenter.k8s.aws/instance-generation", "karpenter.k8s.aws/instance-local-nvme", "karpenter.k8s.aws/instance-size", "karpenter.k8s.aws/instance-cpu","karpenter.k8s.aws/instance-cpu-manufacturer","karpenter.k8s.aws/instance-memory", "karpenter.k8s.aws/instance-ebs-bandwidth", "karpenter.k8s.aws/instance-network-bandwidth", 
"karpenter.k8s.aws/instance-gpu-name", "karpenter.k8s.aws/instance-gpu-manufacturer", "karpenter.k8s.aws/instance-gpu-count", "karpenter.k8s.aws/instance-gpu-memory", "karpenter.k8s.aws/instance-accelerator-name", "karpenter.k8s.aws/instance-accelerator-manufacturer", "karpenter.k8s.aws/instance-accelerator-count"] || !x.find("^([^/]+)").endsWith("karpenter.k8s.aws")) type: object spec: description: |- @@ -265,6 +267,8 @@ spec: rule: self != "karpenter.sh/nodepool" - message: label "kubernetes.io/hostname" is restricted rule: self != "kubernetes.io/hostname" + - message: label domain "karpenter.k8s.aws" is restricted + rule: self in ["karpenter.k8s.aws/instance-encryption-in-transit-supported", "karpenter.k8s.aws/instance-category", "karpenter.k8s.aws/instance-hypervisor", "karpenter.k8s.aws/instance-family", "karpenter.k8s.aws/instance-generation", "karpenter.k8s.aws/instance-local-nvme", "karpenter.k8s.aws/instance-size", "karpenter.k8s.aws/instance-cpu","karpenter.k8s.aws/instance-cpu-manufacturer","karpenter.k8s.aws/instance-memory", "karpenter.k8s.aws/instance-ebs-bandwidth", "karpenter.k8s.aws/instance-network-bandwidth", "karpenter.k8s.aws/instance-gpu-name", "karpenter.k8s.aws/instance-gpu-manufacturer", "karpenter.k8s.aws/instance-gpu-count", "karpenter.k8s.aws/instance-gpu-memory", "karpenter.k8s.aws/instance-accelerator-name", "karpenter.k8s.aws/instance-accelerator-manufacturer", "karpenter.k8s.aws/instance-accelerator-count"] || !self.find("^([^/]+)").endsWith("karpenter.k8s.aws") minValues: description: |- This field is ALPHA and can be dropped or replaced at any time From 309c27dc47ab5a863ce5ba53717289f23c7cd99e Mon Sep 17 00:00:00 2001 From: Bryant Biggs Date: Mon, 28 Oct 2024 18:49:20 -0500 Subject: [PATCH 14/15] docs: Update docs to reflect Neuron scheduler impact and changes to Neuron accelerator name well known label --- website/content/en/preview/concepts/scheduling.md | 14 +++++++++++--- .../content/en/preview/upgrading/upgrade-guide.md | 1 + 
2 files changed, 12 insertions(+), 3 deletions(-) diff --git a/website/content/en/preview/concepts/scheduling.md b/website/content/en/preview/concepts/scheduling.md index 161edc87a3b9..8cd9e6a0b568 100755 --- a/website/content/en/preview/concepts/scheduling.md +++ b/website/content/en/preview/concepts/scheduling.md @@ -89,15 +89,23 @@ spec: nvidia.com/gpu: "1" ``` {{% alert title="Note" color="primary" %}} -If you are provisioning GPU nodes, you need to deploy an appropriate GPU device plugin daemonset for those nodes. -Without the daemonset running, Karpenter will not see those nodes as initialized. +If you are provisioning nodes that will utilize accelerators/GPUs, you need to deploy the appropriate device plugin daemonset. +Without the respective device plugin daemonset, Karpenter will not see those nodes as initialized. Refer to general [Kubernetes GPU](https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/#deploying-amd-gpu-device-plugin) docs and the following specific GPU docs: * `nvidia.com/gpu`: [NVIDIA device plugin for Kubernetes](https://github.com/NVIDIA/k8s-device-plugin) * `amd.com/gpu`: [AMD GPU device plugin for Kubernetes](https://github.com/RadeonOpenCompute/k8s-device-plugin) -* `aws.amazon.com/neuron`: [Kubernetes environment setup for Neuron](https://github.com/aws-neuron/aws-neuron-sdk/tree/master/src/k8) +* `aws.amazon.com/neuron`/`aws.amazon.com/neuroncore`: [AWS Neuron device plugin for Kubernetes](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/containers/kubernetes-getting-started.html#neuron-device-plugin) * `habana.ai/gaudi`: [Habana device plugin for Kubernetes](https://docs.habana.ai/en/latest/Orchestration/Gaudi_Kubernetes/Habana_Device_Plugin_for_Kubernetes.html) {{% /alert %}} +#### AWS Neuron Resources + +The [Neuron scheduler extension](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/containers/kubernetes-getting-started.html#neuron-scheduler-extension) is required for pods that require more than one 
Neuron core (`aws.amazon.com/neuroncore`) or device (`aws.amazon.com/neuron`) resource, but less than all available Neuron cores or devices on a node. From the AWS Neuron documentation: + +> The Neuron scheduler extension finds sets of directly connected devices with minimal communication latency when scheduling containers. On Inf1 and Inf2 instance types where Neuron devices are connected through a ring topology, the scheduler finds sets of contiguous devices. For example, for a container requesting 3 Neuron devices the scheduler might assign Neuron devices 0,1,2 to the container if they are available but never devices 0,2,4 because those devices are not directly connected. On Trn1.32xlarge and Trn1n.32xlarge instance types where devices are connected through a 2D torus topology, the Neuron scheduler enforces additional constraints that containers request 1, 4, 8, or all 16 devices. If your container requires a different number of devices, such as 2 or 5, we recommend that you use an Inf2 instance instead of Trn1 to benefit from more advanced topology. + +However, Karpenter is not aware of the decisions made by the Neuron scheduler extension which precludes it from making any optimizations to consolidate and bin pack pods requiring Neuron resources. + ### Pod ENI Resources (Security Groups for Pods) [Pod ENI](https://github.com/aws/amazon-vpc-cni-k8s#enable_pod_eni-v170) is a feature of the AWS VPC CNI Plugin which allows an Elastic Network Interface (ENI) to be allocated directly to a Pod. When enabled, the `vpc.amazonaws.com/pod-eni` extended resource is added to supported nodes. The Pod ENI feature can be used independently, but is most often used in conjunction with Security Groups for Pods. Follow the below instructions to enable support for Pod ENI and/or Security Groups for Pods in Karpenter. 
diff --git a/website/content/en/preview/upgrading/upgrade-guide.md b/website/content/en/preview/upgrading/upgrade-guide.md index 4a5289d40768..f4f0bfaae3e3 100644 --- a/website/content/en/preview/upgrading/upgrade-guide.md +++ b/website/content/en/preview/upgrading/upgrade-guide.md @@ -37,6 +37,7 @@ WHEN CREATING A NEW SECTION OF THE UPGRADE GUIDANCE FOR NEWER VERSIONS, ENSURE T * Bottlerocket AMIFamily now supports `instanceStorePolicy: RAID0`. This means that Karpenter will auto-generate userData to RAID0 your instance store volumes (similar to AL2 and AL2023) when specifying this value. * Note: This userData configuration is _only_ valid on Bottlerocket v1.22.0+. If you are using an earlier version of a Bottlerocket image (< v1.22.0) with `amiFamily: Bottlerocket` and `instanceStorePolicy: RAID0`, nodes will fail to join the cluster. +* The AWS Neuron accelerator well known name label (`karpenter.k8s.aws/instance-accelerator-name`) values now reflect their correct names of `trainium`, `inferentia`, and `inferentia2`. Previously, all Neuron accelerators were assigned the label name of `inferentia`. ### Upgrading to `1.0.0`+ From 30e265d41d1be6ee41a87c1b45fbd457fc78f190 Mon Sep 17 00:00:00 2001 From: Bryant Biggs Date: Thu, 31 Oct 2024 22:23:37 +0000 Subject: [PATCH 15/15] Update scheduling.md Co-authored-by: Jason Deal --- website/content/en/preview/concepts/scheduling.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/content/en/preview/concepts/scheduling.md b/website/content/en/preview/concepts/scheduling.md index 8cd9e6a0b568..e5f8cd6dc8ae 100755 --- a/website/content/en/preview/concepts/scheduling.md +++ b/website/content/en/preview/concepts/scheduling.md @@ -104,7 +104,7 @@ The [Neuron scheduler extension](https://awsdocs-neuron.readthedocs-hosted.com/e > The Neuron scheduler extension finds sets of directly connected devices with minimal communication latency when scheduling containers. 
On Inf1 and Inf2 instance types where Neuron devices are connected through a ring topology, the scheduler finds sets of contiguous devices. For example, for a container requesting 3 Neuron devices the scheduler might assign Neuron devices 0,1,2 to the container if they are available but never devices 0,2,4 because those devices are not directly connected. On Trn1.32xlarge and Trn1n.32xlarge instance types where devices are connected through a 2D torus topology, the Neuron scheduler enforces additional constraints that containers request 1, 4, 8, or all 16 devices. If your container requires a different number of devices, such as 2 or 5, we recommend that you use an Inf2 instance instead of Trn1 to benefit from more advanced topology. -However, Karpenter is not aware of the decisions made by the Neuron scheduler extension which precludes it from making any optimizations to consolidate and bin pack pods requiring Neuron resources. +However, Karpenter is not aware of the decisions made by the Neuron scheduler extension which precludes it from making any optimizations to consolidate and bin pack pods requiring Neuron resources. To ensure Karpenter's bin-packing is consistent with the decisions made by the scheduler extension, containers must have like-sized, power of 2 requests (e.g. 1, 2, 4, etc). Failing to do so may result in permanently pending pods. ### Pod ENI Resources (Security Groups for Pods) [Pod ENI](https://github.com/aws/amazon-vpc-cni-k8s#enable_pod_eni-v170) is a feature of the AWS VPC CNI Plugin which allows an Elastic Network Interface (ENI) to be allocated directly to a Pod. When enabled, the `vpc.amazonaws.com/pod-eni` extended resource is added to supported nodes. The Pod ENI feature can be used independently, but is most often used in conjunction with Security Groups for Pods. Follow the below instructions to enable support for Pod ENI and/or Security Groups for Pods in Karpenter.
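The guidance above on like-sized, power-of-2 Neuron requests can be illustrated with a minimal pod sketch. This is an assumption-laden example, not part of the patch: the pod name, container name, and image are hypothetical placeholders, and the node selector value assumes the corrected `trainium` accelerator name described in the upgrade guide change. It also assumes the Neuron device plugin and the Neuron scheduler extension are already installed as described in the AWS Neuron documentation.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: neuron-example            # hypothetical name, for illustration only
spec:
  nodeSelector:
    # Well-known label from this patch series; value assumed per the upgrade guide note.
    karpenter.k8s.aws/instance-accelerator-name: trainium
  containers:
    - name: workload              # hypothetical container name
      image: <your-neuron-image>  # placeholder; substitute a real image
      resources:
        limits:
          # Power-of-2 device count (1, 2, 4, ...). Other pods sharing these nodes
          # would request the same size so that Karpenter's bin-packing stays
          # consistent with the scheduler extension's placement.
          aws.amazon.com/neuron: "4"
```

A request of 4 devices is both a power of 2 and one of the sizes (1, 4, 8, or 16) that the quoted Neuron documentation lists for Trn1.32xlarge and Trn1n.32xlarge. A workload sized by cores rather than devices would use `aws.amazon.com/neuroncore` in the same position.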