Description
Describe the bug
Cluster provisioning fails when the spot interrupt handler is enabled and ASG capacity is used with recent Kubernetes versions.
I created a simple cluster with EKS Blueprints for CDK and used the ASG capacity provider.
The CDK code calls cluster.addAutoScalingGroupCapacity with spotInterruptHandler set to true (the default setting).
Provisioning fails with the following exception:
Received response status [FAILED] from custom resource. Message returned: Error: b'Release "asgtestchartspotinterrupthandler88cd0a56" does not exist. Installing it now.\nError: unable to build kubernetes objects from release manifest: resource mapping not found for name: "asgtestchartspotinterrupthandler88cd0a56-aws-node-termination-h" namespace: "" from "": no matches for kind "PodSecurityPolicy" in version "policy/v1beta1"\nensure CRDs are installed first\n'
Logs: /aws/lambda/asg-test-awscdkawseksKubectlProvid-Handler886CB40B-VjmYzuKObYxM
at invokeUserFunction (/var/task/framework.js:2:6)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async onEvent (/var/task/framework.js:1:369)
at async Runtime.handler (/var/task/cfn-response.js:1:1837) (RequestId: d7f295fe-9046-45b5-bbe3-fcc72f9cc84d)
Similar results when setting Kubernetes version to 1.29 and 1.30.
Narrowed down to this code: https://github.com/aws/aws-cdk/blob/main/packages/aws-cdk-lib/aws-eks/lib/cluster.ts#L1163-L1176
Why is the Helm chart version hardcoded?
Regression Issue
- Select this option if this issue appears to be a regression.
Last Known Working CDK Version
No response
Expected Behavior
Cluster provisioned with ASG capacity and spot interrupt handler installed.
Current Behavior
CloudFormation provisioning fails with the exception shown in the description above.
Reproduction Steps
Create a simple cluster with EKS Blueprints for CDK and use the ASG capacity provider.
The CDK code calls cluster.addAutoScalingGroupCapacity with spotInterruptHandler set to true (the default setting).
Possible Solution
Maintain a map of node termination handler chart versions that are supported by the latest Kubernetes/EKS versions, or allow customers to pass the chart version (less preferred).
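To make the proposal concrete, here is a minimal sketch of how the two options could be combined: a lookup keyed by Kubernetes version, with an optional caller-supplied override. The names `NTH_CHART_VERSION_BY_K8S_VERSION` and `resolveNthChartVersion` are hypothetical and are not part of aws-cdk-lib. The map entries are placeholders built only from the versions mentioned in this issue (1.29, 1.30, and chart 0.25.1), not verified compatibility data.

```typescript
// Hypothetical lookup table. Placeholder entries only: 1.29 and 1.30 are the
// Kubernetes versions named in this issue, and 0.25.1 is the chart version
// named in the workaround. Real entries would need to be verified.
const NTH_CHART_VERSION_BY_K8S_VERSION: ReadonlyMap<string, string> = new Map([
  ["1.29", "0.25.1"],
  ["1.30", "0.25.1"],
]);

/**
 * Picks the node termination handler chart version for a cluster.
 * An explicit override wins; otherwise the map is consulted.
 * Throws when neither source yields a version, so that an incompatible
 * default is never installed silently.
 */
function resolveNthChartVersion(kubernetesVersion: string, override?: string): string {
  if (override !== undefined) {
    return override;
  }
  const mapped = NTH_CHART_VERSION_BY_K8S_VERSION.get(kubernetesVersion);
  if (mapped === undefined) {
    throw new Error(
      `No node termination handler chart version mapped for Kubernetes ${kubernetesVersion}`,
    );
  }
  return mapped;
}
```

Failing fast on an unmapped version is one possible design choice; falling back to the newest known chart version would be another.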
Additional Information/Context
A potential workaround is to disable the spot interrupt handler and install the node termination handler Helm chart separately with a compatible chart version, e.g.
version: "0.25.1",
repository: 'oci://public.ecr.aws/aws-ec2/helm/aws-node-termination-handler',
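The workaround could look roughly like the sketch below. The helper `nthChartOptions` and the interface `NthChartOptions` are hypothetical names used for illustration. The version and repository values come from this issue; the chart name and the `kube-system` namespace are assumptions. The helper itself has no CDK dependency, and the calls against the cluster are shown as comments because they require aws-cdk-lib and an existing `eks.Cluster`.

```typescript
// Hypothetical shape: a subset of the options accepted by cluster.addHelmChart.
interface NthChartOptions {
  chart: string;
  repository: string;
  version: string;
  namespace: string;
}

// Builds Helm chart options for aws-node-termination-handler.
// The default version and the repository are the values quoted in this issue.
function nthChartOptions(version: string = "0.25.1"): NthChartOptions {
  return {
    chart: "aws-node-termination-handler", // assumed chart name
    repository: "oci://public.ecr.aws/aws-ec2/helm/aws-node-termination-handler",
    version,
    namespace: "kube-system", // assumption: adjust to your cluster's conventions
  };
}

// Usage inside a CDK stack (instance type and spot price are illustrative):
//
//   cluster.addAutoScalingGroupCapacity("SpotCapacity", {
//     instanceType: new ec2.InstanceType("t3.large"),
//     spotPrice: "0.10",
//     spotInterruptHandler: false, // skip the built-in chart with the hardcoded version
//   });
//   cluster.addHelmChart("NodeTerminationHandler", nthChartOptions());
```

With `spotInterruptHandler` set to false, the built-in chart install is skipped, so the only node termination handler on the cluster is the one added explicitly.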
CDK CLI Version
2.173.4
Framework Version
No response
Node.js Version
20.10
OS
macOS
Language
TypeScript
Language Version
No response
Other information
No response