Description
Describe what happened
Setting up a cluster with IPv6 and the default node security group leads to a non-working state: CoreDNS is crash looping.
As far as I understand, these crash loops are caused by the default node security group not allowing IPv6 traffic to the Kubernetes API server.
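One way to reason about this diagnosis is to check whether any egress rule on the node security group covers IPv6 traffic on port 443. The following is only a sketch on plain data: the `EgressRule` shape and the `allowsIpv6Egress` helper are hypothetical names made up for illustration, and the example rule lists are assumptions, not the rules the provider actually creates.

```typescript
// Hypothetical, simplified shape of a security group egress rule.
interface EgressRule {
  protocol: string; // '-1' means all protocols
  fromPort: number;
  toPort: number;
  cidrBlocks?: string[];
  ipv6CidrBlocks?: string[];
}

// Returns true if any rule lets TCP traffic to `port` leave over IPv6
// towards any destination (::/0).
function allowsIpv6Egress(rules: EgressRule[], port: number): boolean {
  return rules.some((rule) => {
    const toAnyIpv6 = (rule.ipv6CidrBlocks ?? []).includes('::/0');
    const protocolOk = rule.protocol === '-1' || rule.protocol === 'tcp';
    const portOk =
      rule.protocol === '-1' || (rule.fromPort <= port && port <= rule.toPort);
    return toAnyIpv6 && protocolOk && portOk;
  });
}

// Assumed example: a group that only allows IPv4 egress.
const ipv4OnlyRules: EgressRule[] = [
  { protocol: '-1', fromPort: 0, toPort: 0, cidrBlocks: ['0.0.0.0/0'] },
];

// The same group with an additional IPv6 egress rule.
const dualStackRules: EgressRule[] = [
  ...ipv4OnlyRules,
  { protocol: '-1', fromPort: 0, toPort: 0, ipv6CidrBlocks: ['::/0'] },
];

console.log(allowsIpv6Egress(ipv4OnlyRules, 443)); // false
console.log(allowsIpv6Egress(dualStackRules, 443)); // true
```

If the real node security group looks like the first list, pods cannot reach an API server address such as `[fd43:db82:c27f::1]:443`, which would match the `i/o timeout` errors in the CoreDNS log.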
Sample program
The following snippet reproduces the problem. The referenced VPC is just a VPC with IPv6 enabled (the default VPC is not IPv6-enabled).
⚠ I manually added the AWS CNI IPv6 policy in the web console; that is a separate issue. I added the policy exactly as stated in the AWS docs.
new eks.Cluster('test', {
  ipFamily: 'ipv6',
  vpcId: vpc.vpc.id,
  subnetIds: vpc.subnetIds,
})
Log output
CoreDNS reports that the Kubernetes control plane is not reachable. This only applies if you don't use the VPC CNI plugin; otherwise CoreDNS stays in the Pending state forever with "failed to setup network for sandbox" and "failed to assign an IP address to container".
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/ready: Still waiting on: "kubernetes"
[WARNING] plugin/kubernetes: starting server with unsynced Kubernetes API
.:53
[INFO] plugin/reload: Running configuration SHA512 = 8a7d59126e7f114ab49c6d2613be93d8ef7d408af8ee61a710210843dc409f03133727e38f64469d9bb180f396c84ebf48a42bde3b3769730865ca9df5eb281c
CoreDNS-1.11.4
linux/arm64, go1.23.4, 893c3c661
[INFO] plugin/ready: Still waiting on: "kubernetes"
[INFO] plugin/kubernetes: pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:243: failed to list *v1.Service: Get "https://[fd43:db82:c27f::1]:443/api/v1/services?limit=500&resourceVersion=0": dial tcp [fd43:db82:c27f::1]:443: i/o timeout
[ERROR] plugin/kubernetes: Unhandled Error
[INFO] plugin/kubernetes: pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:243: failed to list *v1.Namespace: Get "https://[fd43:db82:c27f::1]:443/api/v1/namespaces?limit=500&resourceVersion=0": dial tcp [fd43:db82:c27f::1]:443: i/o timeout
[ERROR] plugin/kubernetes: Unhandled Error
[INFO] plugin/kubernetes: pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:243: failed to list *v1.EndpointSlice: Get "https://[fd43:db82:c27f::1]:443/apis/discovery.k8s.io/v1/endpointslices?limit=500&resourceVersion=0": dial tcp [fd43:db82:c27f::1]:443: i/o timeout
[ERROR] plugin/kubernetes: Unhandled Error
[INFO] plugin/ready: Still waiting on: "kubernetes"
Affected Resource(s)
Every workload that communicates with an IPv6 address is affected; even the Kubernetes control plane is not reachable.
Output of pulumi about
CLI
Version 3.156.0
Go Version go1.24.1
Go Compiler gc
Plugins
KIND NAME VERSION
language nodejs 3.156.0
Host
OS arch
Version
Arch x86_64
This project is written in nodejs: executable='/usr/bin/node' version='v22.14.0'
Current Stack: organization/k8s-cluster-core/foo
TYPE URN
...
Additional context
After adding a rule that allows IPv6 internet egress, everything seems to work as expected.
new aws.ec2.SecurityGroupRule(
  `${name}-eksNodeInternetEgressV6Rule`,
  {
    description: 'Allow internet v6 access.',
    type: 'egress',
    fromPort: 0,
    toPort: 0,
    protocol: '-1', // all protocols
    ipv6CidrBlocks: ['::/0'],
    securityGroupId: nodeSecurityGroupId,
  }
)
This snippet probably needs to be added here or here (I'm a bit confused about the sdk vs. nodejs folders).
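As a rough sketch of what such a fix could look like, the rule arguments could be derived from the cluster's IP family, so that the IPv6 rule is only added for IPv6 clusters. Everything here is hypothetical: `internetEgressRules`, `IpFamily` and `EgressRuleArgs` are illustrative names, the IPv4 rule is an assumption about what the default group contains today, and plain strings stand in for Pulumi inputs. It is not the provider's actual code.

```typescript
type IpFamily = 'ipv4' | 'ipv6';

// Hypothetical plain-data mirror of the arguments used in the workaround above.
interface EgressRuleArgs {
  description: string;
  type: 'egress';
  fromPort: number;
  toPort: number;
  protocol: string;
  cidrBlocks?: string[];
  ipv6CidrBlocks?: string[];
  securityGroupId: string;
}

// Builds the internet egress rules for a node security group.
// The IPv4 rule is always included (assumed current default); the IPv6 rule
// is added only when the cluster uses the IPv6 family.
function internetEgressRules(
  ipFamily: IpFamily,
  securityGroupId: string,
): EgressRuleArgs[] {
  const base = {
    type: 'egress' as const,
    fromPort: 0,
    toPort: 0,
    protocol: '-1', // all protocols
    securityGroupId,
  };
  const rules: EgressRuleArgs[] = [
    { ...base, description: 'Allow internet access.', cidrBlocks: ['0.0.0.0/0'] },
  ];
  if (ipFamily === 'ipv6') {
    rules.push({
      ...base,
      description: 'Allow internet v6 access.',
      ipv6CidrBlocks: ['::/0'],
    });
  }
  return rules;
}

// 'sg-example' is a placeholder ID, not a real security group.
console.log(internetEgressRules('ipv6', 'sg-example').length); // 2
```

Each returned object could then be passed to its own `aws.ec2.SecurityGroupRule`, as in the workaround.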
Contributing
Vote on this issue by adding a 👍 reaction.
To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).