Bug Description
When --enable-endpoint-slices=true (the default since v3.3.0), the controller attempts to register IPv4 pod addresses into IPv6-only target groups for services with ipFamilyPolicy: RequireDualStack and ipFamilies: [IPv6, IPv4]. AWS rejects these with:
ValidationError: The IP address '10.35.x.x' is not a valid IPv6 address
The affected pods' readiness gates (target-health.elbv2.k8s.aws/*) are never satisfied, so the pods remain stuck in a not-ready state. Existing pods with stale pre-upgrade registrations continue serving traffic, which masks the impact until those pods are next rescheduled.
Root Cause
computeServiceEndpointsData in pkg/backend/endpoint_resolver.go lists all EndpointSlices for a service without filtering by address type:
r.k8sClient.List(ctx, epSliceList,
    client.InNamespace(svcKey.Namespace),
    client.MatchingLabels{discovery.LabelServiceName: svcKey.Name})
// returns BOTH addressType=IPv4 and addressType=IPv6 slices
For a RequireDualStack [IPv6, IPv4] service, Kubernetes creates two EndpointSlices — one per address family. Both are merged into a single flat list and fed into resolvePodEndpointsWithEndpointsData. This produces two PodEndpoint entries per pod (one with the IPv4 address, one with the IPv6 address).
When matchPodEndpointWithTargets runs against a TGB with ipAddressType: ipv6, the IPv6 endpoints match existing targets (already registered) while the IPv4 endpoints appear as unmatched and are submitted to RegisterTargets. AWS rejects the call since the target group is IPv6-only.
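The mismatch described above can be illustrated with a minimal, self-contained sketch. It does not use the controller's code: `podEndpoint` and `unmatched` are hypothetical, simplified stand-ins for the controller's PodEndpoint type and matching step, and the addresses are made up for illustration.

```go
package main

import "fmt"

// podEndpoint is a hypothetical, simplified stand-in for the controller's
// PodEndpoint type (assumption: only the fields needed for the illustration).
type podEndpoint struct {
	Pod string
	IP  string
}

// unmatched returns the endpoints whose IP is not among the already
// registered target IPs. It models only the "needs registration" half of
// the matching step, not the controller's actual implementation.
func unmatched(endpoints []podEndpoint, registered []string) []podEndpoint {
	seen := make(map[string]bool, len(registered))
	for _, ip := range registered {
		seen[ip] = true
	}
	var out []podEndpoint
	for _, ep := range endpoints {
		if !seen[ep.IP] {
			out = append(out, ep)
		}
	}
	return out
}

func main() {
	// One pod, two entries: the IPv4 and IPv6 EndpointSlices were merged.
	endpoints := []podEndpoint{
		{Pod: "web-0", IP: "fd00::10"},
		{Pod: "web-0", IP: "10.35.0.10"},
	}
	// The IPv6-only target group already contains the IPv6 address.
	registered := []string{"fd00::10"}

	for _, ep := range unmatched(endpoints, registered) {
		fmt.Printf("would register %s (%s)\n", ep.IP, ep.Pod)
	}
}
```

Under these assumptions the only candidate left for registration is the IPv4 address, which is the request the IPv6-only target group rejects.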
The bug has existed since EndpointSlice support was added (PR #2169, Sep 2021) but stayed dormant because --enable-endpoint-slices defaulted to false. PR #4353 (merged Sep 2025, shipped in v3.3.0) flipped the default to true, exposing the bug.
With --enable-endpoint-slices=false (the pre-v3.3.0 default), the legacy corev1.Endpoints path is used, which only exposes the primary IP family — for ipFamilies: [IPv6, IPv4] that is IPv6 — so the bug is never triggered.
Steps to Reproduce
- Dual-stack cluster where pods receive both IPv4 and IPv6 addresses
- Service with ipFamilyPolicy: RequireDualStack and ipFamilies: [IPv6, IPv4]
- Ingress with alb.ingress.kubernetes.io/ip-address-type: dualstack and target-type: ip
- Controller running with --enable-endpoint-slices=true (default in v3.3.0+)
Controller continuously logs:
Reconciler error controller=targetGroupBinding
error="operation error Elastic Load Balancing v2: RegisterTargets,
api error ValidationError: The IP address '10.x.x.x' is not a valid IPv6 address"
Expected Behavior
For a TGB with ipAddressType: ipv6, only IPv6 pod addresses should be submitted to RegisterTargets. The controller should not attempt to register IPv4 addresses into an IPv6-only target group.
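One possible way to get this behaviour is to filter by address family before any addresses reach registration. The sketch below is an illustration under stated assumptions, not a patch for the controller: `endpointSlice` is a hypothetical, simplified stand-in for the Kubernetes EndpointSlice type, `addressesForTargetGroup` is an invented helper name, and the address family check uses the standard library's `net/netip` package.

```go
package main

import (
	"fmt"
	"net/netip"
)

// endpointSlice is a hypothetical, simplified stand-in for a Kubernetes
// EndpointSlice: an address type ("IPv4" or "IPv6") plus its addresses.
type endpointSlice struct {
	AddressType string
	Addresses   []string
}

// addressesForTargetGroup (hypothetical helper) keeps only addresses whose
// family matches the target group's IP address type ("ipv4" or "ipv6").
// Slices of the other family are skipped, and each remaining address is
// parsed as a second check.
func addressesForTargetGroup(slices []endpointSlice, tgIPAddressType string) []string {
	wantV6 := tgIPAddressType == "ipv6"
	var out []string
	for _, s := range slices {
		if (s.AddressType == "IPv6") != wantV6 {
			continue // slice belongs to the other address family
		}
		for _, a := range s.Addresses {
			addr, err := netip.ParseAddr(a)
			if err != nil || addr.Is6() != wantV6 {
				continue // unparsable or wrong-family address
			}
			out = append(out, a)
		}
	}
	return out
}

func main() {
	slices := []endpointSlice{
		{AddressType: "IPv6", Addresses: []string{"fd00::10", "fd00::11"}},
		{AddressType: "IPv4", Addresses: []string{"10.35.0.10", "10.35.0.11"}},
	}
	fmt.Println(addressesForTargetGroup(slices, "ipv6"))
}
```

Filtering at the slice level keeps the IPv4 slice out of the merged list entirely, so the later matching step never sees IPv4 endpoints for an IPv6 target group. Whether the real fix belongs in the list call or in a later step is left open here.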
Workaround
Set --enable-endpoint-slices=false to revert to the legacy Endpoints path. This restores the pre-v3.3.0 behaviour at the cost of not using EndpointSlices.
Environment
- Controller version: v3.3.0 (also reproducible on v2.13.4 with --enable-endpoint-slices=true)
- Not affected: SingleStack IPv6 services (only one EndpointSlice, no IPv4 addresses to submit)