Description
Hello, I ran into problems setting up the AWS Load Balancer Controller on a self-managed Kubernetes cluster. Despite multiple attempts, the controller never becomes functional. I would appreciate help diagnosing and resolving the issue.
Background Information
- Kubernetes Version: 1.31.1 (deployed with Kubespray)
- CNI Plugin: initially Calico, later switched to amazon-vpc-cni-k8s (ECR registry region changed to ap-northeast-1; all aws-node Pods are Running — see the verification sketch after this list).
- Installation Method: Helm (v3.16.3)
- Command used:
```sh
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
  --namespace kube-system \
  --set clusterName=my-cluster \
  --set region=ap-northeast-1 \
  --set vpcId=vpc-xxxxxxxxxxxxxxxxx \
  --set serviceAccount.create=true
```
- IAM Role Configuration:
- Policies attached:
- AmazonEC2ContainerRegistryReadOnly
- AmazonEKS_CNI_Policy
- AmazonEKSClusterPolicy
- AWSLoadBalancerControllerIAMPolicy
- The IAM Role is bound to the nodes.
- Subnet Tags: kubernetes.io/cluster/my-cluster set to owned.
- Pod CIDR: 10.233.64.0/18
- Security Group Rules: All opened to 0.0.0.0/0 for both ingress and egress.
- IMDSv2 Configuration:
- HttpTokens: required
- HttpPutResponseHopLimit: 2
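For completeness, these are roughly the checks behind the claims in the list above (a sketch; the subnet and instance IDs are placeholders):

```sh
# All aws-node (amazon-vpc-cni-k8s) Pods should be Running; the DaemonSet
# carries the label k8s-app=aws-node
kubectl get pods -n kube-system -l k8s-app=aws-node -o wide

# Confirm the controller Deployment created by the Helm release exists
kubectl get deployment aws-load-balancer-controller -n kube-system

# Confirm the subnet tag (subnet ID is a placeholder)
aws ec2 describe-tags --region ap-northeast-1 \
  --filters "Name=resource-id,Values=subnet-xxxxxxxxxxxxxxxxx" \
            "Name=key,Values=kubernetes.io/cluster/my-cluster"

# Confirm the IMDSv2 settings on a worker node (instance ID is a placeholder)
aws ec2 describe-instances --region ap-northeast-1 \
  --instance-ids i-xxxxxxxxxxxxxxxxx \
  --query "Reservations[].Instances[].MetadataOptions"
```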
Issue Description
Problem 1
After installation, the aws-load-balancer-controller Pod fails during startup. Its logs show the following error:
{"level":"error","ts":"2025-01-15T03:38:31Z","logger":"setup","msg":"unable to create controller","controller":"Ingress","error":"Get \"https://xx.xxx.x.x:443/apis/networking.k8s.io/v1\": dial tcp xx.xxx.x.x:443: i/o timeout"}
Problem 2
In a previous attempt, I also noticed that the controller's ServiceAccount shows no mountable secrets or tokens:
```console
kubectl describe serviceaccount aws-load-balancer-controller -n kube-system

Name:                aws-load-balancer-controller
Namespace:           kube-system
Labels:              app.kubernetes.io/instance=aws-load-balancer-controller
                     app.kubernetes.io/managed-by=Helm
                     app.kubernetes.io/name=aws-load-balancer-controller
                     app.kubernetes.io/version=v2.11.0
                     helm.sh/chart=aws-load-balancer-controller-1.11.0
Annotations:         meta.helm.sh/release-name: aws-load-balancer-controller
                     meta.helm.sh/release-namespace: kube-system
Image pull secrets:  <none>
Mountable secrets:   <none>
Tokens:              <none>
Events:              <none>
```
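My understanding is that on Kubernetes 1.24+ a ServiceAccount showing Tokens: <none> can be normal, since tokens are projected into Pods at mount time instead of being stored as Secrets, so I am not sure whether this output is actually a problem. One way to check that token issuance itself works (assuming kubectl 1.24+):

```sh
# Ask the API server to mint a short-lived token for the ServiceAccount;
# getting a token back would suggest the ServiceAccount itself is fine
kubectl create token aws-load-balancer-controller -n kube-system
```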
Help Needed
- Error Analysis: What could cause the `dial tcp xx.xxx.x.x:443: i/o timeout` error? Is it networking, the CNI switch, or some other configuration?
- Installation Guidance: If something above is misconfigured, how should I fix it so the controller works properly?
- Alternative Methods: Are there other ways to run a highly available load balancer in this AWS environment?
- Best Practices: Any recommendations on optimal configuration or installation parameters would be greatly appreciated!