
Stuck on "dial tcp i/o timeout" Error? AWS Load Balancer Controller in Kubernetes(Non-EKS) #4018

Description


Hello, I ran into problems setting up the AWS Load Balancer Controller on a self-managed (non-EKS) Kubernetes cluster on AWS. Despite multiple attempts, the controller fails to start properly. I would appreciate help diagnosing and resolving the issue.

Background Information

  • Kubernetes Version: 1.31.1 (built using Kubespray)
  • CNI Plugin: initially Calico, later switched to amazon-vpc-cni-k8s (ECR region changed to ap-northeast-1; all aws-node Pods are in the Running state).
  • Installation Method: Using Helm (version v3.16.3)
    • Command used:
```
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
  --namespace kube-system \
  --set clusterName=my-cluster \
  --set region=ap-northeast-1 \
  --set vpcId=vpc-xxxxxxxxxxxxxxxxx \
  --set serviceAccount.create=true
```
  • IAM Role Configuration:
    • Policies attached:
      • AmazonEC2ContainerRegistryReadOnly
      • AmazonEKS_CNI_Policy
      • AmazonEKSClusterPolicy
      • AWSLoadBalancerControllerIAMPolicy
    • The IAM Role is bound to the nodes.
  • Subnet Tags: kubernetes.io/cluster/<cluster-name> set to owned (see the verification sketch after this list).
  • Pod CIDR: 10.233.64.0/18
  • Security Group Rules: All opened to 0.0.0.0/0 for both ingress and egress.
  • IMDSv2 Configuration:
    • HttpTokens: required
    • HttpPutResponseHopLimit: 2
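
For reference, below is a minimal verification sketch for the items above. It assumes the setup described in this issue; my-cluster, subnet-xxxxxxxx, and i-xxxxxxxx are placeholders, not my real values.

```
# Confirm the chart actually deployed the controller.
kubectl -n kube-system get deployment aws-load-balancer-controller
kubectl -n kube-system logs deployment/aws-load-balancer-controller --tail=50

# Verify the cluster ownership tag on the subnets the controller should discover.
aws ec2 describe-subnets --region ap-northeast-1 \
  --filters "Name=tag:kubernetes.io/cluster/my-cluster,Values=owned" \
  --query "Subnets[].SubnetId"

# Public subnets intended for internet-facing load balancers also need the
# role tag; it can be added like this if missing.
aws ec2 create-tags --region ap-northeast-1 \
  --resources subnet-xxxxxxxx \
  --tags Key=kubernetes.io/role/elb,Value=1

# Confirm the IMDSv2 hop limit on a node; pods that are not on the host
# network need a hop limit of 2 to reach the instance metadata service.
aws ec2 describe-instances --region ap-northeast-1 --instance-ids i-xxxxxxxx \
  --query "Reservations[].Instances[].MetadataOptions"
```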

Issue Description
Problem 1
After installation, the aws-load-balancer-controller Pod fails to run properly. Logs show the following error:

{"level":"error","ts":"2025-01-15T03:38:31Z","logger":"setup","msg":"unable to create controller","controller":"Ingress","error":"Get \"https://xx.xxx.x.x:443/apis/networking.k8s.io/v1\": dial tcp xx.xxx.x.x:443: i/o timeout"}

Problem 2
In a previous attempt, I also noticed that the ServiceAccount associated with the controller lists no mountable secrets or tokens:

```
$ kubectl describe serviceaccount aws-load-balancer-controller -n kube-system
Name:                aws-load-balancer-controller
Namespace:           kube-system
Labels:              app.kubernetes.io/instance=aws-load-balancer-controller
                     app.kubernetes.io/managed-by=Helm
                     app.kubernetes.io/name=aws-load-balancer-controller
                     app.kubernetes.io/version=v2.11.0
                     helm.sh/chart=aws-load-balancer-controller-1.11.0
Annotations:         meta.helm.sh/release-name: aws-load-balancer-controller
                     meta.helm.sh/release-namespace: kube-system
Image pull secrets:  <none>
Mountable secrets:   <none>
Tokens:              <none>
Events:              <none>
```
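
That said, this may be expected rather than a misconfiguration: since Kubernetes 1.24, ServiceAccounts no longer get long-lived token Secrets, and tokens are instead projected into pods on demand, so Tokens: <none> should be normal on 1.31. A quick way to confirm token projection still works (the label selector matches the chart labels shown above):

```
# On Kubernetes >= 1.24, tokens come from the TokenRequest API instead of
# long-lived Secrets; this should print a short-lived JWT if that works.
kubectl create token aws-load-balancer-controller -n kube-system --duration=10m

# The running controller pod should list a projected kube-api-access-* volume.
kubectl -n kube-system get pod \
  -l app.kubernetes.io/name=aws-load-balancer-controller \
  -o jsonpath='{.items[0].spec.volumes[*].name}'
```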

Help Needed

  1. Error Analysis: What could be causing the dial tcp xx.xxx.x.x:443: i/o timeout error? Is it related to networking, the CNI, or some other part of the configuration?
  2. Installation Guidance: If something is misconfigured, how can I fix it so the controller works properly?
  3. Alternative Methods: Are there other ways to run a highly available load balancer that integrates with AWS in this environment?
  4. Best Practices: Any recommendations for optimal configuration or installation parameters would be greatly appreciated!
