
[Usernetes] kube-apiserver fails to connect to etcd: dial tcp 127.0.0.1:2379: connect: connection refused #65

Open
@AkihiroSuda

Description

I'm trying to run Usernetes (single-node w/o VXLAN, as a baby step) with bypass4netnsd, but kubeadm fails:

$ export CONTAINER_ENGINE=nerdctl
$ make up
$ make kubeadm-init
[...]
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

Looks like kube-apiserver is failing to connect to the local etcd with dial tcp 127.0.0.1:2379: connect: connection refused,
even though the etcd process is running with --listen-client-urls=https://127.0.0.1:2379,https://10.100.201.100:2379 and its logs report serving client traffic on both addresses.
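To narrow down where the refusal happens, a minimal probe that mirrors the apiserver's dial to etcd could be run inside the node's namespaces (e.g. copied in and executed via nerdctl exec). This is only a diagnostic sketch, not part of Usernetes or kubeadm:

```python
import errno
import socket

# Attempt the same TCP dial kube-apiserver makes to etcd's client port
# (2379) and report either success or the errno name, so the failure can
# be compared between the node's namespace and the pod's.
def probe(host: str, port: int, timeout: float = 3.0) -> str:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "tcp connect ok"
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e))

print(probe("127.0.0.1", 2379))
```

If this prints ECONNREFUSED from one namespace but succeeds from another, the problem is namespace- or interception-specific rather than etcd being down.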

Version

  • nerdctl: v2.0.0-beta.3
  • bypass4netns: the current master 2794f7e
  • Usernetes: gen2-v20240404.1 + the following annotations
diff --git a/docker-compose.yaml b/docker-compose.yaml
index 2ae7291..a036d80 100644
--- a/docker-compose.yaml
+++ b/docker-compose.yaml
@@ -39,6 +39,11 @@ services:
       # In addition, `net.ipv4.conf.default.rp_filter`
       # has to be set to 0 (disabled) or 2 (loose)
       # in the daemon's network namespace.
+    annotations:
+      # bypass4netns annotations are recognized since nerdctl v2.0
+      # TODO: enable bypass4netns only when bypass4netnsd is running.
+      "nerdctl/bypass4netns": "true"
+      "nerdctl/bypass4netns-ignore-subnets": "[\"10.244.0.0/16\", \"${U7S_NODE_SUBNET}\"]"
 networks:
   default:
     ipam:
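One thing the annotation above does not cover is loopback: bypass4netns-ignore-subnets lists only the pod and node subnets, so connections to 127.0.0.1 would not be excluded if bypass4netns intercepts them (a hypothesis, not confirmed here). A quick sanity check of the subnet membership, assuming ${U7S_NODE_SUBNET} expands to something like 10.100.0.0/16 (an assumed value, consistent with the node IP 10.100.201.100 in the logs):

```python
import ipaddress

# Subnets from the nerdctl/bypass4netns-ignore-subnets annotation;
# 10.100.0.0/16 is an ASSUMED expansion of ${U7S_NODE_SUBNET}.
ignore = [ipaddress.ip_network(s) for s in ("10.244.0.0/16", "10.100.0.0/16")]

def is_ignored(addr: str) -> bool:
    """True if addr falls inside a subnet excluded from bypass4netns."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in ignore)

print(is_ignored("10.100.201.100"))  # True: node IP is excluded
print(is_ignored("127.0.0.1"))       # False: loopback is not excluded
```

So under this assumption, dials to 127.0.0.1:2379 are the only etcd traffic that bypass4netns would still handle.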

Logs

$ nerdctl exec usernetes-node-1 sh -euxc 'tail /var/log/containers/kube-apiserver*'
+ tail /var/log/containers/kube-apiserver-u7s-suda-ws01_kube-system_kube-apiserver-e3bb8e1d239fbd21df5a10150d7cf97cddf2ac60f255e7c95e0444372270a590.log
2024-04-04T10:33:23.014467147Z stderr F W0404 10:33:23.014061       1 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:2379", ServerName: "127.0.0.1:2379", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:2379: connect: connection refused"
2024-04-04T10:33:23.525905133Z stderr F W0404 10:33:23.525525       1 logging.go:59] [core] [Channel #3 SubChannel #4] grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:2379", ServerName: "127.0.0.1:2379", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:2379: connect: connection refused"
2024-04-04T10:33:23.650741563Z stderr F W0404 10:33:23.650270       1 logging.go:59] [core] [Channel #5 SubChannel #6] grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:2379", ServerName: "127.0.0.1:2379", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:2379: connect: connection refused"
2024-04-04T10:33:26.746781506Z stderr F W0404 10:33:26.746363       1 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:2379", ServerName: "127.0.0.1:2379", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:2379: connect: connection refused"
2024-04-04T10:33:27.177688658Z stderr F W0404 10:33:27.177096       1 logging.go:59] [core] [Channel #5 SubChannel #6] grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:2379", ServerName: "127.0.0.1:2379", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:2379: connect: connection refused"
2024-04-04T10:33:27.255248965Z stderr F W0404 10:33:27.254843       1 logging.go:59] [core] [Channel #3 SubChannel #4] grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:2379", ServerName: "127.0.0.1:2379", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:2379: connect: connection refused"
2024-04-04T10:33:32.660451075Z stderr F W0404 10:33:32.660277       1 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:2379", ServerName: "127.0.0.1:2379", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:2379: connect: connection refused"
2024-04-04T10:33:32.928362209Z stderr F W0404 10:33:32.927956       1 logging.go:59] [core] [Channel #5 SubChannel #6] grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:2379", ServerName: "127.0.0.1:2379", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:2379: connect: connection refused"
2024-04-04T10:33:34.194906969Z stderr F W0404 10:33:34.194549       1 logging.go:59] [core] [Channel #3 SubChannel #4] grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:2379", ServerName: "127.0.0.1:2379", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:2379: connect: connection refused"
2024-04-04T10:33:38.191445386Z stderr F F0404 10:33:38.190882       1 instance.go:290] Error creating leases: error creating storage factory: context deadline exceeded
$ nerdctl exec usernetes-node-1 sh -euxc 'tail /var/log/containers/etcd*'
+ tail /var/log/containers/etcd-u7s-suda-ws01_kube-system_etcd-5e02fa990f84a688cc176a9c61737122ca6bcaa5545e3629ad824977f582365f.log
2024-04-04T10:32:16.394968121Z stderr F {"level":"info","ts":"2024-04-04T10:32:16.394467Z","caller":"embed/serve.go:103","msg":"ready to serve client requests"}
2024-04-04T10:32:16.395555112Z stderr F {"level":"info","ts":"2024-04-04T10:32:16.395204Z","caller":"embed/serve.go:103","msg":"ready to serve client requests"}
2024-04-04T10:32:16.395949052Z stderr F {"level":"info","ts":"2024-04-04T10:32:16.395687Z","caller":"etcdmain/main.go:44","msg":"notifying init daemon"}
2024-04-04T10:32:16.395974454Z stderr F {"level":"info","ts":"2024-04-04T10:32:16.39574Z","caller":"etcdmain/main.go:50","msg":"successfully notified init daemon"}
2024-04-04T10:32:16.397006609Z stderr F {"level":"info","ts":"2024-04-04T10:32:16.394859Z","caller":"etcdserver/server.go:2571","msg":"setting up initial cluster version using v2 API","cluster-version":"3.5"}
2024-04-04T10:32:16.398486722Z stderr F {"level":"info","ts":"2024-04-04T10:32:16.398168Z","caller":"membership/cluster.go:584","msg":"set initial cluster version","cluster-id":"3f59b4f74cf82b90","local-member-id":"f3b2aef06a662d72","cluster-version":"3.5"}
2024-04-04T10:32:16.399161396Z stderr F {"level":"info","ts":"2024-04-04T10:32:16.398924Z","caller":"api/capability.go:75","msg":"enabled capabilities for version","cluster-version":"3.5"}
2024-04-04T10:32:16.399912362Z stderr F {"level":"info","ts":"2024-04-04T10:32:16.399745Z","caller":"etcdserver/server.go:2595","msg":"cluster version is updated","cluster-version":"3.5"}
2024-04-04T10:32:16.403525952Z stderr F {"level":"info","ts":"2024-04-04T10:32:16.403166Z","caller":"embed/serve.go:250","msg":"serving client traffic securely","traffic":"grpc+http","address":"127.0.0.1:2379"}
2024-04-04T10:32:16.405592265Z stderr F {"level":"info","ts":"2024-04-04T10:32:16.405178Z","caller":"embed/serve.go:250","msg":"serving client traffic securely","traffic":"grpc+http","address":"10.100.201.100:2379"}
$ nerdctl exec usernetes-node-1 ps -ef
UID          PID    PPID  C STIME TTY          TIME CMD
root           1       0  0 10:30 ?        00:00:01 /sbin/init
root         108       1  0 10:30 ?        00:00:00 /lib/systemd/systemd-journald
root         121       1  0 10:30 ?        00:00:00 /lib/systemd/systemd-udevd
root         129       1 10 10:30 ?        00:00:26 /usr/local/bin/containerd
root         149       0  1 10:30 pts/1    00:00:02 kubeadm init --config /tmp/kubeadm-config.yaml --skip-token-print
root         388       1  3 10:32 ?        00:00:04 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --container-runtime-endpoint=unix:///var/run/containerd/containerd.sock --pod-infra-container-image=registry.k8s.io/pause:3.9 --runtime-cgroups=/system.slice/containerd.service --cloud-provider=external --node-labels=usernetes/host-ip=192.168.60.11
root         443       1  0 10:32 ?        00:00:00 /usr/local/bin/containerd-shim-runc-v2 -namespace k8s.io -id e0baeca2362cf60e66347e1bca8e1663b0bc33f173f049cc4c288cd31f0a021f -address /run/containerd/containerd.sock
root         453       1  0 10:32 ?        00:00:00 /usr/local/bin/containerd-shim-runc-v2 -namespace k8s.io -id 4862c3b92124466a817f12e7ce4882e3602f01e73ff70c68157d758d26777122 -address /run/containerd/containerd.sock
root         487       1  0 10:32 ?        00:00:00 /usr/local/bin/containerd-shim-runc-v2 -namespace k8s.io -id aaa4c48903c861beda8db0d2acc6ccddf7006032512d948df01a9d9009b97657 -address /run/containerd/containerd.sock
root         508       1  0 10:32 ?        00:00:00 /usr/local/bin/containerd-shim-runc-v2 -namespace k8s.io -id e8179f9642fbd7b705a4c221d2efd167cebeabc47863b90f2966a16bee0f3ca0 -address /run/containerd/containerd.sock
65535        536     443  0 10:32 ?        00:00:00 /pause
65535        547     453  0 10:32 ?        00:00:00 /pause
65535        560     487  0 10:32 ?        00:00:00 /pause
65535        576     508  0 10:32 ?        00:00:00 /pause
root         680     443  0 10:32 ?        00:00:01 kube-controller-manager --allocate-node-cidrs=true --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf --bind-address=127.0.0.1 --client-ca-file=/etc/kubernetes/pki/ca.crt --cloud-provider=external --cluster-cidr=10.244.0.0/16 --cluster-name=kubernetes --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt --cluster-signing-key-file=/etc/kubernetes/pki/ca.key --controllers=*,bootstrapsigner,tokencleaner --kubeconfig=/etc/kubernetes/controller-manager.conf --leader-elect=true --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --root-ca-file=/etc/kubernetes/pki/ca.crt --service-account-private-key-file=/etc/kubernetes/pki/sa.key --service-cluster-ip-range=10.96.0.0/16 --use-service-account-credentials=true
root         691     487  1 10:32 ?        00:00:02 kube-scheduler --authentication-kubeconfig=/etc/kubernetes/scheduler.conf --authorization-kubeconfig=/etc/kubernetes/scheduler.conf --bind-address=127.0.0.1 --kubeconfig=/etc/kubernetes/scheduler.conf --leader-elect=true
root         815     508  1 10:32 ?        00:00:01 etcd --advertise-client-urls=https://10.100.201.100:2379 --cert-file=/etc/kubernetes/pki/etcd/server.crt --client-cert-auth=true --data-dir=/var/lib/etcd --experimental-initial-corrupt-check=true --experimental-watch-progress-notify-interval=5s --initial-advertise-peer-urls=https://10.100.201.100:2380 --initial-cluster=u7s-suda-ws01=https://10.100.201.100:2380 --key-file=/etc/kubernetes/pki/etcd/server.key --listen-client-urls=https://127.0.0.1:2379,https://10.100.201.100:2379 --listen-metrics-urls=http://127.0.0.1:2381 --listen-peer-urls=https://10.100.201.100:2380 --name=u7s-suda-ws01 --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt --peer-client-cert-auth=true --peer-key-file=/etc/kubernetes/pki/etcd/peer.key --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt --snapshot-count=10000 --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
root        1111       0  0 10:34 ?        00:00:00 ps -ef

Labels

bug (Something isn't working)