BUG: data race between control plane components and host modification #5614

### Sealos Version

v5.0.1

### How to reproduce the bug?

I was testing how Sealos recovers after a failure of the master0 node and found an issue: on control plane nodes other than master0, the hosts file on the host machine and the hosts file inside the pods are inconsistent. For example, when master0 becomes unavailable, the controller-manager pod on another control plane node fails because it still resolves apiserver.cluster.local to master0. The same behavior is observed in other control plane components as well.
### Reproduce
The cluster is configured with three control plane nodes: master1, master2, and master3, and one worker node node1. The cluster was started with the following command:
```sh
root@master1:~# sealos gen registry.cn-shanghai.aliyuncs.com/labring/kubernetes:v1.29.9 registry.cn-shanghai.aliyuncs.com/labring/helm:v3.9.4 registry.cn-shanghai.aliyuncs.com/labring/cilium:v1.13.4 \
     --masters 192.168.64.15,192.168.64.16,192.168.64.17 \
     --nodes 192.168.64.18 \
     -u root --pk='/root/.ssh/multipass_key' \
     --output Clusterfile
sealos apply -f Clusterfile
```
Then shut down master1, which acts as master0 in this cluster. Below is the output from controller-manager on master2 after master1 went offline. As the log shows, apiserver.cluster.local still resolves to master1 (192.168.64.15).
```sh
root@master2:~# kubectl logs -n kube-system kube-controller-manager-master2 | tail -n10
E0526 13:58:46.446458       1 leaderelection.go:332] error retrieving resource lock kube-system/kube-controller-manager: Get "https://apiserver.cluster.local:6443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": dial tcp 192.168.64.15:6443: connect: no route to host
E0526 13:58:52.590254       1 leaderelection.go:332] error retrieving resource lock kube-system/kube-controller-manager: Get "https://apiserver.cluster.local:6443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": dial tcp 192.168.64.15:6443: connect: no route to host
E0526 13:58:58.736356       1 leaderelection.go:332] error retrieving resource lock kube-system/kube-controller-manager: Get "https://apiserver.cluster.local:6443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": dial tcp 192.168.64.15:6443: connect: no route to host
E0526 13:59:01.811818       1 leaderelection.go:332] error retrieving resource lock kube-system/kube-controller-manager: Get "https://apiserver.cluster.local:6443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": dial tcp 192.168.64.15:6443: connect: no route to host
E0526 13:59:04.879391       1 leaderelection.go:332] error retrieving resource lock kube-system/kube-controller-manager: Get "https://apiserver.cluster.local:6443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": dial tcp 192.168.64.15:6443: connect: no route to host
E0526 13:59:11.023681       1 leaderelection.go:332] error retrieving resource lock kube-system/kube-controller-manager: Get "https://apiserver.cluster.local:6443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": dial tcp 192.168.64.15:6443: connect: no route to host
E0526 13:59:14.104059       1 leaderelection.go:332] error retrieving resource lock kube-system/kube-controller-manager: Get "https://apiserver.cluster.local:6443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": dial tcp 192.168.64.15:6443: connect: no route to host
E0526 13:59:17.166370       1 leaderelection.go:332] error retrieving resource lock kube-system/kube-controller-manager: Get "https://apiserver.cluster.local:6443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": dial tcp 192.168.64.15:6443: connect: no route to host
E0526 13:59:20.240788       1 leaderelection.go:332] error retrieving resource lock kube-system/kube-controller-manager: Get "https://apiserver.cluster.local:6443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": dial tcp 192.168.64.15:6443: connect: no route to host
E0526 13:59:26.395983       1 leaderelection.go:332] error retrieving resource lock kube-system/kube-controller-manager: Get "https://apiserver.cluster.local:6443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": dial tcp 192.168.64.15:6443: connect: no route to host
```
By directly inspecting the hosts file under /var/lib/kubelet/pods, I confirmed this was indeed the case.
```sh
root@master2:~# grep -i "apiserver.cluster.local" -r /var/lib/kubelet/pods/
/var/lib/kubelet/pods/56a1a6061487d03b440de1b2e6d4cba5/etc-hosts:192.168.64.15 apiserver.cluster.local
/var/lib/kubelet/pods/e669d082174b1b8e93a3b80fa1a4a2b9/etc-hosts:192.168.64.15 apiserver.cluster.local
/var/lib/kubelet/pods/14812d81b7c93b918d4faed7ae4a6dcf/etc-hosts:192.168.64.15 apiserver.cluster.local
/var/lib/kubelet/pods/a27a8174-6ac0-4f5d-81fe-17a05a525c43/etc-hosts:192.168.64.15 apiserver.cluster.local
/var/lib/kubelet/pods/a27a8174-6ac0-4f5d-81fe-17a05a525c43/volumes/kubernetes.io~configmap/kube-proxy/..2025_05_26_02_51_19.923862820/kubeconfig.conf:    server: https://apiserver.cluster.local:6443
/var/lib/kubelet/pods/a9d78507a0d74897c9c340c682a3413f/etc-hosts:192.168.64.15 apiserver.cluster.local
/var/lib/kubelet/pods/cfb54c6d-8bd4-4533-b2e1-8b9b74d47e43/etc-hosts:192.168.64.16 apiserver.cluster.local
root@master2:~# ls /var/lib/kubelet/pods/*/containers
/var/lib/kubelet/pods/14812d81b7c93b918d4faed7ae4a6dcf/containers:
kube-scheduler

/var/lib/kubelet/pods/362d9138-58c5-4961-b8b9-39d9aa57f8d7/containers:
coredns

/var/lib/kubelet/pods/56a1a6061487d03b440de1b2e6d4cba5/containers:
kube-controller-manager

/var/lib/kubelet/pods/a27a8174-6ac0-4f5d-81fe-17a05a525c43/containers:
kube-proxy

/var/lib/kubelet/pods/a9d78507a0d74897c9c340c682a3413f/containers:
etcd

/var/lib/kubelet/pods/cfb54c6d-8bd4-4533-b2e1-8b9b74d47e43/containers:
apply-sysctl-overwrites  cilium-agent  clean-cilium-state  config  install-cni-binaries  mount-bpf-fs  mount-cgroup

/var/lib/kubelet/pods/e669d082174b1b8e93a3b80fa1a4a2b9/containers:
kube-apiserver

/var/lib/kubelet/pods/e9afef92-76ae-461a-9fde-f105f5521e07/containers:
coredns
```
We can see that pod 56a1a6061487d03b440de1b2e6d4cba5 is kube-controller-manager, and that its hosts file contains the record
192.168.64.15 apiserver.cluster.local.
However, the host machine’s /etc/hosts file looks like this:
```sh
root@master2:~# cat /etc/hosts
# Your system has configured 'manage_etc_hosts' as True.
# As a result, if you wish for changes to this file to persist
# then you will need to either
# a.) make changes to the master file in /etc/cloud/templates/hosts.debian.tmpl
# b.) change or remove the value of 'manage_etc_hosts' in
#     /etc/cloud/cloud.cfg or cloud-config from user-data
#
127.0.1.1 master2 master2
127.0.0.1 localhost
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
192.168.64.15 sealos.hub
192.168.64.16 apiserver.cluster.local
```
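To make the mismatch explicit, the two files can be compared programmatically. The following is only a minimal sketch: `lookupHost` is a hypothetical helper written for this report (it is not part of Sealos or the kubelet), and the two file contents are abbreviated copies of the outputs above.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// lookupHost returns the first IP mapped to name in hosts-file formatted text.
// Hypothetical helper for illustration only.
func lookupHost(content, name string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := sc.Text()
		// Drop trailing comments.
		if i := strings.IndexByte(line, '#'); i >= 0 {
			line = line[:i]
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue
		}
		// fields[0] is the IP, the rest are host names.
		for _, h := range fields[1:] {
			if h == name {
				return fields[0], true
			}
		}
	}
	return "", false
}

func main() {
	const domain = "apiserver.cluster.local"
	// Abbreviated from /etc/hosts on master2.
	hostFile := "127.0.0.1 localhost\n192.168.64.15 sealos.hub\n192.168.64.16 apiserver.cluster.local\n"
	// Abbreviated from the kube-controller-manager pod's etc-hosts.
	podFile := "127.0.0.1 localhost\n192.168.64.15 apiserver.cluster.local\n"

	hostIP, _ := lookupHost(hostFile, domain)
	podIP, _ := lookupHost(podFile, domain)
	fmt.Printf("host=%s pod=%s stale=%v\n", hostIP, podIP, hostIP != podIP)
}
```

With the inputs above this prints `host=192.168.64.16 pod=192.168.64.15 stale=true`, matching what grep shows for the kube-controller-manager pod.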
The entry is also not consistent across pods: most point apiserver.cluster.local to master1 (master0), but some point to the host machine instead. In the grep output above, the cilium pod cfb54c6d-8bd4-4533-b2e1-8b9b74d47e43 has 192.168.64.16.

After reading parts of the Sealos source code, I suspect this inconsistency is caused by a data race. During the initialization of a control plane node, Sealos modifies the host machine’s /etc/hosts file twice. The first modification happens in the init phase: Sealos checks whether the node is a master and, if so, points apiserver.cluster.local to master0. This makes sense because, during join master, the node locates master0 through apiserver.cluster.local. The second modification happens after the kubeadm join command completes. At that point kubeadm has only waited for the API server to become ready, not for all control plane components, so the other component pods may start either before or after the second modification.
The first hosts modification, in the init stage:
```go
// apiServerHostApplier is registered as one of the default initializers.
defaultInitializers = append(defaultInitializers, &registryHostApplier{}, &registryApplier{}, &defaultCRIInitializer{}, &apiServerHostApplier{}, &lvscareHostApplier{}, &defaultInitializer{})

func (a *apiServerHostApplier) Apply(ctx Context, host string) error {
	// Masters: map the API server domain to master0's IP.
	if slices.Contains(ctx.GetCluster().GetMasterIPAndPortList(), host) {
		if err := ctx.GetRemoter().HostsAdd(host, ctx.GetCluster().GetMaster0IP(), constants.DefaultAPIServerDomain); err != nil {
			return fmt.Errorf("failed to add hosts: %v", err)
		}
		return nil
	}
	// Other nodes: map the API server domain to the VIP.
	if err := ctx.GetRemoter().HostsAdd(host, ctx.GetCluster().GetVIP(), constants.DefaultAPIServerDomain); err != nil {
		return fmt.Errorf("failed to add hosts: %v", err)
	}

	return nil
}
```
The second hosts modification, after kubeadm join:
```go
func (k *KubeadmRuntime) joinMasters(masters []string) error {
	// ...
	// Run kubeadm join on the joining master.
	err = k.sshCmdAsync(master, joinCmd)
	if err != nil {
		return fmt.Errorf("exec kubeadm join in %s failed %v", master, err)
	}

	// Once the join returns, map the API server domain to the joining master's own address.
	err = k.execHostsAppend(master, master, k.getAPIServerDomain())
	if err != nil {
		return fmt.Errorf("add master0 apiserver domain hosts in %s failed %v", master, err)
	}
	// ...
}
```
The possible cases:
```
case1 apiserver.cluster.local points to master0: first host modification --> pod start --> second host modification
case2 apiserver.cluster.local points to host machine: first host modification --> second host modification --> pod start 
```
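The two orderings can be replayed with a small deterministic model. This is a sketch under a stated assumption, not Sealos or kubelet code: it assumes a pod's hosts file is a snapshot of the host entry taken when the pod starts and is not refreshed afterwards, which is what the outputs above suggest. The `node` type and the `setHosts`, `startPod`, and `run` functions are hypothetical names.

```go
package main

import "fmt"

const (
	master0IP = "192.168.64.15" // master1, acting as master0
	localIP   = "192.168.64.16" // master2, the joining master
)

// node models only what matters here: the host's current mapping for
// apiserver.cluster.local and the copy each pod took when it started.
type node struct {
	hostEntry string
	podEntry  map[string]string
}

// setHosts stands in for a Sealos write to /etc/hosts on the host.
func (n *node) setHosts(ip string) { n.hostEntry = ip }

// startPod stands in for a pod start. Assumption: the pod snapshots the
// host entry at this moment and never sees later changes.
func (n *node) startPod(name string) { n.podEntry[name] = n.hostEntry }

// run replays one ordering and returns the final host and pod entries.
func run(podStartsBeforeSecondWrite bool) (hostEntry, podEntry string) {
	const pod = "kube-controller-manager"
	n := &node{podEntry: map[string]string{}}

	n.setHosts(master0IP) // first modification (init stage)
	if podStartsBeforeSecondWrite {
		n.startPod(pod)     // case1: pod starts first
		n.setHosts(localIP) // second modification (after kubeadm join)
	} else {
		n.setHosts(localIP) // case2: second modification first
		n.startPod(pod)
	}
	return n.hostEntry, n.podEntry[pod]
}

func main() {
	h, p := run(true)
	fmt.Printf("case1: host=%s pod=%s\n", h, p)
	h, p = run(false)
	fmt.Printf("case2: host=%s pod=%s\n", h, p)
}
```

In case1 the pod keeps 192.168.64.15 while the host ends up with 192.168.64.16, which is the state observed on master2. In case2 both end up with 192.168.64.16.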
Here is the documentation on this waiting behavior of [kubeadm join](https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init/#feature-gates). The feature gate it refers to is WaitForAllControlPlaneComponents:
```
Without the feature gate enabled, kubeadm will only wait for the kube-apiserver on a control plane node to become ready. 
The wait process starts right after the kubelet on the host is started by kubeadm. 
You are advised to enable this feature gate in case you wish to observe a ready state from all control plane components 
during the kubeadm init or kubeadm join command execution.
```

### What is the expected behavior?

_No response_

### What do you see instead?

_No response_

### Operating environment

```markdown
- Sealos version:
- Docker version:
- Kubernetes version:
- Operating system:
- Runtime environment:
- Cluster size:
- Additional information:
```

### Additional information

_No response_
