
Kube-proxy doesn't remove stale CNI-HOSTPORT-DNAT rule after Kubernetes upgrade to 1.26 #3440

@raelix

Description

RKE version:

v1.4.6

Docker version: (docker version, docker info preferred)
20.10.24

Operating system and kernel: (cat /etc/os-release, uname -r preferred)
NAME="Red Hat Enterprise Linux"
VERSION="8.6 (Ootpa)"

Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO)
OpenStack

cluster.yml file:

nodes:
    - address: 10.94.1.8
      internal_address: 172.16.10.53
      ssh_key_path: /home/user/.ssh/id_rsa
      user: user
      role:
        - controlplane
        - etcd
        - worker
ignore_docker_version: true
enable_cri_dockerd: true
cluster_name: mycluster
kubernetes_version: v1.26.4-rancher2-1
network:
  plugin: flannel
ingress:
  provider: nginx

Steps to Reproduce:
Source versions -> rke: v1.4.3 - kubernetes_version: v1.23.10-rancher1-1
Dest versions -> rke: v1.4.6 - kubernetes_version: v1.26.4-rancher2-1
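
The upgrade itself was the standard RKE flow (shown as a sketch; the config path is my own): set kubernetes_version: v1.26.4-rancher2-1 in cluster.yml, then re-apply it:

rke up --config cluster.yml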

After upgrading Kubernetes with RKE from v1.23.10 to v1.26.4, I was no longer able to reach my ingresses through the nginx-ingress-controller, which listens on hostPorts 80 and 443.

Investigating further, I found that the CNI-HOSTPORT-DNAT chain still contained the old entry.

Before the upgrade:

[root@rancher user]# iptables -t nat -L CNI-HOSTPORT-DNAT  --line-numbers 
Chain CNI-HOSTPORT-DNAT (2 references)
num  target     prot opt source               destination         
1    CNI-DN-ff3905f57536228de6b29  tcp  --  anywhere             anywhere             /* dnat name: "cbr0" id: "e96c642e169acf789be84e8fbcd0e5c3da1a53c8e8a459227c06bdf423deb482" */ multiport dports http,https

[root@rancher user]# iptables -t nat -L CNI-DN-ff3905f57536228de6b29  --line-numbers 
Chain CNI-DN-ff3905f57536228de6b29 (1 references)
num  target     prot opt source               destination         
1    CNI-HOSTPORT-SETMARK  tcp  --  rancher.internal.com/24  anywhere             tcp dpt:http
2    CNI-HOSTPORT-SETMARK  tcp  --  localhost            anywhere             tcp dpt:http
3    DNAT       tcp  --  anywhere             anywhere             tcp dpt:http to:10.42.0.7:80
4    CNI-HOSTPORT-SETMARK  tcp  --  rancher.internal.com/24  anywhere             tcp dpt:https
5    CNI-HOSTPORT-SETMARK  tcp  --  localhost            anywhere             tcp dpt:https
6    DNAT       tcp  --  anywhere             anywhere             tcp dpt:https to:10.42.0.7:443

This looks good.

After the upgrade:

[root@rancher user]# iptables -t nat -L CNI-HOSTPORT-DNAT  --line-numbers 
Chain CNI-HOSTPORT-DNAT (2 references)
num  target     prot opt source               destination         
1    CNI-DN-ff3905f57536228de6b29  tcp  --  anywhere             anywhere             /* dnat name: "cbr0" id: "e96c642e169acf789be84e8fbcd0e5c3da1a53c8e8a459227c06bdf423deb482" */ multiport dports http,https
2    CNI-DN-4c3eba344b3e2fffe3698  tcp  --  anywhere             anywhere             /* dnat name: "cbr0" id: "aa3202e02a9fefbc97400df0685d864bc3894a580d4b2069542621371e1cfde8" */ multiport dports http,https

The second entry is the correct one, pointing to the new pod IP, but the first one should not be there. It looks like kube-proxy never deletes the old entry; since iptables matches the stale chain first, traffic is DNATed to a pod IP that no longer exists, making the ingresses unreachable.
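
To confirm which entry is stale, the container ids embedded in the rule comments can be checked against what is actually running (a rough check, assuming cri-dockerd is in use so the sandbox ids resolve through Docker, and that the controller runs in RKE's default ingress-nginx namespace):

kubectl -n ingress-nginx get pods -o wide                             # note the current pod IP
iptables -t nat -L CNI-DN-4c3eba344b3e2fffe3698 -n --line-numbers     # the new chain should DNAT to that IP
docker ps -q --no-trunc | grep e96c642e169acf789be84e8fbcd0e5c3da1a53c8e8a459227c06bdf423deb482

The last command returns nothing: the sandbox behind the first rule is gone, so that rule is the stale one.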

As a workaround, I had to either reboot the server or delete the stale entry manually:
iptables -t nat -D CNI-HOSTPORT-DNAT 1
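
A scripted variant of the same workaround (only a sketch, again assuming cri-dockerd so docker inspect can resolve the sandbox ids in the rule comments; test on a non-production node first):

# delete CNI-HOSTPORT-DNAT rules whose sandbox container no longer exists
rules=$(iptables -t nat -S CNI-HOSTPORT-DNAT | grep -- '--comment')
printf '%s\n' "$rules" | while read -r rule; do
  # the comment looks like: dnat name: \"cbr0\" id: \"<sandbox-id>\"
  id=$(printf '%s\n' "$rule" | sed -n 's/.*id: \\"\([0-9a-f]*\)\\".*/\1/p')
  [ -n "$id" ] || continue
  if ! docker inspect "$id" >/dev/null 2>&1; then
    # iptables -S output is shell-reusable: swap -A for -D to delete the exact rule
    eval "iptables -t nat $(printf '%s\n' "$rule" | sed 's/^-A /-D /')"
  fi
done

Like the manual iptables -D above, this only removes the jump from CNI-HOSTPORT-DNAT; the orphaned CNI-DN-* chain is left behind but no longer receives traffic.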

Results:
After the upgrade, the ingress controller listening on hostPorts 80/443 is unreachable until the stale rule is removed.
