Description
I originally reported this in the https://github.com/k3s-io/k3s issues (k3s-io/k3s#11476), but was forwarded here.
When flannel runs in vxlan mode on top of another tunnel (such as a manually configured wireguard link) selected with the k3s --flannel-external-ip and --flannel-iface parameters, it takes the vxlan udp src ip from the interface that holds the default route instead of from the selected interface.
Expected Behavior
The src ip of the flannel vxlan udp packets matches the address of the selected interface.
Current Behavior
This bug shows up when you have two nodes with at least one of them behind nat:
node 1 - server node:
eth0 192.168.66.200 (with default route)
wg0 172.30.0.1 (vpn to node2)
node 2 - just agent:
eth2 x.x.x.x (redacted, default route)
wg0 172.30.0.2 (vpn to node1)
Starting k3s on both nodes:
./k3s server -i 172.30.0.1 --node-external-ip 172.30.0.1 --flannel-backend=vxlan --flannel-external-ip 172.30.0.1 --flannel-iface wg0 --bind-address 172.30.0.1
and node2:
./k3s agent -t xxxtokenxxx --server https://172.30.0.1:6443 -i 172.30.0.2 --node-external-ip 172.30.0.2 --flannel-iface wg0 --bind-address 172.30.0.2
The flannel tunnel from node1 uses the wrong src ip. For example, a ping from node2 to node1 produces the following packets:
node1 # tcpdump -i wg0 -n udp port 8472
dropped privs to pcap
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on wg0, link-type RAW (Raw IP), snapshot length 262144 bytes
16:58:21.074052 IP 172.30.0.2.40978 > 172.30.0.1.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.1.0 > 10.42.0.1: ICMP echo request, id 44022, seq 3, length 64
16:58:21.074128 IP 192.168.66.200.55307 (!!! need to be 172.30.0.1!!! ) > 172.30.0.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.1 > 10.42.1.0: ICMP echo reply, id 44022, seq 3, length 64
16:58:22.097767 IP 172.30.0.2.40978 > 172.30.0.1.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.1.0 > 10.42.0.1: ICMP echo request, id 44022, seq 4, length 64
16:58:22.097811 IP 192.168.66.200.55307 > 172.30.0.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.1 > 10.42.1.0: ICMP echo reply, id 44022, seq 4, length 64
In the tcpdump output above, the replies from node1 carry an incorrect source ip address. It needs to be 172.30.0.1, but it is 192.168.66.200.
After some digging, it is evident that node1 is using the wrong iface (eth0 instead of wg0) for vxlan:
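A minimal sketch for spotting the mismatch without reading the capture by eye. It assumes tcpdump text output in the format shown above; `check_vxlan_src` is a hypothetical helper name, not part of flannel or k3s, and the port 8472 is hardcoded to match this setup.

```sh
# check_vxlan_src EXPECTED_SRC PEER_IP
# Reads tcpdump text output on stdin and prints the outer source address of
# every packet sent to PEER_IP on udp port 8472 whose source is not
# EXPECTED_SRC. Prints nothing when all outer sources are as expected.
check_vxlan_src() {
    awk -v want="$1" -v peer="$2" '
        BEGIN { dst = peer ".8472:" }
        # Outer header lines look like: TIME IP SRC.PORT > DST.8472: ...
        # Inner (encapsulated) lines start with "IP" and are skipped.
        $2 == "IP" && $4 == ">" && $5 == dst {
            src = $3
            sub(/\.[0-9]+$/, "", src)   # strip the trailing .PORT
            if (src != want) print src
        }
    '
}
```

On node1 this could be fed from a live capture, for example `tcpdump -l -i wg0 -n udp port 8472 | check_vxlan_src 172.30.0.1 172.30.0.2`; with the behavior described here it would be expected to print 192.168.66.200 for each reply.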
node1:
node1 /sys/devices/virtual/net/flannel.1 # ls -l
total 0
### chopped ls -l ###
-rw-r--r-- 1 root root 4096 Dec 18 17:01 flags
-rw-r--r-- 1 root root 4096 Dec 18 17:01 gro_flush_timeout
-rw-r--r-- 1 root root 4096 Dec 18 17:01 ifalias
-r--r--r-- 1 root root 4096 Dec 18 17:01 ifindex
-r--r--r-- 1 root root 4096 Dec 18 16:56 iflink
-r--r--r-- 1 root root 4096 Dec 18 17:01 link_mode
lrwxrwxrwx 1 root root 0 Dec 18 17:01 lower_eth0 -> ../../../pci0000:00/0000:00:01.3/0000:09:00.2/0000:0a:03.0/0000:0e:00.0/net/eth0
-rw-r--r-- 1 root root 4096 Dec 18 17:01 mtu
-r--r--r-- 1 root root 4096 Dec 18 16:56 name_assign_type
### chopped ls -l ###
For comparison, the flannel.1 directory on node2 shows the correct iface:
node2 /sys/devices/virtual/net/flannel.1 # ls -l
total 0
### chopped ls -l ###
-rw-r--r-- 1 root root 4096 Dec 18 17:02 ifalias
-r--r--r-- 1 root root 4096 Dec 18 16:58 ifindex
-r--r--r-- 1 root root 4096 Dec 18 16:58 iflink
-r--r--r-- 1 root root 4096 Dec 18 17:02 link_mode
lrwxrwxrwx 1 root root 0 Dec 18 16:58 lower_wg0 -> ../wg0
-rw-r--r-- 1 root root 4096 Dec 18 17:02 mtu
### chopped ls -l ###
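The same check can be done without scanning the whole listing. This is a small sketch that prints only the lower devices of a given interface by looking at its lower_* symlinks in sysfs; `vxlan_lower_devs` is a hypothetical helper name, and the optional second argument exists only so the function can be pointed at a different directory.

```sh
# vxlan_lower_devs DEV [NET_DIR]
# Prints the name of each lower device that DEV is stacked on, based on the
# lower_<name> symlinks in DEV's sysfs directory (default: /sys/class/net).
vxlan_lower_devs() {
    dev="$1"
    base="${2:-/sys/class/net}"
    for link in "$base/$dev"/lower_*; do
        # If the glob matched nothing it stays literal; skip it.
        [ -e "$link" ] || [ -L "$link" ] || continue
        name="${link##*/}"        # drop the directory part
        echo "${name#lower_}"     # drop the lower_ prefix
    done
}
```

With the listings above, `vxlan_lower_devs flannel.1` would be expected to print eth0 on node1 and wg0 on node2.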
Possible Solution
Steps to Reproduce (for bugs)
- Have at least one system behind nat, and both systems with a restrictive firewall
- Install and configure a tunnel between the nodes (for example wireguard)
- Install k3s (download from the release page and just chmod +x)
- Manually start k3s agent & k3s server with the parameters above, leaving everything else in its default state
- ping 10.42.0.1 or 10.42.0.2 from the opposite node
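After reproducing, another way to see what the vxlan device was created with is `ip -d link show flannel.1`, which prints a detail line starting with `vxlan` that includes the `local` address and the underlay `dev`. The sketch below extracts those two fields; `vxlan_local_and_dev` is a hypothetical helper name, and it assumes the usual iproute2 text layout, which may differ between versions.

```sh
# vxlan_local_and_dev
# Reads `ip -d link show DEV` output on stdin and prints "LOCAL_IP DEV" taken
# from the vxlan detail line (the words following "local" and "dev").
vxlan_local_and_dev() {
    awk '
        $1 == "vxlan" {
            addr = ""; dev = ""
            for (i = 2; i < NF; i++) {
                if ($i == "local") addr = $(i + 1)
                if ($i == "dev")   dev  = $(i + 1)
            }
            print addr, dev
        }
    '
}
```

Usage would be `ip -d link show flannel.1 | vxlan_local_and_dev`. With the behavior described in this report, node1 would be expected to show the eth0 address and eth0 rather than 172.30.0.1 and wg0.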
Context
I am experimenting with a k3s environment in which a tunnel already exists between the two nodes and both nodes have restrictive firewalls on their public IP addresses.
Your Environment
- Flannel version: v0.25.7
- Backend used (e.g. vxlan or udp): vxlan
- Etcd version: v3.5.16
- Kubernetes version (if used): v1.31.4-rc1+k3s1 (https://github.com/k3s-io/k3s/releases/tag/v1.31.3%2Bk3s1)
- Operating System and version: Linux pc 6.10.1 SMP PREEMPT_DYNAMIC Thu Jul 25 12:47:49 EEST 2024 x86_64 GNU/Linux