Description
What version of nebula are you using? (nebula -version)
1.9.4.r107.g36c890e
What operating system are you using?
Arch Linux
Describe the Bug
With the exact same configuration, v1.9.7 is not affected: when the problem occurs, downgrading nebula to v1.9.7 on the server side makes it disappear. I therefore believe this is a recently introduced bug.
Setup
On the server side, create a dummy network interface: sudo ip l add dummy0 type dummy, then assign an IP address to it: sudo ip a add 192.168.50.1/32 dev dummy0. Sign a nebula certificate for the server, including 192.168.50.1/32 as a subnet.
On the client side, register 192.168.50.1 as an unsafe_route via the server's nebula IP, then ping 192.168.50.1. When the server's outbound_action is reject, the client receives ICMP Destination Port Unreachable messages (note "Destination Port Unreachable": this is definitely sent by nebula). When the server's outbound_action is drop, there is no reply at all. The nebula firewall is configured to allow all traffic, so this cannot be a firewall configuration error. There are also no iptables rules, and all default policies are set to ACCEPT.
A tcpdump on the server's nebula interface shows nothing, regardless of the outbound_action setting. When the server's outbound_action is set to reject, a tcpdump on the client side shows an immediate "ICMP 192.168.50.1 protocol 1 port 52215 unreachable, length 36" in response to ICMP echo requests, and immediate RSTs in response to TCP SYNs.
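For reference, a server-side firewall section matching this description would look roughly like the following sketch. The server's actual firewall block was not posted, so this assumes it uses the same allow-all rules as the laptop config below, with only outbound_action toggled between the two values under test.

```yaml
firewall:
  # Assumption: this is the only setting changed between the two test runs.
  # `reject` reproduces the immediate ICMP unreachable / TCP RST replies;
  # `drop` (nebula's default) reproduces the silent loss.
  outbound_action: reject
  outbound:
    - port: any
      proto: any
      host: any
  inbound:
    - port: any
      proto: any
      host: any
```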
Side talk
I know this is not a very standard use of unsafe_routes, but I find it useful for setting up an HA Kubernetes cluster. #1332 made it possible to load-balance kube-apiserver: I use the same "virtual IP" on all control plane nodes and leverage nebula's ECMP to distribute traffic to it. While this is not optimal, it is not possible to run VRRP over nebula, and this looks like a decent workaround.
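As an illustration of the ECMP arrangement described above, the client-side route might look roughly like this sketch. The list form of via with gateway entries is assumed from the multi-gateway support added in #1332, and 10.0.7.21 and 10.0.7.22 are hypothetical nebula IPs for additional control plane nodes; only 10.0.7.20 appears in this report.

```yaml
tun:
  unsafe_routes:
    - route: 192.168.50.1/32
      # Assumed syntax: one gateway entry per control plane node that
      # carries the virtual IP on a dummy interface.
      via:
        - gateway: 10.0.7.20
        - gateway: 10.0.7.21   # hypothetical second node
        - gateway: 10.0.7.22   # hypothetical third node
      mtu: 1300
```

Each gateway's certificate would need to include 192.168.50.1/32 as a subnet, as in the single-server setup above.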
Logs from affected hosts
Really nothing gets printed to the logs during the issue.
Config files from affected hosts
On my laptop:
pki:
  ca: /etc/nebula/ca.crt
  cert: /etc/nebula/host.crt
  key: /etc/nebula/host.key
static_host_map:
  "10.0.7.1": ["<lighthouse>:4242"]
lighthouse:
  am_lighthouse: false
  interval: 10
  remote_allow_list:
    "::/0": false
    "0.0.0.0/0": true
  hosts:
    - "10.0.7.1"
listen:
  host: "[::]"
  port: 0
punchy:
  punch: true
  respond: true
relay:
  relays:
    - 10.0.7.1
  am_relay: false
  use_relays: true
tun:
  disabled: false
  dev: neb0
  mtu: 1300
  unsafe_routes:
    - route: 192.168.50.1/32
      via: 10.0.7.20
      mtu: 1300
      register: true
firewall:
  outbound:
    - port: any
      proto: any
      host: any
  inbound:
    - port: any
      proto: any
      host: any
On the server, the config is the same, except that it has no unsafe_routes.