Kube-OVN Version
v1.13.4
Kubernetes Version
v1.28.6
Operating System/Kernel Version
"Ubuntu 22.04.5 LTS" 6.8.0-47-generic
Description
Deleting a FIP triggers a reset of the associated EIP. Resetting an EIP sets status.ready: true
even if the EIP is not yet ready. The EIP update handler then does not program the EIP on the NAT GW.
We hit this after a NAT GW was rescheduled to a new node. Rescheduling marks all FIPs and EIPs as not ready so that they can be programmed on the new pod. A FIP was deleted and recreated while this was in progress, and the associated EIP was never added to the new gateway pod.
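To make the ordering problem concrete, here is a minimal Go model of the sequence described above. It is only a sketch, not the Kube-OVN controller code: the `eip` type and the functions `markNotReady`, `resetOnFipDelete`, `handleUpdate` and `simulate` are hypothetical names, and the model assumes the update handler skips any EIP whose `ready` flag is already true.

```go
package main

import "fmt"

// eip is a simplified, hypothetical stand-in for the EIP resource.
// Only the state needed to show the race is modelled.
type eip struct {
	ready      bool // mirrors status.ready
	programmed bool // whether the address is present on the gateway pod
}

// markNotReady models the NAT GW pod being recreated: the EIP is
// flagged as not ready and is no longer present on the (new) pod.
func markNotReady(e *eip) {
	e.ready = false
	e.programmed = false
}

// resetOnFipDelete models the reset triggered by deleting a FIP.
// As reported, it sets ready to true regardless of the actual state.
func resetOnFipDelete(e *eip) {
	e.ready = true
}

// handleUpdate models the EIP update handler. Assumption: it only
// programs EIPs that are not yet marked ready.
func handleUpdate(e *eip) {
	if e.ready {
		return // treated as already programmed, nothing to do
	}
	e.programmed = true
	e.ready = true
}

// simulate replays the sequence and reports whether the EIP ends up
// programmed on the gateway pod.
func simulate(deleteFipBeforeUpdate bool) bool {
	e := &eip{ready: true, programmed: true}
	markNotReady(e) // gateway pod rescheduled
	if deleteFipBeforeUpdate {
		resetOnFipDelete(e) // FIP deleted before the EIP was reprogrammed
	}
	handleUpdate(e)
	return e.programmed
}

func main() {
	fmt.Println("no FIP deletion, EIP programmed:", simulate(false))
	fmt.Println("FIP deleted during reprogramming, EIP programmed:", simulate(true))
}
```

In this model the second case leaves the EIP with `ready` set to true but never programmed, which matches what we observed on the new gateway pod.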
Steps To Reproduce
It's a race condition, so it can be difficult to reproduce consistently.
- Create a VPC NAT GW
- Create a lot of EIPs and FIPs on the VPC NAT GW
- Delete the VPC NAT GW pod
- Delete all FIPs
- Exec into the VPC NAT GW pod and check the IP addresses on net1
Current Behavior
Deleting a FIP while an EIP is not yet programmed prevents the EIP from ever being programmed.
Expected Behavior
Deleting a FIP while an EIP is not yet programmed has no impact on the EIP being programmed.
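One way to picture the expected behavior is a reset that clears the FIP association without touching readiness. The Go sketch below is illustrative only and is not a proposed patch: `eipStatus`, its `nat` field and `resetPreservingReady` are hypothetical names, not taken from the Kube-OVN code base.

```go
package main

import "fmt"

// eipStatus is a hypothetical, simplified view of an EIP's status.
type eipStatus struct {
	ready bool   // mirrors status.ready
	nat   string // hypothetical: what currently uses the EIP, e.g. "fip"
}

// resetPreservingReady clears the FIP association but leaves the
// ready flag exactly as it was, so a pending EIP stays pending.
func resetPreservingReady(s eipStatus) eipStatus {
	s.nat = ""
	return s
}

func main() {
	pending := eipStatus{ready: false, nat: "fip"}
	after := resetPreservingReady(pending)
	fmt.Printf("ready=%v nat=%q\n", after.ready, after.nat)
}
```

With a reset like this, an EIP that was not ready before the FIP deletion is still not ready afterwards, so the update handler would still get the chance to program it.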