Skip to content

Fix SwitchLB connectivity issues#6531

Closed
andriishestakov wants to merge 1 commit intokubeovn:masterfrom
andriishestakov:fix-switch-lb
Closed

Fix SwitchLB connectivity issues#6531
andriishestakov wants to merge 1 commit intokubeovn:masterfrom
andriishestakov:fix-switch-lb

Conversation

@andriishestakov
Copy link
Copy Markdown

  • Bug fixes

Copilot AI review requested due to automatic review settings March 26, 2026 23:49
@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. bug Something isn't working labels Mar 26, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to fix SwitchLB VIP/FIP connectivity issues by changing how SwitchLB VIPs answer ARP and by introducing additional OVN resources (LBs + hairpin SNAT option) so FIPs targeting SwitchLB VIPs can reach backends reliably and stay in sync as endpoints change.

Changes:

  • Add OVN NB helpers to manage LogicalSwitchPort.options["arp_proxy"] as an IP list and to set LoadBalancer.options["hairpin_snat_ip"].
  • Update SwitchLB VIP handling to omit VIP IPs from the VIP LSP’s OVN addresses and instead ARP-proxy the VIP IPs on the LRP-facing switch port.
  • Extend OvnFip controller logic to create per-FIP router/switch LBs for SwitchLB VIP targets and re-enqueue affected FIPs on SwitchLBRule / EndpointSlice changes.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
pkg/util/const.go Adds a new annotation constant to persist FIP→internal-subnet info.
pkg/ovs/ovn-nb-logical_switch_port.go Adds helpers to add/remove IPs from LSP arp_proxy option.
pkg/ovs/ovn-nb-load_balancer.go Adds helper to set/unset LB hairpin_snat_ip option.
pkg/ovs/interface.go Extends OVS interfaces to expose the new NB client helpers.
pkg/controller/vip.go Changes SwitchLB VIP LSP creation and ARP behavior (move ARP responsibility to LRP switch port).
pkg/controller/switch_lb_rule.go Re-enqueues FIPs when SwitchLBRule changes to keep FIP LBs in sync.
pkg/controller/ovn_fip.go Implements SwitchLB VIP–targeted FIP LB programming + cleanup and adds FIP re-enqueue helpers.
pkg/controller/endpoint_slice.go Re-enqueues FIPs on backend changes for SwitchLBRule services.
mocks/pkg/ovs/interface.go Updates mocks to match new OVS interfaces.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +582 to +600
// Switch LB: handles same-subnet pod → EIP:port (hairpin) with SNAT to gateway
switchLBName := fmt.Sprintf("fip-%s-%s-sw", fipName, proto)
if err = c.OVNNbClient.CreateLoadBalancer(switchLBName, proto); err != nil {
return fmt.Errorf("create switch LB %s: %w", switchLBName, err)
}
// Remove any stale VIPs (from a previous EIP IP) before adding the current one.
if existingLB, lbErr := c.OVNNbClient.GetLoadBalancer(switchLBName, true); lbErr == nil && existingLB != nil {
for staleVip := range existingLB.Vips {
if staleVip != eipVip {
_ = c.OVNNbClient.LoadBalancerDeleteVip(switchLBName, staleVip, true)
}
}
}
if err = c.OVNNbClient.LoadBalancerAddVip(switchLBName, eipVip, backends...); err != nil {
return fmt.Errorf("add vip to switch LB %s: %w", switchLBName, err)
}
if err = c.OVNNbClient.SetLoadBalancerHairpinSnatIP(switchLBName, v4Eip); err != nil {
return fmt.Errorf("set hairpin_snat_ip on switch LB %s: %w", switchLBName, err)
}
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addFipVipLBRules takes subnetGateway but never uses it, and the inline comment says hairpin SNAT is "to gateway" while the code sets hairpin_snat_ip to v4Eip. This mismatch makes the intended behavior unclear and increases the risk of future regressions. Either remove the unused parameter and update the comment, or use subnetGateway (e.g., the subnet's v4 gateway IP) if that is the intended SNAT source.

Copilot uses AI. Check for mistakes.
Comment on lines +417 to +450
// AddLogicalSwitchPortArpProxyIP adds an IP to the arp_proxy option of a logical switch port.
// This is used to make the LRP switch port respond to ARP for the given IP with the LRP's MAC.
func (c *OVNNbClient) AddLogicalSwitchPortArpProxyIP(lspName, ip string) error {
lsp, err := c.GetLogicalSwitchPort(lspName, false)
if err != nil {
klog.Error(err)
return fmt.Errorf("get logical switch port %s: %w", lspName, err)
}
if lsp == nil {
err = fmt.Errorf("logical switch port %s not found", lspName)
klog.Error(err)
return err
}
if lsp.Options == nil {
lsp.Options = make(map[string]string)
}
existing := strings.Fields(lsp.Options["arp_proxy"])
for _, existingIP := range existing {
if existingIP == ip {
return nil
}
}
existing = append(existing, ip)
lsp.Options["arp_proxy"] = strings.Join(existing, " ")
op, err := c.UpdateLogicalSwitchPortOp(lsp, &lsp.Options)
if err != nil {
klog.Error(err)
return err
}
if err := c.Transact("lsp-update", op); err != nil {
klog.Error(err)
return fmt.Errorf("failed to add arp_proxy ip %s to logical switch port %s: %w", ip, lspName, err)
}
return nil
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AddLogicalSwitchPortArpProxyIP/RemoveLogicalSwitchPortArpProxyIP introduce new behavior around the arp_proxy option, but there are existing unit tests for SetLogicalSwitchPortArpProxy. Please add/extend ovn-nb-logical_switch_port_test.go coverage for adding/removing IPs (including idempotency and interaction with existing arp_proxy values) to prevent regressions.

Copilot uses AI. Check for mistakes.
Comment on lines +271 to +297
// SetLoadBalancerHairpinSnatIP sets options:hairpin_snat_ip on a load balancer.
// OVN SNATs the source IP to `ip` when a client on the same subnet as a backend
// hits the LB VIP, ensuring replies traverse the router so conntrack can reverse
// the DNAT. Passing ip="" removes the option.
func (c *OVNNbClient) SetLoadBalancerHairpinSnatIP(lbName, ip string) error {
lb, err := c.GetLoadBalancer(lbName, false)
if err != nil {
klog.Error(err)
return fmt.Errorf("get lb %s for hairpin_snat_ip: %w", lbName, err)
}
if lb.Options["hairpin_snat_ip"] == ip {
return nil
}
options := make(map[string]string, len(lb.Options)+1)
maps.Copy(options, lb.Options)
if ip == "" {
delete(options, "hairpin_snat_ip")
} else {
options["hairpin_snat_ip"] = ip
}
lb.Options = options
if err = c.UpdateLoadBalancer(lb, &lb.Options); err != nil {
klog.Error(err)
return fmt.Errorf("set hairpin_snat_ip on lb %s: %w", lbName, err)
}
return nil
}
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SetLoadBalancerHairpinSnatIP is new option-mutating behavior, and pkg/ovs already has unit tests for similar helpers like SetLoadBalancerAffinityTimeout. Please add ovn-nb-load_balancer_test.go coverage for setting, updating, and removing hairpin_snat_ip (including preserving other lb.Options keys).

Copilot uses AI. Check for mistakes.
if vip.Spec.Type != util.SwitchLBRuleVip {
continue
}
if vip.Status.V4ip != vipIP && vip.Spec.V4ip != vipIP {
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enqueueFipsForVipIP only matches the provided vipIP against vip.Status.V4ip and vip.Spec.V4ip. If SwitchLBRule.Spec.Vip (and thus the passed vipIP) is IPv6, this function will never enqueue any FIPs. Include v6 matching (vip.Status.V6ip / vip.Spec.V6ip) so IPv6 SwitchLBRule VIPs trigger re-sync correctly.

Suggested change
if vip.Status.V4ip != vipIP && vip.Spec.V4ip != vipIP {
if vip.Status.V4ip != vipIP && vip.Spec.V4ip != vipIP &&
vip.Status.V6ip != vipIP && vip.Spec.V6ip != vipIP {

Copilot uses AI. Check for mistakes.
if err = c.patchOvnFipSubnetAnnotation(key, subnetName); err != nil {
klog.Errorf("failed to patch subnet annotation for fip %s, %v", key, err)
return err
}
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the isSwitchLBVip path, only the IPv4 case (v4IP/v4Eip) calls addFipVipLBRules, and the regular NAT programming is entirely skipped. This means an IPv6-only (or v6) OvnFip targeting a switch_lb_vip VIP will not create any OVN connectivity rules. Either implement the equivalent v6 handling (LBs/NAT) or fall back to the existing NAT-based behavior for v6 so IPv6 FIPs keep working.

Suggested change
}
}
// For IPv6 (and mixed v4/v6) in switch_lb_vip mode, fall back to the
// standard NAT-based behavior so IPv6 FIPs keep working.
var stateless bool
if cachedFip.Spec.Type != "" {
stateless = (cachedFip.Spec.Type == kubeovnv1.GWDistributedType)
} else {
stateless = (c.ExternalGatewayType == kubeovnv1.GWDistributedType)
}
options := map[string]string{"stateless": strconv.FormatBool(stateless)}
// support v6:v6
if v6IP != "" && v6Eip != "" {
if err = c.OVNNbClient.AddNat(vpcName, ovnnb.NATTypeDNATAndSNAT, v6Eip, v6IP, mac, cachedFip.Spec.IPName, options); err != nil {
klog.Errorf("failed to create v6:v6 fip, %v", err)
return err
}
}
// support v4:v6
if v4IP != "" && v6IP == "" && v4Eip == "" && v6Eip != "" {
if err = c.OVNNbClient.AddNat(vpcName, ovnnb.NATTypeDNATAndSNAT, v6Eip, v4IP, mac, cachedFip.Spec.IPName, options); err != nil {
klog.Errorf("failed to create v4:v6 fip, %v", err)
return err
}
}
// support v6:v4
if v6IP != "" && v4IP == "" && v6Eip == "" && v4Eip != "" {
if err = c.OVNNbClient.AddNat(vpcName, ovnnb.NATTypeDNATAndSNAT, v4Eip, v6IP, mac, cachedFip.Spec.IPName, options); err != nil {
klog.Errorf("failed to create v6:v4 fip, %v", err)
return err
}
}

Copilot uses AI. Check for mistakes.
}

// Re-enqueue any OvnFips targeting this VIP so their LB backends stay in sync.
c.enqueueFipsForVipIP(slr.Name)
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enqueueFipsForVipIP expects a VIP IP address, but this call passes slr.Name (the SwitchLBRule resource name). This will prevent OvnFips targeting the SwitchLBRule VIP from being re-synced when the rule updates. Pass slr.Spec.Vip (or the VIP IP string you validated earlier) instead.

Suggested change
c.enqueueFipsForVipIP(slr.Name)
c.enqueueFipsForVipIP(slr.Spec.Vip)

Copilot uses AI. Check for mistakes.
// switch port (subnet-vpc), OVN generates a priority=30 ARP flow responding with the
// LRP's MAC. Pods then send dl_dst=LRP_MAC, and after DNAT the L2 lookup correctly
// routes packets through the LRP to the backend.
mac = util.GenerateMac()
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For SwitchLBRuleVip, mac is allocated by IPAM (acquireIPAddress/acquireStaticIPAddress) but then overwritten with util.GenerateMac(). This desynchronizes VIP.Status.Mac from the MAC stored in IPAM (see init.go rehydrating IPAM from VIP CR), which can lead to MAC reuse/collisions after restart. Use the MAC returned by IPAM, or if you must generate a MAC here, pass it into the IPAM allocation so both sources stay consistent.

Suggested change
mac = util.GenerateMac()

Copilot uses AI. Check for mistakes.
Comment on lines +472 to +482
if cachedFip.Spec.IPType == util.Vip && cachedFip.Spec.IPName != "" {
if vip, err := c.virtualIpsLister.Get(cachedFip.Spec.IPName); err == nil {
isSwitchLBVip = (vip.Spec.Type == util.SwitchLBRuleVip)
}
}
if !isSwitchLBVip && cachedFip.Annotations[util.FipInternalSubnetAnnotation] != "" {
isSwitchLBVip = true
}

if isSwitchLBVip {
subnetName := cachedFip.Annotations[util.FipInternalSubnetAnnotation]
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cleanOvnFipOvnResources determines isSwitchLBVip by looking up the VIP, but then always uses the stored annotation to derive subnetName. If the VIP exists but the annotation hasn't been patched yet (or patch failed), subnetName will be empty and delFipVipLBRules will skip detaching switch LBs from the subnet, which can cause LB deletion/cleanup to fail or leak. When the VIP lookup succeeds, use vip.Spec.Subnet as a fallback when the annotation is missing.

Suggested change
if cachedFip.Spec.IPType == util.Vip && cachedFip.Spec.IPName != "" {
if vip, err := c.virtualIpsLister.Get(cachedFip.Spec.IPName); err == nil {
isSwitchLBVip = (vip.Spec.Type == util.SwitchLBRuleVip)
}
}
if !isSwitchLBVip && cachedFip.Annotations[util.FipInternalSubnetAnnotation] != "" {
isSwitchLBVip = true
}
if isSwitchLBVip {
subnetName := cachedFip.Annotations[util.FipInternalSubnetAnnotation]
subnetName := ""
if cachedFip.Spec.IPType == util.Vip && cachedFip.Spec.IPName != "" {
if vip, err := c.virtualIpsLister.Get(cachedFip.Spec.IPName); err == nil {
if vip.Spec.Type == util.SwitchLBRuleVip {
isSwitchLBVip = true
// Use the VIP's subnet as a fallback if the annotation is missing.
subnetName = vip.Spec.Subnet
}
}
}
if !isSwitchLBVip && cachedFip.Annotations[util.FipInternalSubnetAnnotation] != "" {
isSwitchLBVip = true
subnetName = cachedFip.Annotations[util.FipInternalSubnetAnnotation]
}
if isSwitchLBVip {

Copilot uses AI. Check for mistakes.
@oilbeater
Copy link
Copy Markdown
Collaborator

@andriishestakov please sign off the commit and add the pr description.

OVN's ls_in_pre_lb (priority 110) bypasses switch load-balancers for
traffic arriving from a logical router port, so a regular dnat_and_snat
NAT rule that rewrites the external EIP to a switch_lb_vip VIP results in
the VIP's switch LB never firing for external/node-originated traffic.

Fix by creating dedicated per-FIP load-balancers instead of NAT rules:

Router LB (fip-<name>-<proto>): attached to the VPC router, handles
  external and node→EIP:port traffic by directly load-balancing to pod
  backends.  This fires at LR priority 120, before any NAT rule (100).

Switch LB (fip-<name>-<proto>-sw): attached to the VIP's subnet switch,
  handles same-subnet pod→EIP:port (hairpin) traffic with
  options:hairpin_snat_ip set to the EIP so reply traffic traverses the
  router and conntrack unwinds the DNAT correctly.

An ARP-proxy dnat_and_snat NAT rule (no logical_port) is kept on the VPC
router so the external LRP responds to ARP for the EIP.

Additional changes:
- cleanOvnFipOvnResources(): shared helper for deletion in both the
  update (finalizer removal) and delete handlers; detects switch_lb_vip
  FIPs via VIP type or the stored FipInternalSubnetAnnotation fallback so
  cleanup works even after the VIP CR is gone.
- enqueueFipsForVipIP(): re-queues OvnFips whose VIP IP matches (v4 or
  v6) when SwitchLBRule endpoints change; callers now correctly pass
  slr.Spec.Vip (the IP) instead of slr.Name.
- vip.go: switch_lb_vip LSP is created without IP in OVN addresses and
  uses the IPAM-allocated MAC (not a freshly generated one) to keep IPAM
  and OVN in sync.  VIP IP is instead added to the LRP switch port's
  arp_proxy option so OVN generates priority-30 ARP flows using the LRP
  MAC, ensuring post-DNAT L2 lookup routes packets through the LRP.
- AddLogicalSwitchPortArpProxyIP / RemoveLogicalSwitchPortArpProxyIP:
  new OVN NB helpers for managing the arp_proxy IP list on an LSP.
- SetLoadBalancerHairpinSnatIP: new OVN NB helper to set/clear
  options:hairpin_snat_ip on a load balancer while preserving other keys.

Signed-off-by: Andrii Shestakov <andrii.shestakov@gcore.com>
@SkalaNetworks
Copy link
Copy Markdown
Member

I'd love to review this PR but it doesn't say what it does or what it is trying to fix.

@zbb88888
Copy link
Copy Markdown
Collaborator

zbb88888 commented Mar 28, 2026

Extend OvnFip controller logic to create per-FIP router/switch LBs for SwitchLB VIP targets and re-enqueue affected FIPs on SwitchLBRule / EndpointSlice changes.

@andriishestakov Hi, Thanks for the PR! but i have some opinions!

How about extending a new router LB and switch LB CRD for the LB things? Using OVN FIP is not a clean way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working size:XL This PR changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants