KubeSpan connection broken after network outage #13160

Description

@gjabell

Bug Report

Description

I have the following setup for a small Talos cluster:

  • 3 control-plane/worker nodes on a private home network (i.e. non-public IPs)
  • 1 worker node in Hetzner cloud (with a public IP)

I'm using KubeSpan to connect everything together; the cloud node has a firewall that allows 51820/UDP, which satisfies the requirement of one node being publicly accessible.

Until recently this setup worked flawlessly. However, since a brief network outage, one of the controllers cannot contact the worker and vice versa: running talosctl get kubespanpeerstatuses on both sides shows the other node as "down". The other two controllers connect successfully to the worker, and all controllers connect to each other via KubeSpan.
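
The two-sided check described above can be sketched as a small script. The node addresses are placeholders (assumptions, not the real cluster IPs), and `kubespanendpoints` is an extra KubeSpan resource included here only as a suggestion for comparing what each side has discovered. The script just prints the commands unless `RUN=1` is set and `talosctl` is installed.

```shell
#!/bin/sh
# Placeholder addresses (assumptions): replace with the affected controller
# and the Hetzner worker.
CONTROLLER="${CONTROLLER:-192.168.1.10}"
WORKER="${WORKER:-203.0.113.10}"

# Print each command; execute it only when RUN=1 and talosctl is available.
run() {
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ] && command -v talosctl >/dev/null 2>&1; then
    "$@"
  fi
}

for node in "$CONTROLLER" "$WORKER"; do
  # Peer state as seen from this node (the broken pair reports "down").
  run talosctl -n "$node" get kubespanpeerstatuses
  # Endpoints this node currently knows for its peers.
  run talosctl -n "$node" get kubespanendpoints
done
```

Running it once per side makes it easier to compare the endpoint each node is trying to use for the peer that shows as "down".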

I've tried the following:

  • Restarting each node
  • Adding a KubeSpan filter on the controllers to publish only the private IP addresses, since the public IP is not reachable anyway
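
A minimal sketch of such an endpoint filter, written as a machine config patch. The subnet and file name are hypothetical placeholders, and the field layout (`machine.network.kubespan.filters.endpoints`) is assumed to match the Talos version in use, so it should be checked against the configuration reference before applying.

```shell
#!/bin/sh
# Hypothetical values: adjust the subnet to the real home network.
PRIVATE_CIDR="${PRIVATE_CIDR:-192.168.1.0/24}"
PATCH_FILE="${PATCH_FILE:-kubespan-filter.yaml}"

# Write a patch that advertises only addresses inside PRIVATE_CIDR
# as KubeSpan endpoints.
cat > "$PATCH_FILE" <<EOF
machine:
  network:
    kubespan:
      filters:
        endpoints:
          - ${PRIVATE_CIDR}
EOF

# Show the generated patch for review.
cat "$PATCH_FILE"

# To apply it to a controller (not executed here):
#   talosctl -n <controller-ip> patch machineconfig --patch @kubespan-filter.yaml
```

Writing the patch to a file first keeps the change reviewable and lets the same file be applied to each controller in turn.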

I imagine this is some strange firewall issue, but it is odd that two of the three controllers are able to connect successfully.

Logs

Nothing of note in the logs except some warnings, which I assume happen during startup as the peers are established:

user: warning: [2026-04-20T12:01:29.409948029Z]: [talos] reconfigured wireguard link {"component": "controller-runtime", "controller": "network.LinkSpecController", "link": "kubespan", "peers": 3}

Are there other logs which I should check?

Environment

  • Talos version: v1.12.6
  • Kubernetes version: v1.35.2
  • Platform: Bare-metal + Hetzner (cloud)
