[bug] machine reconnect after omni downtime #638

Open
@githubcdr

Description

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

When upgrading Omni, I sometimes notice that SideroLink connections stay down for a long time. Nodes are not available during this downtime.

Expected Behavior

Nodes should repair (restore) their connections to Omni after an unexpected disconnect. I would expect nodes to at least attempt a reconnect on failure.
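To make the expected behavior concrete, here is a minimal sketch of a retry loop with capped exponential backoff and jitter. It is only an illustration of "reconnect on failure" and is not how Talos or SideroLink implements reconnection; `reconnect_with_backoff`, the `connect` callback, and all parameter names and defaults are hypothetical.

```python
import random
import time
from typing import Callable


def reconnect_with_backoff(
    connect: Callable[[], bool],
    max_attempts: int = 8,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call connect() until it returns True or max_attempts is exhausted.

    `connect` is a hypothetical callback that tries to re-establish the
    link and reports success as a boolean.
    """
    for attempt in range(max_attempts):
        if connect():
            return True
        if attempt == max_attempts - 1:
            break  # no point in waiting after the final attempt
        # Exponential backoff capped at max_delay, with full jitter so that
        # many nodes do not retry in lockstep after the server comes back.
        delay = min(max_delay, base_delay * 2 ** attempt)
        sleep(random.uniform(0, delay))
    return False
```

The jitter matters in this scenario because all nodes lose the server at the same moment, so without it they would all retry at the same moments as well.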

Steps To Reproduce

Bring Omni 0.42.3 down for a few minutes, then restart it. Some nodes remain unavailable/disconnected afterwards.

{"level":"warn","ts":1726598097.5886366,"caller":"device/send.go:138","msg":"peer(jJtQ…3034) - Failed to send handshake initiation: no known endpoint for peer","component":"server","component":"siderolink"}
{"level":"warn","ts":1726598098.365197,"caller":"device/send.go:138","msg":"peer(gYOs…LsGU) - Failed to send handshake initiation: no known endpoint for peer","component":"server","component":"siderolink"}
{"level":"warn","ts":1726598098.4972,"caller":"device/send.go:138","msg":"peer(OyBh…3QWc) - Failed to send handshake initiation: no known endpoint for peer","component":"server","component":"siderolink"}
{"level":"warn","ts":1726598098.510401,"caller":"device/send.go:138","msg":"peer(aXTs…pziA) - Failed to send handshake initiation: no known endpoint for peer","component":"server","component":"siderolink"}
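When many nodes are affected, it can help to pull the list of failing peers out of the log. The following is a rough sketch that assumes the log is available as one JSON object per line in the format shown above; the helper name `failed_handshake_peers` is made up for this example, and the peer labels are kept exactly as logged (including the `…` truncation).

```python
import json
import re
from datetime import datetime, timezone

# Matches the warning text seen in the log lines above.
PEER_RE = re.compile(
    r"peer\((?P<peer>[^)]+)\) - Failed to send handshake initiation"
)


def failed_handshake_peers(lines):
    """Map each peer label to the UTC time of its most recent failure."""
    last_seen = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(entry, dict) or "ts" not in entry:
            continue
        match = PEER_RE.search(str(entry.get("msg", "")))
        if match:
            # "ts" is treated as Unix epoch seconds, as in the sample above.
            last_seen[match.group("peer")] = datetime.fromtimestamp(
                entry["ts"], tz=timezone.utc
            )
    return last_seen
```

Passing an open log file (or `sys.stdin`) as `lines` returns a dictionary of peer labels and last failure times, which can then be compared against the list of nodes that did not come back.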

What browsers are you seeing the problem on?

No response

Anything else?

Sending a reboot command to the node sometimes restores the connection.

Metadata

Assignees

No one assigned

    Labels

    bug: Something isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests