Describe the bug
two zenoh routers do not unregister each other's liveliness tokens when the link between them is lost abruptly.
i traced this to zenoh::net::routing::hat::router::token::Hat::unregister_node_tokens: it is successfully called as unregister_node_tokens(<expired ZID>), but by then self.net() has already had the graph pruned, so remove_link doesn't find any ZIDs to unpropagate tokens for.
i'm going to look into a patch now; i'll update this issue if i find the root cause.
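to make the suspected ordering problem concrete, here is a minimal toy model. It is not zenoh code: `ToyRouter`, `prune`, and `unregister_tokens` are hypothetical names invented for this sketch, and it only assumes what is described above, namely that token cleanup looks the peer up in a graph that may already have been pruned.

```rust
use std::collections::{HashMap, HashSet};

// Toy model only: none of these types or functions are zenoh's.
struct ToyRouter {
    /// Peers currently present in the (hypothetical) network graph.
    graph: HashSet<String>,
    /// Liveliness tokens registered per peer.
    tokens: HashMap<String, Vec<String>>,
}

impl ToyRouter {
    fn new() -> Self {
        ToyRouter { graph: HashSet::new(), tokens: HashMap::new() }
    }

    fn add_peer(&mut self, zid: &str, token: &str) {
        self.graph.insert(zid.to_string());
        self.tokens.entry(zid.to_string()).or_default().push(token.to_string());
    }

    /// Drop a peer from the graph, as happens when its link expires.
    fn prune(&mut self, zid: &str) {
        self.graph.remove(zid);
    }

    /// Return the tokens that would be un-propagated (DELETE) for `zid`.
    /// The lookup goes through the graph, so a pruned peer yields nothing.
    fn unregister_tokens(&mut self, zid: &str) -> Vec<String> {
        if !self.graph.contains(zid) {
            return Vec::new(); // peer already gone: its tokens stay stale
        }
        self.tokens.remove(zid).unwrap_or_default()
    }
}

/// Ordering suspected in this issue: prune first, then unregister.
fn deletes_when_pruned_first() -> Vec<String> {
    let mut r = ToyRouter::new();
    r.add_peer("zid-b", "zid-b");
    r.prune("zid-b");
    r.unregister_tokens("zid-b")
}

/// Opposite ordering: unregister while the peer is still in the graph.
fn deletes_when_unregistered_first() -> Vec<String> {
    let mut r = ToyRouter::new();
    r.add_peer("zid-b", "zid-b");
    let deleted = r.unregister_tokens("zid-b");
    r.prune("zid-b");
    deleted
}

fn main() {
    println!("pruned first: {:?}", deletes_when_pruned_first());
    println!("unregistered first: {:?}", deletes_when_unregistered_first());
}
```

in this model, pruning first leaves nothing for the cleanup to find, which would match the missing DELETE. Whether the real fix is to reorder the calls or to look tokens up independently of the graph is still open.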
To reproduce
running the following program on two devices connected by a Thunderbolt 5 cable (one macOS, one Linux, in case that's relevant)
use zenoh::{Result, Wait};

#[tokio::main]
async fn main() -> Result<()> {
    // Run as a router that listens on all interfaces and dials both devices.
    let mut cfg = zenoh::Config::default();
    cfg.insert_json5("mode", "\"router\"")?;
    cfg.insert_json5("listen/endpoints", "[\"tcp/[::]:44444\"]")?;
    cfg.insert_json5(
        "connect/endpoints",
        "[
            \"tcp/fe80::<device A LLA>%iface:44444\",
            \"tcp/fe80::<device B LLA>%iface:44444\"
        ]",
    )?;
    let session = zenoh::open(cfg).wait()?;

    // Declare a liveliness token named after this session's ZID.
    let _tok = session
        .liveliness()
        .declare_token(session.zid().to_string())
        .wait()?;

    // Print every liveliness change, including tokens that already exist.
    session
        .liveliness()
        .declare_subscriber("*")
        .history(true)
        .callback(|sample| println!("{}: {}", sample.kind(), sample.key_expr()))
        .background()
        .wait()?;

    // wait until ctrl c
    tokio::signal::ctrl_c().await?;
    Ok(())
}
produces on A
PUT: <zid A>
PUT: <zid B>
...ctrl c on B...
DELETE: <zid B>
as expected, however when pulling the cable,
PUT: <zid A>
PUT: <zid B>
...cable pulled...
no such DELETE is ever propagated.
System info
- Platform: NixOS 25.5, macOS 26.3
- CPU: AMD AI 9 HX 370, Apple M4
- zenoh commit 77507c1, latest main at time of writing
- also seen on release v1.9.0