Late joining subscriber doesn't get data #976

@justink-hadrian

Generated by Generative AI

Claude Opus 4.7

Operating System:

Ubuntu 24.04

ROS version or commit hash:

kilted

RMW implementation (if applicable):

rmw_zenoh

RMW Configuration (if applicable):

I have an Ubuntu 24.04 workstation using kilted with pixi and robostack, running the zenoh router without any special config. I also have a Jetson Orin with a hand-compiled kilted, using the attached config (scrubbed of the host IP address).

scrubbed_zenoh_client_session_config.txt

Client library (if applicable):

No response

'ros2 doctor --report' output

$ pixi run ros2 doctor --report

NETWORK CONFIGURATION
inet : 127.0.0.1
inet4 : ['127.0.0.1']
inet6 : ['::1']
netmask : 255.0.0.0
device : lo
flags : UP,LOOPBACK,RUNNING
mtu : 65536
inet : XXX
inet4 : ['XXX']
ether : XXX
inet6 : ['XXX']
netmask : 255.255.254.0
device : enp128s31f6
flags : UP,BROADCAST,RUNNING,MULTICAST
mtu : 1500
broadcast : XXX
inet : XXX
inet4 : ['XXX']
ether : XXX
inet6 : ['XXX']
netmask : 255.255.252.0
device : wlp131s0f0
flags : UP,BROADCAST,RUNNING,MULTICAST
mtu : 1500
broadcast : XXX
inet : 172.17.0.1
inet4 : ['172.17.0.1']
ether : XXX
netmask : 255.255.0.0
device : docker0
flags : UP,BROADCAST,MULTICAST
mtu : 1500
broadcast : 172.17.255.255

PACKAGE VERSIONS
py_trees_ros_interfaces : latest=2.1.1, local=2.1.1
libstatistics_collector : latest=2.0.1, local=2.0.1
ament_pep257 : latest=0.19.3, local=0.19.2
osrf_pycommon : latest=2.1.6, local=2.1.6
ament_cpplint : latest=0.19.3, local=0.19.2
tf_transformations : latest=1.1.1, local=1.1.1
ament_cmake_gen_version_h : latest=2.7.5, local=2.7.5
zenoh_cpp_vendor : latest=0.6.6, local=0.6.6
ament_cmake_ros : latest=0.14.7, local=0.14.7
ament_cmake_ros_core : latest=0.14.7, local=0.14.7
pybind11_vendor : latest=3.2.0, local=3.2.0
tool_build : latest=N/A, local=0.0.0
abb_arm_ros : latest=N/A, local=0.0.0
robot_state_publisher : latest=3.4.3, local=3.4.3
tf2_ros : latest=0.41.7, local=0.41.6
uncrustify_vendor : latest=3.1.0, local=3.1.0
ament_uncrustify : latest=0.19.3, local=0.19.2
orocos_kdl_vendor : latest=0.7.1, local=0.7.1
map_msgs : latest=2.5.0, local=2.5.0
rosidl_cmake : latest=4.9.6, local=4.9.6
pluginlib : latest=5.6.3, local=5.6.2
ament_clang_format : latest=0.19.3, local=0.19.2
rcpputils : latest=2.13.5, local=2.13.5
abb_irb4600_support : latest=N/A, local=1.5.0
ament_cmake_core : latest=2.7.5, local=2.7.5
ros2service : latest=0.38.3, local=0.38.3
ament_cmake_cpplint : latest=0.19.3, local=0.19.2
ament_cppcheck : latest=0.19.3, local=0.19.2
ament_cmake_copyright : latest=0.19.3, local=0.19.2
control_msgs : latest=6.9.0, local=6.8.0
rviz_rendering : latest=15.0.12, local=15.0.12
rosidl_typesupport_interface : latest=4.9.6, local=4.9.6
rosidl_generator_cpp : latest=4.9.6, local=4.9.6
spdlog_vendor : latest=1.7.0, local=1.7.0
ros2pkg : latest=0.38.3, local=0.38.3
rmw_security_common : latest=7.8.2, local=7.8.2
ros2bag : latest=0.32.0, local=0.32.0
image_transport : latest=6.1.3, local=6.1.3
std_srvs : latest=5.5.2, local=5.5.2
rcl_yaml_param_parser : latest=10.1.4, local=10.1.4
rosidl_generator_type_description : latest=4.9.6, local=4.9.6
ament_flake8 : latest=0.19.3, local=0.19.2
weld_inspect : latest=N/A, local=0.0.0
ament_cmake_python : latest=2.7.5, local=2.7.5
ament_cmake_uncrustify : latest=0.19.3, local=0.19.2
ament_cmake_cppcheck : latest=0.19.3, local=0.19.2
mcap_vendor : latest=0.32.0, local=0.32.0
ament_cmake_lint_cmake : latest=0.19.3, local=0.19.2
rosbag2_storage_sqlite3 : latest=0.32.0, local=0.32.0
ament_lint_common : latest=0.19.3, local=0.19.2
launch_testing_ament_cmake : latest=3.8.7, local=3.8.7
sensor_msgs_py : latest=5.5.2, local=5.5.2
launch_pytest : latest=3.8.7, local=3.8.7
tf2_sensor_msgs : latest=0.41.7, local=0.41.6
rosidl_typesupport_fastrtps_cpp : latest=3.8.2, local=3.8.2
rcl : latest=10.1.4, local=10.1.4
ament_cmake_export_libraries : latest=2.7.5, local=2.7.5
rclpy : latest=9.1.5, local=9.1.5
ament_cmake : latest=2.7.5, local=2.7.5
ament_index_cpp : latest=1.11.4, local=1.11.3
rosidl_parser : latest=4.9.6, local=4.9.6
message_filters : latest=7.1.8, local=7.1.6
rosidl_default_runtime : latest=1.7.2, local=1.7.2
rosidl_typesupport_introspection_c : latest=4.9.6, local=4.9.6
rosbag2_compression_zstd : latest=0.32.0, local=0.32.0
rviz_assimp_vendor : latest=15.0.12, local=15.0.12
joint_state_publisher_gui : latest=2.4.1, local=2.4.1
action_msgs : latest=2.3.2, local=2.3.1
rclcpp_components : latest=29.5.8, local=29.5.7
ament_cmake_xmllint : latest=0.19.3, local=0.19.2
gz_utils_vendor : latest=0.2.2, local=0.2.2
interactive_markers : latest=2.7.1, local=2.7.1
sros2_cmake : latest=0.15.5, local=0.15.5
tracetools : latest=8.6.0, local=8.6.0
rosidl_runtime_cpp : latest=4.9.6, local=4.9.6
rmw_connextdds : latest=1.1.1, local=1.1.1
ament_cmake_pep257 : latest=0.19.3, local=0.19.2
rviz_ogre_vendor : latest=15.0.12, local=15.0.12
ament_lint_auto : latest=0.19.3, local=0.19.2
rosx_introspection : latest=2.3.0, local=2.2.0
launch_testing_ros : latest=0.28.5, local=0.28.5
ros_core : latest=0.12.0, local=0.12.0
visualization_msgs : latest=5.5.2, local=5.5.2
hdn_core : latest=N/A, local=0.0.0
ros_workspace : latest=1.0.3, local=1.0.3
rmw_fastrtps_dynamic_cpp : latest=9.3.4, local=9.3.3
ros2multicast : latest=0.38.3, local=0.38.3
ros2cli : latest=0.38.3, local=0.38.3
ament_cmake_target_dependencies : latest=2.7.5, local=2.7.5
rcl_lifecycle : latest=10.1.4, local=10.1.4
rclcpp : latest=29.5.8, local=29.5.7
common_interfaces : latest=5.5.2, local=5.5.2
ament_cmake_export_interfaces : latest=2.7.5, local=2.7.5
keyboard_handler : latest=0.4.0, local=0.4.0
rosidl_generator_rs : latest=0.4.12, local=0.4.11
ament_xmllint : latest=0.19.3, local=0.19.2
nav_msgs : latest=5.5.2, local=5.5.2
ament_lint_cmake : latest=0.19.3, local=0.19.2
ament_cmake_pytest : latest=2.7.5, local=2.7.5
rosidl_generator_c : latest=4.9.6, local=4.9.6
ros2action : latest=0.38.3, local=0.38.3
ament_cmake_export_dependencies : latest=2.7.5, local=2.7.5
std_msgs : latest=5.5.2, local=5.5.2
rmw_implementation : latest=3.0.7, local=3.0.6
octomap : latest=N/A, local=1.10.0
ament_cmake_export_targets : latest=2.7.5, local=2.7.5
rmw_dds_common : latest=5.0.0, local=5.0.0
laser_geometry : latest=2.10.2, local=2.10.2
rosbag2_py : latest=0.32.0, local=0.32.0
rosbag2_interfaces : latest=0.32.0, local=0.32.0
actionlib_msgs : latest=5.5.2, local=5.5.2
rviz_default_plugins : latest=15.0.12, local=15.0.12
sros2 : latest=0.15.5, local=0.15.5
rmw_fastrtps_shared_cpp : latest=9.3.4, local=9.3.3
cm_executors : latest=N/A, local=0.9.1
rosidl_core_runtime : latest=0.3.2, local=0.3.2
file_uploader : latest=N/A, local=0.0.0
ament_cmake_export_link_flags : latest=2.7.5, local=2.7.5
urdf : latest=2.12.3, local=2.12.3
tf2_msgs : latest=0.41.7, local=0.41.6
joint_state_publisher : latest=2.4.1, local=2.4.1
ros2cli_common_extensions : latest=0.4.1, local=0.4.1
ament_index_python : latest=1.11.4, local=1.11.3
rosidl_cli : latest=4.9.6, local=4.9.6
rosidl_pycommon : latest=4.9.6, local=4.9.6
rmw_connextdds_common : latest=1.1.1, local=1.1.1
ament_cmake_export_definitions : latest=2.7.5, local=2.7.5
rosbag2_compression : latest=0.32.0, local=0.32.0
builtin_interfaces : latest=2.3.2, local=2.3.1
resource_retriever : latest=3.7.1, local=3.7.1
ros2run : latest=0.38.3, local=0.38.3
xacro : latest=2.1.1, local=2.1.1
liblz4_vendor : latest=0.32.0, local=0.32.0
ament_copyright : latest=0.19.3, local=0.19.2
rcutils : latest=6.9.10, local=6.9.10
diagnostic_msgs : latest=5.5.2, local=5.5.2
rmw_implementation_cmake : latest=7.8.2, local=7.8.2
ament_lint : latest=0.19.3, local=0.19.2
rmw_test_fixture : latest=0.14.7, local=0.14.7
ament_cmake_gtest : latest=2.7.5, local=2.7.5
rosidl_default_generators : latest=1.7.2, local=1.7.2
kdl_parser : latest=2.12.1, local=2.12.1
rosidl_dynamic_typesupport : latest=0.3.1, local=0.3.1
rti_connext_dds_cmake_module : latest=1.1.1, local=1.1.1
rosbag2_storage : latest=0.32.0, local=0.32.0
ros2plugin : latest=5.6.3, local=5.6.2
hdn_msgs : latest=N/A, local=0.0.0
python_qt_binding : latest=2.3.2, local=2.3.2
occupancy_analysis : latest=N/A, local=0.0.0
rclcpp_action : latest=29.5.8, local=29.5.7
ament_cmake_auto : latest=2.7.5, local=2.7.5
tf2_ros_py : latest=0.41.7, local=0.41.6
rmw_test_fixture_implementation : latest=0.14.7, local=0.14.7
rosbag2 : latest=0.32.0, local=0.32.0
rcl_interfaces : latest=2.3.2, local=2.3.1
eigen3_cmake_module : latest=0.4.0, local=0.4.0
rcl_action : latest=10.1.4, local=10.1.4
rviz_common : latest=15.0.12, local=15.0.12
point_cloud_transport : latest=5.1.6, local=5.1.6
stereo_msgs : latest=5.5.2, local=5.5.2
gz_math_vendor : latest=0.2.7, local=0.2.6
behavior_trees : latest=N/A, local=0.0.0
libcurl_vendor : latest=3.7.1, local=3.7.1
rosidl_runtime_c : latest=4.9.6, local=4.9.6
tf2_geometry_msgs : latest=0.41.7, local=0.41.6
ros2param : latest=0.38.3, local=0.38.3
service_msgs : latest=2.3.2, local=2.3.1
tinyxml2_vendor : latest=0.10.1, local=0.10.1
urdf_parser_plugin : latest=2.12.3, local=2.12.3
rmw_fastrtps_cpp : latest=9.3.4, local=9.3.3
launch_testing : latest=3.8.7, local=3.8.7
point_cloud_accumulator_ros : latest=N/A, local=0.0.0
ament_cmake_export_include_directories : latest=2.7.5, local=2.7.5
rosidl_core_generators : latest=0.3.2, local=0.3.2
rosgraph_msgs : latest=2.3.2, local=2.3.1
ament_cmake_version : latest=2.7.5, local=2.7.5
kuka_iontec_support : latest=N/A, local=0.9.0
zstd_vendor : latest=0.32.0, local=0.32.0
rosidl_dynamic_typesupport_fastrtps : latest=0.4.2, local=0.4.2
rmw : latest=7.8.2, local=7.8.2
launch_ros : latest=0.28.5, local=0.28.5
rclcpp_lifecycle : latest=29.5.8, local=29.5.7
rosbag2_storage_mcap : latest=0.32.0, local=0.32.0
ament_cmake_flake8 : latest=0.19.3, local=0.19.2
unique_identifier_msgs : latest=2.7.0, local=2.7.0
lifecycle_msgs : latest=2.3.2, local=2.3.1
rosbag2_transport : latest=0.32.0, local=0.32.0
ros2doctor : latest=0.38.3, local=0.38.3
rosidl_typesupport_c : latest=3.3.3, local=3.3.3
type_description_interfaces : latest=2.3.2, local=2.3.1
cpp_test : latest=N/A, local=0.0.0
rviz2 : latest=15.0.12, local=15.0.12
ament_cmake_libraries : latest=2.7.5, local=2.7.5
rosidl_typesupport_introspection_cpp : latest=4.9.6, local=4.9.6
rviz_resource_interfaces : latest=15.0.12, local=15.0.12
rmw_zenoh_cpp : latest=0.6.6, local=0.6.6
py_trees_ros : latest=2.4.0, local=2.4.0
trajectory_msgs : latest=5.5.2, local=5.5.2
ament_cmake_include_directories : latest=2.7.5, local=2.7.5
sensor_msgs : latest=5.5.2, local=5.5.2
lmi_laser_ros : latest=N/A, local=0.0.0
rosbag2_cpp : latest=0.32.0, local=0.32.0
composition_interfaces : latest=2.3.2, local=2.3.1
ros_environment : latest=4.3.1, local=4.3.1
launch_xml : latest=3.8.7, local=3.8.7
rosidl_generator_py : latest=0.24.2, local=0.24.2
tf2 : latest=0.41.7, local=0.41.6
rcl_logging_spdlog : latest=3.2.4, local=3.2.4
rpyutils : latest=0.6.3, local=0.6.3
foxglove_bridge : latest=3.3.0, local=3.2.4
ros2node : latest=0.38.3, local=0.38.3
rosidl_typesupport_cpp : latest=3.3.3, local=3.3.3
ros2topic : latest=0.38.3, local=0.38.3
gz_cmake_vendor : latest=0.3.3, local=0.3.3
console_bridge_vendor : latest=1.8.0, local=1.8.0
launch_yaml : latest=3.8.7, local=3.8.7
class_loader : latest=2.8.1, local=2.8.1
libyaml_vendor : latest=1.7.1, local=1.7.1
ament_package : latest=0.17.3, local=0.17.2
statistics_msgs : latest=2.3.2, local=2.3.1
ros2interface : latest=0.38.3, local=0.38.3
rosbag2_storage_default_plugins : latest=0.32.0, local=0.32.0
rcl_logging_interface : latest=3.2.4, local=3.2.4
ros2component : latest=0.38.3, local=0.38.3
sqlite3_vendor : latest=0.32.0, local=0.32.0
ament_cmake_gmock : latest=2.7.5, local=2.7.5
ros2launch : latest=0.28.5, local=0.28.5
rmw_cyclonedds_cpp : latest=4.0.2, local=4.0.2
shape_msgs : latest=5.5.2, local=5.5.2
rosidl_typesupport_fastrtps_c : latest=3.8.2, local=3.8.2
rosidl_runtime_py : latest=0.14.2, local=0.14.1
yaml_cpp_vendor : latest=9.1.0, local=9.1.0
tf2_py : latest=0.41.7, local=0.41.6
ros2lifecycle : latest=0.38.3, local=0.38.3
launch : latest=3.8.7, local=3.8.7
ament_cmake_test : latest=2.7.5, local=2.7.5
rosbag2_manager : latest=N/A, local=0.0.0
rosidl_adapter : latest=4.9.6, local=4.9.6
geometry_msgs : latest=5.5.2, local=5.5.2

PLATFORM INFORMATION
system : Linux
platform info : XXXX
release : XXXX
processor : x86_64

QOS COMPATIBILITY LIST
topic [type] : /chatter [std_msgs/msg/String]
publisher node : _ros2cli_430932
subscriber node : _ros2cli_2576640
compatibility status : OK

RMW MIDDLEWARE
middleware name : rmw_zenoh_cpp

ROS 2 INFORMATION
distribution name : kilted
distribution type : ros2
distribution status : active
release platforms : {'debian': ['bookworm'], 'rhel': ['9'], 'ubuntu': ['noble']}

TOPIC LIST
topic : /chatter
publisher count : 0
subscriber count : 1
topic : /occupancy_analysis_marker
publisher count : 1
subscriber count : 0
topic : /occupancy_analysis_node/transition_event
publisher count : 1
subscriber count : 0
topic : /tf
publisher count : 0
subscriber count : 1
topic : /tf_static
publisher count : 0
subscriber count : 1
topic : /zed/zed_node/point_cloud/cloud_registered
publisher count : 0
subscriber count : 1

Steps to reproduce issue

I wasn't able to reproduce this in a fresh workspace. The steps below reproduced it in a workspace that had foxglove_bridge running, plus a node that subscribed to tf and a point cloud and published a marker. A ZED ros2 wrapper node had also been running previously. The real failure that prompted these steps: after restarting the point cloud subscriber node while the zenoh router kept running, the subscription received no point cloud data, even though foxglove was viewing the live data at the same time.

terminal 1

ros2 run rmw_zenoh_cpp rmw_zenohd

terminal 2 (publisher)

ros2 topic pub /chatter std_msgs/msg/String "data: hi" -r 1

wait ≥ 30 seconds

terminal 3 (subscriber)

ros2 topic echo /chatter std_msgs/msg/String --no-daemon --qos-reliability reliable

Expected behavior

Late-joining subscribers should still receive data from the publisher.

Actual behavior

Late-joining subscribers receive no data from the publisher.

Additional information

Late-joining subscribers receive no data after publisher's Declare interest window — router never propagates new subscription declarations to the publisher's face

Environment

  • ROS 2 Kilted
  • rmw_zenoh_cpp 0.6.6
  • zenoh Rust API v1.7.2 (790faadb43c8f3da356bc5e3abc3ddc4d2eb41e6)
  • Config: mode: peer (clients) connecting to rmw_zenohd over TCP, multicast.enabled: false, routing.interests.timeout: 10000
  • Linux x86_64, Anthropic Claude Opus 4.7 (1M context) used during investigation

Symptom

A subscriber created more than ~10 seconds after a publisher on the same topic receives zero data, even though:

  • both endpoints declare the same data keyexpr,
  • QoS is identical (RELIABLE on both sides, RxO-compatible depths),
  • the router and both peers are alive for the duration of the subscription.

A subscriber created inside the same 10-second window receives data normally.

Repro

terminal 1

ros2 run rmw_zenoh_cpp rmw_zenohd

terminal 2 (publisher)

ros2 topic pub /chatter std_msgs/msg/String "data: hi" -r 1

wait ≥ 30 seconds

terminal 3 (subscriber)

ros2 topic echo /chatter std_msgs/msg/String --no-daemon --qos-reliability reliable

Caveat: reliable triggering appears to require additional state in the router. We reproduced consistently with a longer-running LifecycleNode, foxglove_bridge, and other peers cycling through the router for some time, but could not reliably trigger it with only topic pub + topic echo on a freshly restarted router.
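
The manual steps above could be scripted to sweep the subscriber start delay and locate the boundary. This is a minimal sketch, not something that was run during the investigation. It assumes the router is already running in another terminal, reuses only the ros2 topic pub and ros2 topic echo commands from the repro, and treats any "data:" line printed by echo as proof of delivery. The function names, the listen_s window and the PYTHONUNBUFFERED setting (assumed to be needed so that piped echo output is not held in a buffer) are hypothetical choices.

```python
import os
import subprocess
import time

# Commands copied from the repro steps; the router must already be running.
PUB_CMD = ["ros2", "topic", "pub", "/chatter", "std_msgs/msg/String", "data: hi", "-r", "1"]
ECHO_CMD = ["ros2", "topic", "echo", "/chatter", "std_msgs/msg/String",
            "--no-daemon", "--qos-reliability", "reliable"]


def subscriber_received_data(delay_s, listen_s=15.0, pub_cmd=PUB_CMD, echo_cmd=ECHO_CMD):
    """Start the publisher, wait delay_s, then report whether echo printed a message."""
    # Assumption: unbuffered stdout makes piped echo output visible before the kill.
    env = {**os.environ, "PYTHONUNBUFFERED": "1"}
    pub = subprocess.Popen(pub_cmd, stdout=subprocess.DEVNULL,
                           stderr=subprocess.DEVNULL, env=env)
    try:
        time.sleep(delay_s)
        try:
            done = subprocess.run(echo_cmd, capture_output=True,
                                  timeout=listen_s, env=env)
            raw = done.stdout
        except subprocess.TimeoutExpired as exc:
            # echo normally runs until killed; keep whatever it printed so far.
            raw = exc.stdout or b""
        return b"data:" in raw
    finally:
        pub.terminate()
        pub.wait()


def sweep_delays(delays_s):
    """Run one fresh publisher/subscriber pair per delay; True means data arrived."""
    return {delay: subscriber_received_data(delay) for delay in delays_s}
```

Calling sweep_delays((2, 5, 8, 12, 20, 30)) in the affected workspace would show whether delivery stops near the 10 s mark. Per the caveat above, a freshly restarted router may not show the failure at all.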

Evidence (attached zenoh router trace logs)

zenoh_start_early.txt — working baseline

┌──────────────┬───────────────────────────────────────────────────────────────────────────────────────────────┐
│ Time │ Event │
├──────────────┼───────────────────────────────────────────────────────────────────────────────────────────────┤
│ 21:32:21.799 │ Publisher Face{5} joins │
├──────────────┼───────────────────────────────────────────────────────────────────────────────────────────────┤
│ 21:32:22.497 │ Face{5} Declare interest 15 (0/chatter/.../String_) + MP token ::,10:,... (RELIABLE depth 10) │
├──────────────┼───────────────────────────────────────────────────────────────────────────────────────────────┤
│ 21:32:24.394 │ Subscriber Face{6} joins (+2.6 s, inside the 10 s interest window) │
├──────────────┼───────────────────────────────────────────────────────────────────────────────────────────────┤
│ 21:32:24.920 │ Face{6} Declare subscriber 15 (0/chatter/.../String_) + MS token ::,5:,... (RELIABLE depth 5) │
├──────────────┼───────────────────────────────────────────────────────────────────────────────────────────────┤
│ 21:32:33.587 │ Subscriber finishes (received data, exits) │
└──────────────┴───────────────────────────────────────────────────────────────────────────────────────────────┘

zenoh_no_daemon_sub_late_reliable.txt — failure

┌────────────────────────────────────┬───────────────────────────────────────────────────────────────────────────────────────────────┐
│ Time │ Event │
├────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────┤
│ 21:42:32.426 │ Publisher Face{5} joins │
├────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────┤
│ 21:42:33.116 │ Face{5} Declare interest 15 + MP token ::,10:,... (RELIABLE depth 10) │
├────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────┤
│ 21:42:55.071 │ Subscriber Face{6} joins (+22.6 s, well past 10 s interest window) │
├────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────┤
│ 21:42:55.591 │ Face{6} Declare subscriber 15 (0/chatter/.../String_) + MS token ::,5:,... (RELIABLE depth 5) │
├────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────┤
│ No Face{5}↔Face{6} events for 75 s │ │
├────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────┤
│ 21:44:10 │ Both close without any data routed │
└────────────────────────────────────┴───────────────────────────────────────────────────────────────────────────────────────────────┘

Both runs use an identical data keyexpr and identical reliability on both sides. The only difference is that the failing subscriber arrives outside the publisher's 10 s interest window.
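
As a quick check of the timing claim, the gap between the publisher's Declare interest and the subscriber's Declare subscriber can be computed from the timestamps in the two tables. This sketch only redoes the arithmetic; the function names are illustrative, and the 10 s value is the routing.interests.timeout from the config. These gaps are measured between declarations, so they differ slightly from the join-to-join offsets (+2.6 s and +22.6 s) quoted in the tables.

```python
from datetime import datetime

INTEREST_TIMEOUT_S = 10.0  # routing.interests.timeout: 10000 (milliseconds)

# (Declare interest time, Declare subscriber time) taken from the tables above.
RUNS = {
    "zenoh_start_early": ("21:32:22.497", "21:32:24.920"),
    "zenoh_no_daemon_sub_late_reliable": ("21:42:33.116", "21:42:55.591"),
}


def seconds_between(start, end):
    """Difference between two same-day HH:MM:SS.mmm timestamps, in seconds."""
    fmt = "%H:%M:%S.%f"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


def declaration_gaps(runs=RUNS):
    """Map each run to (gap in seconds, whether the gap is inside the timeout)."""
    gaps = {}
    for name, (interest_time, subscriber_time) in runs.items():
        gap = round(seconds_between(interest_time, subscriber_time), 3)
        gaps[name] = (gap, gap <= INTEREST_TIMEOUT_S)
    return gaps
```

With the values above, the working run has a gap of 2.423 s (inside the window) and the failing run a gap of 22.475 s (outside it).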

Hypothesis

Declare interest answers with the current matching state and then the router does not relay future matching declarations to the requester's face. Late-arriving subscribers therefore never get plumbed into a publisher's outgoing routes. The behavior matches an interest treated as current-only rather than current + future, or an interest that is implicitly closed at routing.interests.timeout.
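
The difference between the two interest semantics named in the hypothesis can be illustrated with a toy model. This is not zenoh's routing code: ToyRouter, its methods, and the shortened keyexpr "0/chatter" are all hypothetical, and keyexpr matching is reduced to string equality. It only shows why a current-only interest would leave a late subscriber invisible to the publisher's face.

```python
class ToyRouter:
    """Toy model of interest handling; NOT zenoh's actual implementation."""

    def __init__(self):
        self.subscribers = {}  # keyexpr -> set of subscriber face ids
        self.interests = []    # (face id, keyexpr) pairs that remain registered
        self.notified = {}     # face id -> subscriber face ids it was told about

    def declare_interest(self, face, keyexpr, wants_future):
        # Answer with the current matching state.
        current = self.subscribers.get(keyexpr, set())
        self.notified.setdefault(face, set()).update(current)
        # Only a current + future interest stays registered for later declarations.
        if wants_future:
            self.interests.append((face, keyexpr))

    def declare_subscriber(self, face, keyexpr):
        self.subscribers.setdefault(keyexpr, set()).add(face)
        # Relay the new declaration to every face with a registered interest.
        for interested_face, interest_key in self.interests:
            if interest_key == keyexpr:
                self.notified.setdefault(interested_face, set()).add(face)


def late_subscriber_is_visible(wants_future):
    """Publisher face 5 declares interest first; subscriber face 6 arrives later."""
    router = ToyRouter()
    router.declare_interest(face=5, keyexpr="0/chatter", wants_future=wants_future)
    router.declare_subscriber(face=6, keyexpr="0/chatter")
    return 6 in router.notified[5]
```

In this model late_subscriber_is_visible(True) holds and late_subscriber_is_visible(False) does not, which mirrors the working and failing traces respectively.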

Suspected setting

routing.interests.timeout: 10000 is the only configurable parameter that obviously matches the observed boundary. The doc comment itself warns: "The expiration of this timeout implies that the discovery protocol might be incomplete, leading to potential loss of messages, queries or liveliness tokens." We did not test whether increasing it actually changes the behaviour.
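
The untested experiment could start from a copy of the session config with a larger timeout. The helper below is a sketch under assumptions: it treats the config as plain text and expects an unquoted block shaped like interests: { ... timeout: 10000, ... }, and with_interests_timeout is a hypothetical name. It does not parse JSON5, so a config laid out differently would raise an error instead of being patched.

```python
import re

# Assumption: the config has an unquoted "interests: { ... timeout: <ms> ... }" block.
_INTERESTS_TIMEOUT = re.compile(r"(interests\s*:\s*\{[^}]*?\btimeout\s*:\s*)\d+")


def with_interests_timeout(config_text, timeout_ms):
    """Return config_text with the interests timeout replaced by timeout_ms."""
    patched, count = _INTERESTS_TIMEOUT.subn(
        lambda match: f"{match.group(1)}{int(timeout_ms)}", config_text, count=1
    )
    if count != 1:
        raise ValueError("no interests timeout entry found in config text")
    return patched
```

The patched text would be written to a new file, and ZENOH_SESSION_CONFIG_URI pointed at that file before starting the publisher and subscriber. Whether the boundary then moves with the new value is exactly what remains to be tested.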

What we ruled out

  • QoS mismatch — same :: prefix in liveliness tokens on both sides; only depths differ (10 vs 5), RxO-compatible.
  • Topic-type discovery — --no-daemon + explicit type bypasses the ros2 daemon graph query path; the failure still occurs.
  • MessageFilter — reproduced with plain rclcpp::Subscription (no tf2_ros::MessageFilter in path).
  • Keyexpr difference — both subscribers register on the identical data keyexpr 0/chatter/.../String_/RIHS01_....

Two attached logs, both scrubbed of private info

  • scrubbed_zenoh_start_early.txt — working baseline (early subscriber)
  • scrubbed_zenoh_no_daemon_sub_late_reliable.txt — failure (RELIABLE / RELIABLE, 22 s late)

scrubbed_zenoh_start_early.txt
scrubbed_zenoh_no_daemon_sub_late_reliable.txt
