My AI thinks packets could be dropped. Is this worth pursuing?
Core issue
The router RX queue depth is fixed at 4:
MAX_RX_FROMRADIO = 4 in src/mesh/Router.cpp:25
Overflow behavior is in Router::enqueueReceivedMessage(). Code:
src/mesh/Router.cpp:151
src/mesh/Router.cpp:154
src/mesh/Router.cpp:156
src/mesh/Router.cpp:158
So overflow is not backpressure or drop-newest; it is explicit drop-oldest.
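To make the drop-oldest behavior concrete, here is a minimal C++17 sketch of a bounded FIFO that evicts its oldest entry when full. It is not the firmware's code: Packet, DropOldestQueue, and the counter are hypothetical names used only for illustration, and the depth of 4 in the usage note simply mirrors MAX_RX_FROMRADIO as quoted above.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <optional>

// Hypothetical stand-in for a received packet; only an id is modeled.
struct Packet {
    uint32_t id;
};

// Bounded FIFO that evicts the oldest entry when full (drop-oldest).
class DropOldestQueue {
  public:
    // A depth of 0 is treated as 1 so the queue can always hold one packet.
    explicit DropOldestQueue(std::size_t depth) : depth_(depth == 0 ? 1 : depth) {}

    // Adds a packet. Returns the evicted packet if one was dropped to make room.
    std::optional<Packet> enqueue(Packet p) {
        std::optional<Packet> evicted;
        if (q_.size() >= depth_) {
            evicted = q_.front(); // the oldest packet is the one that is lost
            q_.pop_front();
            ++dropped_;
        }
        q_.push_back(p);
        return evicted;
    }

    // Removes and returns the oldest remaining packet, if any.
    std::optional<Packet> dequeue() {
        if (q_.empty()) return std::nullopt;
        Packet p = q_.front();
        q_.pop_front();
        return p;
    }

    std::size_t size() const { return q_.size(); }
    std::size_t dropped() const { return dropped_; }

  private:
    std::size_t depth_;
    std::size_t dropped_ = 0;
    std::deque<Packet> q_;
};
```

With a depth of 4, enqueueing packets 1 through 6 without any dequeue leaves packets 3 to 6 in the queue and silently discards 1 and 2. If packet 1 happened to be an ACK, the consumer never sees it.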
Why this can drop routing control traffic
All inbound LoRa packets are fed to the router in promiscuous mode (before packet-class/port handling):
src/mesh/RadioLibInterface.cpp:482
src/mesh/RadioLibInterface.cpp:513
src/mesh/RadioInterface.cpp:955
Other transports also enqueue into the same queue:
src/mqtt/MQTT.cpp:142, src/mqtt/MQTT.cpp:145
src/mesh/udp/UdpMulticastHandler.h:84
There is no type/priority classification at enqueue time, so routing control packets and normal payload packets are treated identically.
Routing logic (sniff/ACK handling) only runs later, after dequeue + decode + module dispatch:
src/mesh/Router.cpp:708
src/mesh/Router.cpp:755
src/modules/RoutingModule.cpp:30, src/modules/RoutingModule.cpp:31
If a packet is dropped at enqueue overflow, it never reaches routing logic.
Why overflow is plausible (not theoretical)
Per-packet processing on dequeue is non-trivial:
src/mesh/Router.cpp:703
src/mesh/Router.cpp:468
src/mesh/Router.cpp:755
Ingress can be bursty:
src/mesh/RadioLibInterface.cpp:482
src/mqtt/MQTT.cpp:142, src/mesh/udp/UdpMulticastHandler.h:84
With queue depth 4, short bursts can evict early packets.
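The burst argument can be checked with a toy model. The sketch below is an assumption-laden illustration, not a measurement: it assumes one packet arrives per tick during a burst and that the consumer finishes one packet every serviceEvery ticks. The function name and the tick-based timing are hypothetical and are not taken from the firmware.

```cpp
#include <cstddef>

// Counts overflow drops for a burst of `burstLen` packets arriving one per
// tick into a queue with `depth` slots, while the consumer removes one packet
// every `serviceEvery` ticks. Whether the queue drops the oldest or the newest
// packet does not change the count, only which packet is lost.
std::size_t countBurstDrops(std::size_t depth, std::size_t burstLen, std::size_t serviceEvery) {
    std::size_t queued = 0;
    std::size_t drops = 0;
    for (std::size_t tick = 1; tick <= burstLen; ++tick) {
        // Arrival: a full queue loses one packet, so occupancy stays the same.
        if (queued >= depth) {
            ++drops;
        } else {
            ++queued;
        }
        // Service: the consumer completes one packet every serviceEvery ticks.
        if (serviceEvery != 0 && tick % serviceEvery == 0 && queued > 0) {
            --queued;
        }
    }
    return drops;
}
```

Under these illustrative inputs, countBurstDrops(4, 8, 4) returns 3, while countBurstDrops(8, 8, 4) returns 0. Real drop rates depend on actual per-packet processing time and ingress timing, which this model does not capture.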
How this manifests in behavior
1) Missed ACK/NAK processing -> unnecessary retransmissions -> MAX_RETRANSMIT failures
If an ACK/NAK packet is dropped before the routing module sees it:
src/mesh/ReliableRouter.cpp:147 to src/mesh/ReliableRouter.cpp:156
src/mesh/NextHopRouter.cpp:290
src/mesh/NextHopRouter.cpp:282 to src/mesh/NextHopRouter.cpp:285
User-facing symptom: “message failed / retries despite good RF conditions”.
2) Lost ACK/reply observations -> degraded next-hop learning and extra flooding
Next-hop learning is updated from ACK/reply observations in:
src/mesh/NextHopRouter.cpp:90 to src/mesh/NextHopRouter.cpp:109
If those packets are dropped at RX queue overflow:
src/mesh/NextHopRouter.cpp:295 to src/mesh/NextHopRouter.cpp:303
Symptom: elevated airtime and route instability even with stable topology.
3) Missed inbound want_ack handling at destination
If an inbound want_ack packet addressed to us is dropped:
ReliableRouter::sniffReceived() path
Relevant ACK generation path:
src/mesh/ReliableRouter.cpp:96 to src/mesh/ReliableRouter.cpp:130
Logging signatures to watch
Overflow event:
"fromRadioQ full, drop oldest!" from src/mesh/Router.cpp:158
Likely downstream effects:
"Sending retransmission ..." src/mesh/NextHopRouter.cpp:290
"Reliable send failed, returning a nak ..." src/mesh/NextHopRouter.cpp:282
"Resetting next hop ..." src/mesh/NextHopRouter.cpp:300
Observability gap
I don’t see a dedicated metric for router RX overflow drops.
Current local stats include duplicates/canceled-relays, but not router RX queue overflow:
num_rx_dupe / num_tx_relay_canceled exported in src/modules/Telemetry/DeviceTelemetry.cpp:140, src/modules/Telemetry/DeviceTelemetry.cpp:141
Additional note
Protocol docs/comments emphasize TX prioritization for ACK/routing (src/mesh/generated/meshtastic/mesh.pb.h:518), but this RX queue has no analogous prioritization and can discard routing control packets under burst load.
Discussion questions
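Is dropping oldest intentional for this queue, or should this be revisited for reliability-sensitive traffic?
Should ACK/routing frames get protected treatment on RX (priority lane or protected subqueue)?
Should we add an explicit router_rx_overflow_drops counter to telemetry/logging?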
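As a starting point for the questions about protected treatment and an overflow counter, here is one possible shape, sketched in C++. Everything in it is hypothetical: RxPacket, the isControl flag, and ProtectedRxQueue do not exist in the firmware, and how a packet would be classified as control traffic before decode is left open. The counter corresponds to the proposed router_rx_overflow_drops metric.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical packet model: isControl marks ACK/NAK/routing traffic.
struct RxPacket {
    uint32_t id;
    bool isControl;
};

// Bounded RX queue that prefers to evict payload packets on overflow and
// counts every overflow drop.
class ProtectedRxQueue {
  public:
    // A depth of 0 is treated as 1 so the queue can always hold one packet.
    explicit ProtectedRxQueue(std::size_t depth) : depth_(depth == 0 ? 1 : depth) {}

    // Returns false if the incoming packet itself was the one dropped.
    bool enqueue(const RxPacket &p) {
        if (q_.size() >= depth_) {
            ++overflowDrops_; // exactly one packet is lost on every overflow
            auto victim = std::find_if(q_.begin(), q_.end(),
                                       [](const RxPacket &x) { return !x.isControl; });
            if (victim != q_.end()) {
                q_.erase(victim); // evict the oldest payload packet first
            } else if (p.isControl) {
                q_.pop_front(); // only control packets queued: drop the oldest
            } else {
                return false; // do not displace control traffic for payload
            }
        }
        q_.push_back(p);
        return true;
    }

    // Copies the oldest remaining packet into `out`; returns false if empty.
    bool dequeue(RxPacket &out) {
        if (q_.empty()) return false;
        out = q_.front();
        q_.pop_front();
        return true;
    }

    std::size_t size() const { return q_.size(); }
    std::size_t overflowDrops() const { return overflowDrops_; }

  private:
    std::size_t depth_;
    std::size_t overflowDrops_ = 0;
    std::deque<RxPacket> q_;
};
```

In this sketch, a control packet queued ahead of a payload burst survives the overflow, and the drop is still counted so it can be exported or logged. The trade-off is a linear scan on overflow, which is cheap at a depth of 4 but is a design choice rather than a recommendation.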