SnapCap TCP Connection Reliability Improvements #2381
|
Thanks. This would affect any INDI driver, not just SnapCap. What would lead to this exactly? WiFi/Network Controller drops the connection or what? |
|
The driver is for a home-made SnapCap dust cover. The cover is built around an ESP8266 and connects over WiFi. From time to time the ESP8266 disconnects from the network or reboots. When that happens, the connection is lost for good, because the driver never tries to reconnect. In an automated session this means the session fails, since the cap will neither open nor close when required. |
|
Hi again @knro, does this PR need refinement? Could you please advise on what the next step is? Thank you in advance 👍 |
|
Sorry for the delay. I don't think the driver level is the right place to do this. This issue should be handled at the plugin level for TCP/UDP connections and then delegated to the concrete drivers to decide how to reconnect, etc. Let me think more about it, since this change needs to be at the INDI base level. |
|
Thank you! I would like to help solve the TCP reconnect issue; how can I get involved? In fact, I have also been working on a solution in the Connection::TCP class (for the Dome interface), but I think it might require more discussion. I will share it in another PR and we can discuss it there. |
|
Exactly, it needs to happen at the Connection::TCP class so that all drivers benefit from this. |
|
I will open a new PR with my current approach. |
|
Here it is #2390 |
SnapCap TCP Connection Reliability Improvements
Problem Statement
SnapCap communication can degrade when:
Without recovery logic, manual disconnect/reconnect is required from the client.
Proposed Design
The driver now uses a reconnect abstraction:
- `SnapCap::ReconnectInterface` in `drivers/auxiliary/snapcap.h`
- `SnapCapReconnect` implementation in `drivers/auxiliary/snapcap.cpp`

Reconnect is non-blocking and driven from `TimerHit()`, so command handlers do not sleep-loop while recovering.

Configurable policy from INDI UI (`RECONNECT_POLICY`)
- Exposed in `CONNECTION_TAB` as property `RECONNECT_POLICY`.
- `FAILURES_BEFORE_RECONNECT` shows as Max Failures
- `DELAY_MS` shows as Base delay (ms)
- Range for `FAILURES_BEFORE_RECONNECT`: `MIN_FAILURES_BEFORE_RECONNECT`..`MAX_FAILURES_BEFORE_RECONNECT`, default `2`
- Range for `DELAY_MS`: `MIN_RECONNECT_DELAY_MS`..`MAX_RECONNECT_DELAY_SETTING_MS`, default `500`
- Handled in `ISNewNumber()` and persisted with `saveConfig(ReconnectNP)`.

Exponential backoff with cap
- Computed by `computeReconnectDelayMs(attemptCount)`.
- Based on `DELAY_MS`, doubles by attempt, capped by `MAX_RECONNECT_DELAY_MS` (`4000` ms).
- Later attempts wait `2x`, `4x`, `8x`, ... of `DELAY_MS` (with cap).

Future work
Use this same strategy for other components that communicate over TCP, like the `rolloffino`.
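
To make the reconnect policy above more concrete, here is a minimal standalone sketch of capped exponential backoff combined with a timer-driven, non-blocking retry loop. It is not the code from this PR: the `ReconnectScheduler` class, its method names, the extra `baseDelayMs` parameter on `computeReconnectDelayMs`, and the 0-based attempt numbering are all assumptions made for illustration. Only the constant name, the 4000 ms cap, and the defaults of `2` failures and `500` ms come from the description.

```cpp
#include <algorithm>
#include <cstdint>

// Cap taken from the PR description; defining it here is an assumption.
constexpr uint32_t MAX_RECONNECT_DELAY_MS = 4000;

// Delay before reconnect attempt `attemptCount` (assumed 0-based):
// baseDelayMs, then 2x, 4x, 8x, ... capped at MAX_RECONNECT_DELAY_MS.
// The second parameter is hypothetical; the driver presumably reads DELAY_MS itself.
uint32_t computeReconnectDelayMs(uint32_t attemptCount, uint32_t baseDelayMs)
{
    if (baseDelayMs == 0)
        return 0;
    uint64_t delay = baseDelayMs;
    for (uint32_t i = 0; i < attemptCount && delay < MAX_RECONNECT_DELAY_MS; ++i)
        delay *= 2;
    return static_cast<uint32_t>(std::min<uint64_t>(delay, MAX_RECONNECT_DELAY_MS));
}

// Hypothetical tick-driven reconnect state; not the PR's SnapCapReconnect class.
class ReconnectScheduler
{
public:
    ReconnectScheduler(uint32_t failuresBeforeReconnect, uint32_t baseDelayMs)
        : m_threshold(failuresBeforeReconnect), m_baseDelayMs(baseDelayMs) {}

    // Record one failed command; arm a reconnect once the threshold is reached.
    void onCommandFailure(uint64_t nowMs)
    {
        if (m_pending)
            return;
        if (++m_failures >= m_threshold)
        {
            m_pending = true;
            m_attempt = 0;
            m_nextAttemptMs = nowMs + computeReconnectDelayMs(m_attempt, m_baseDelayMs);
        }
    }

    // A successful command clears the failure counter.
    void onCommandSuccess() { m_failures = 0; }

    // Meant to be called from a periodic timer (TimerHit() in the driver).
    // Never sleeps: tries to connect at most once per call, and only when due.
    template <typename ConnectFn>
    void tick(uint64_t nowMs, ConnectFn tryConnect)
    {
        if (!m_pending || nowMs < m_nextAttemptMs)
            return;
        if (tryConnect())
        {
            m_pending = false;
            m_failures = 0;
        }
        else
        {
            ++m_attempt;
            m_nextAttemptMs = nowMs + computeReconnectDelayMs(m_attempt, m_baseDelayMs);
        }
    }

    bool reconnectPending() const { return m_pending; }
    uint64_t nextAttemptMs() const { return m_nextAttemptMs; }

private:
    uint32_t m_threshold;
    uint32_t m_baseDelayMs;
    uint32_t m_failures = 0;
    uint32_t m_attempt = 0;
    uint64_t m_nextAttemptMs = 0;
    bool m_pending = false;
};
```

With the defaults from the description (`2` failures, `500` ms base delay), this sketch would wait 500, 1000, 2000, and then 4000 ms between attempts. Because all waiting is expressed as a "not before" timestamp checked on each tick, command handlers can return immediately while recovery is in progress.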