Allow graceful termination of ObjectDiffusion from the client side -> make sure it regains agency frequently enough

There are currently a couple of issues with the design of ObjectDiffusion w.r.t. the miniprotocol technical requirements, as identified in the meeting with the network team on 2025/12/17 (cc @coot @crocodile-dentist):

**1**. There is no way for the client side to terminate gracefully while it is blocked waiting for the answer to a blocking request for IDs. Indeed, the server has agency in this situation, but is not allowed to answer with an empty list of objects. The only way out is to wait for a "long timeout".
    - But it is a "requirement" for a node to be able to gracefully terminate its client-side miniprotocols at will within ~30s, which would suggest "long timeout < 30s"
    - But due to the parametrization of Peras (the duration of a round, etc.), for certificate diffusion with Peras in its "happy path", the server may not have any new object available within 90s, which suggests "long timeout > 90s" (otherwise a blocking request will almost always fail with a timeout)
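To make the conflict between the two bullets explicit, here is a minimal sketch that treats them as bounds on the long timeout. The function name and the constants are hypothetical; the 30s and 90s values are the approximate figures quoted above, not protocol parameters.

```python
from typing import Optional, Tuple

# Approximate figures quoted in this issue, in seconds (not protocol constants).
GRACEFUL_SHUTDOWN_BUDGET_S = 30  # the client must be able to terminate within ~30s
HAPPY_PATH_QUIET_PERIOD_S = 90   # the server may have no new object for ~90s


def feasible_long_timeouts(
    shutdown_budget_s: float, quiet_period_s: float
) -> Optional[Tuple[float, float]]:
    """Return the open interval (low, high) of admissible long timeouts.

    The timeout must exceed the quiet period and stay below the shutdown
    budget. Returns None when no value satisfies both constraints.
    """
    low, high = quiet_period_s, shutdown_budget_s
    return (low, high) if low < high else None


# With the figures above the interval is empty: no single timeout works.
print(feasible_long_timeouts(GRACEFUL_SHUTDOWN_BUDGET_S, HAPPY_PATH_QUIET_PERIOD_S))
```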

The design proposed in https://github.com/tweag/cardano-peras/issues/144 would not help here, since the client would still be blocking (in state `StObjIdsMustReplay`) on receiving an answer from the server for a long (~ 90s) span of time.

**2**. If a Peras cooldown happens, then the ObjectDiffusion miniprotocols (for votes and certs) will fail ungracefully at some point, since the duration of consecutive Peras cooldowns is not really bounded and could last several minutes or hours. It would be better, when a cooldown is detected, to have a way to terminate the Peras miniprotocols gracefully (and re-initiate them later).

**Potential solutions for 1:**

**S1-A**. Make the server answer with "keep-alive" objects every X seconds, so that we can use a relatively short timeout even for blocking requests. But this would consume extra network bandwidth (this is my "naive" potential solution)
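A minimal sketch of the server side of S1-A, assuming the pending object IDs sit in a thread-safe queue. `KEEP_ALIVE_ID`, `answer_blocking_request` and the queue-based mempool are hypothetical names for illustration; none of them come from the existing ObjectDiffusion code.

```python
import queue
from typing import List

# Hypothetical placeholder ID that the client would recognise and discard.
KEEP_ALIVE_ID = "keep-alive"


def answer_blocking_request(pending: "queue.Queue[str]", interval_s: float) -> List[str]:
    """Answer a blocking ID request with a non-empty list.

    Waits up to interval_s seconds for a real object ID. If none arrives,
    replies with a single keep-alive ID so that the reply is still non-empty
    and the client regains agency.
    """
    try:
        first = pending.get(timeout=interval_s)
    except queue.Empty:
        return [KEEP_ALIVE_ID]
    ids = [first]
    # Drain whatever else is already available, without waiting any further.
    while True:
        try:
            ids.append(pending.get_nowait())
        except queue.Empty:
            return ids
```

With this shape the client never waits longer than roughly `interval_s` plus network latency before it regains agency, at the cost of one extra reply per quiet interval.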

**S1-B**. Do not use a blocking ID request, but instead use a simple polling system based on non-blocking requests (pseudo-blocking request = retry a non-blocking request at most $n$ times, with a $smallDelay$ between attempts, and if we still haven't received anything, sleep for $longDelay$ and repeat) (this has been proposed by the network team)
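The polling scheme of S1-B could look roughly like the sketch below. All names (`pseudo_blocking_request`, `request_ids_non_blocking`, `should_stop`) are hypothetical, and the stop check stands in for whatever signal would lead the client to send `MsgDone`.

```python
import time
from typing import Callable, List, Optional


def pseudo_blocking_request(
    request_ids_non_blocking: Callable[[], List[str]],
    should_stop: Callable[[], bool],
    n: int,
    small_delay: float,
    long_delay: float,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[List[str]]:
    """Poll with non-blocking requests until some IDs arrive.

    Returns the first non-empty reply, or None if should_stop() became true
    at a point where the client holds agency (it could then send MsgDone).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    while True:
        for attempt in range(n):
            if should_stop():
                return None
            ids = request_ids_non_blocking()
            if ids:
                return ids
            if attempt < n - 1:
                sleep(small_delay)  # short pause between attempts of one burst
        if should_stop():
            return None
        sleep(long_delay)  # the client keeps agency while it sleeps
```

In this sketch the stop signal is only observed between sleeps. A real implementation would more likely wait on the signal with a timeout, so that termination is not delayed by up to `long_delay`.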

The goal of both solutions is to make sure the client regains agency frequently enough, as it is the only one who can trigger a graceful termination with `MsgDone`. Both solutions could also heavily impact the caught-up detection we want to implement in https://github.com/tweag/cardano-peras/issues/144

**Potential solutions for 2:**
This one is easier, I think, at least in theory. The component in charge of deciding whether we vote or not should have a way to inform the network layer that we need to terminate these protocols because of a cooldown. I'm not exactly sure how the restart would work, though.
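One possible shape for that signal, as a sketch only: the voting component sets a shared flag, and the client consults it each time it has agency. `ClientAction` and `next_client_action` are hypothetical names, and the restart logic is deliberately left out since it is still an open question.

```python
import threading
from enum import Enum


class ClientAction(Enum):
    SEND_DONE = "MsgDone"
    REQUEST_IDS = "request IDs"


def next_client_action(cooldown_detected: threading.Event) -> ClientAction:
    """Decide what the client does next; call this whenever it has agency."""
    if cooldown_detected.is_set():
        return ClientAction.SEND_DONE
    return ClientAction.REQUEST_IDS


# The voting component would own the flag and set it when a cooldown starts.
cooldown = threading.Event()
print(next_client_action(cooldown))  # ClientAction.REQUEST_IDS
cooldown.set()
print(next_client_action(cooldown))  # ClientAction.SEND_DONE
```

This only helps if the client regains agency often enough to observe the flag, which is why it depends on one of the solutions for issue 1.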

In any case, this seems to require changes that depart significantly from the [Peras design document](https://tweag.github.io/cardano-peras/peras-design.pdf). Typically, making a non-blocking request for IDs when we have no unacknowledged ones is currently a protocol violation.
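To illustrate why polling clashes with the current rule, here is a sketch of the check as stated above. The function name is hypothetical, and it encodes only the one rule mentioned in this issue, not the full set of ObjectDiffusion validity conditions.

```python
def violates_current_rule(blocking: bool, unacknowledged: int) -> bool:
    """True if the request is a non-blocking ID request sent while the
    client has no unacknowledged IDs, which is currently a violation."""
    return (not blocking) and unacknowledged == 0


# A polling client that is fully caught up has nothing unacknowledged,
# so every non-blocking poll it sends would currently be rejected.
print(violates_current_rule(blocking=False, unacknowledged=0))  # True
print(violates_current_rule(blocking=True, unacknowledged=0))   # False
```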

> [!NOTE] 
> This was detected while trying to unblock https://github.com/IntersectMBO/ouroboros-network/pull/5267, but it isn't clear whether solving this issue will completely unblock that PR.
