Automatic open #7003
cjen1-msft
started this conversation in
Design
-
This scheme still has a possible fork. This all assumes that no manual DR by the operators takes place, that node identities are guaranteed to be unique, and that no logs are truncated.
-
Extending the discussion from #6985, another automation direction is to add the capacity for a CCF network to automatically recover from a disaster.
The aim is to give the system an automated process that picks the 'best' recovering node when multiple nodes recover, and then has all other nodes try to join that node using the auto-join protocol described in #6985.
Assumptions
There is an external scheduler which can restart nodes, and this scheduler restarts the nodes on the same hardware as previously, allowing for local unsealing.
This scheduler is relatively simple and just restarts the cluster when a sufficient time has passed without the network being live.
Protocol
For a more precise specification of this protocol, see the TLA+ and Stateright models in the modelling-autoopen branch of cjen1-msft's ccf fork.
When nodes restart they try to Join the previous active network.
If they are unsuccessful they switch to Recovering and perform the following protocol.
Nodes periodically broadcast the highest transaction id (txid) in their ledger, effectively gossiping the state of their ledgers.
When a node has heard from each of the other nodes it expects to be in its cluster, or a timeout has fired, it sends a vote to the node with the highest txid, with ties broken by node identity.
If any node receives votes from a majority of nodes, it notifies all other nodes that it is transitioning to Open, and then transitions to Open.
The other nodes, on receiving this notification, restart themselves using a temporary Join configuration file which points to the opened node's network.
Successful path
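The voting and opening steps on the successful path can be sketched roughly as follows. This is a minimal illustration, not the CCF implementation: the names `choose_vote` and `winner` are hypothetical, a txid is modelled as a `(view, seqno)` tuple compared lexicographically, and ties are assumed to go to the larger node id, since the post does not say which direction the tie-break runs.

```python
from collections import Counter
from typing import Dict, Optional, Tuple

# Assumption: a txid is modelled as a (view, seqno) pair, compared lexicographically.
TxId = Tuple[int, int]


def choose_vote(known: Dict[str, TxId]) -> str:
    """Pick the node to vote for: highest txid, ties broken by node id.

    Assumption: the larger node id wins a tie.
    """
    return max(known, key=lambda node: (known[node], node))


def winner(votes: Dict[str, str], cluster_size: int) -> Optional[str]:
    """Return the node that holds votes from a strict majority, if any."""
    if not votes:
        return None
    candidate, count = Counter(votes.values()).most_common(1)[0]
    return candidate if count > cluster_size // 2 else None


# Every node has heard the same gossip, so every node votes the same way.
gossip = {"n0": (2, 10), "n1": (2, 15), "n2": (2, 15)}
votes = {node: choose_vote(gossip) for node in gossip}
opened = winner(votes, cluster_size=len(gossip))
```

Here `n1` and `n2` tie on txid, the assumed tie-break selects `n2`, and all three votes land on it, so `n2` would notify the others and transition to Open.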
Deadlock cases
So long as a majority of nodes vote for the same node, the system will open using that node, and not deadlock.
Equivalently, even if some nodes are dead (so the timeout has to fire), there is no deadlock provided a majority of the nodes have gossiped among themselves before the timeout fires.
However, if the timeout triggers before the nodes have sufficiently gossiped, the network could deadlock as shown below.
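A small sketch of the three situations described here may help. It is a toy model under stated assumptions, not the protocol's implementation: `tally` is a hypothetical helper, txids are `(view, seqno)` tuples, ties go to the larger node id, and each voter's view is simply the set of txids it had heard when it voted.

```python
from collections import Counter

# Assumption: hypothetical three-node cluster with (view, seqno) txids.
ledgers = {"n0": (2, 10), "n1": (2, 15), "n2": (2, 12)}


def tally(views):
    """views maps each voter to the txids it had heard when it voted.

    Returns the node with a strict majority of the whole cluster, or None.
    """
    votes = Counter(
        max(seen, key=lambda n: (seen[n], n)) for seen in views.values()
    )
    leader, count = votes.most_common(1)[0]
    return leader if count > len(ledgers) // 2 else None


# Everyone gossiped with everyone before voting.
full = {n: dict(ledgers) for n in ledgers}

# n2 is dead, but n0 and n1 gossiped with each other before the timeout.
majority_only = {
    n: {m: ledgers[m] for m in ("n0", "n1")} for n in ("n0", "n1")
}

# The timeout fired before any gossip arrived: each node only knows itself.
early_timeout = {n: {n: ledgers[n]} for n in ledgers}

opened_full = tally(full)
opened_majority = tally(majority_only)
opened_early = tally(early_timeout)
```

In this model the first two cases open on `n1`, while in the third case every node votes for itself, no node reaches a majority, and `tally` returns `None`, which corresponds to the deadlock.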
Alternatives
The key tradeoff here is between the risk of a deadlock (no open networks) and the risk of a fork (multiple open networks).
There are several other points in this tradeoff space.
One alternative is to allow nodes to vote for every node which has a higher txid in their ledger.
This will generally result in multiple nodes opening their networks.
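To see why this alternative tends to fork, consider the toy tally below. It rests on one reading of the rule, which is an assumption: each node approves itself and every node whose txid is strictly higher than its own. The helper `approvals` and the example ledgers are hypothetical.

```python
from collections import Counter

# Assumption: hypothetical three-node cluster with (view, seqno) txids.
ledgers = {"n0": (2, 10), "n1": (2, 12), "n2": (2, 15)}


def approvals(voter, seen):
    """Nodes this voter approves: itself plus any node with a higher txid."""
    own = seen[voter]
    return [n for n, txid in seen.items() if n == voter or txid > own]


# Every node has heard from every other node before voting.
tally = Counter(
    target for voter in ledgers for target in approvals(voter, ledgers)
)

majority = len(ledgers) // 2 + 1
opened = sorted(n for n, count in tally.items() if count >= majority)
```

With these ledgers `n1` collects two approvals and `n2` collects three, so both reach a majority and both would open: no deadlock, but two open networks.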
Another alternative is to use several rounds of communication to try to reach consensus on the recovering node (similarly to one of the standard consensus primitives).
Although, by virtue of using a consensus primitive, this approach can avoid deadlock and ensure a single open network, the delay before recovery is only probabilistically bounded (livelock vs deadlock), and the implementation complexity is higher.