Catch up joining nodes in the PASSIVE state before promotion to ACTIVE #302
Description
Copycat servers that join the cluster as ACTIVE members are currently added to the configuration in the ACTIVE state immediately. This can cause a loss of availability in certain cases while the new node is being caught up to the leader. For example, if a second node is added to a one-node cluster and immediately becomes a voting member, the quorum grows to two, so new writes to the cluster are blocked until the new node has caught up. For this reason, nodes are typically added in a PROMOTABLE state, in which they are caught up before being promoted to full voting members of the cluster.
To be done correctly, this should be implemented by adding an additional PROMOTABLE node type/state to the Copycat cluster. PASSIVE nodes don't necessarily fit this role well, since they receive only committed entries from followers. PROMOTABLE nodes should run in the passive state and receive all entries from the leader. Once a PROMOTABLE node has caught up to the leader's commitIndex, either the leader should promote it or it should promote itself via the leader.
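
The leader-driven variant of this proposal could look roughly like the sketch below. It is illustrative only and not Copycat's actual API: `MemberType`, `LeaderConfigSketch`, `join`, `onAppendAck`, and `quorumSize` are hypothetical names, and it assumes the leader tracks a per-member match index. A real implementation would also have to commit the promotion as a configuration change through the log, and would need to cope with commitIndex advancing while the new node is still catching up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical member types; PROMOTABLE is the new type proposed in this issue. */
enum MemberType { PASSIVE, PROMOTABLE, ACTIVE }

/** Leader-side view of the configuration (a sketch, not Copycat's real classes). */
class LeaderConfigSketch {
    private final Map<String, MemberType> types = new LinkedHashMap<>();
    private final Map<String, Long> matchIndex = new LinkedHashMap<>();
    private long commitIndex;

    LeaderConfigSketch(String leaderId) {
        // The leader itself is a voting member.
        types.put(leaderId, MemberType.ACTIVE);
        matchIndex.put(leaderId, 0L);
    }

    /** A joining node starts as PROMOTABLE, so it does not count toward the quorum yet. */
    void join(String id) {
        types.put(id, MemberType.PROMOTABLE);
        matchIndex.put(id, 0L);
    }

    /** Stand-in for the leader committing entries up to the given index. */
    void advanceCommitIndex(long index) {
        commitIndex = Math.max(commitIndex, index);
    }

    /** Called when a member acknowledges entries up to {@code index}. */
    void onAppendAck(String id, long index) {
        MemberType type = types.get(id);
        if (type == null) {
            return; // unknown member, ignore
        }
        matchIndex.merge(id, index, Long::max);
        // Promote only once the node has caught up to the leader's commitIndex.
        if (type == MemberType.PROMOTABLE && matchIndex.get(id) >= commitIndex) {
            types.put(id, MemberType.ACTIVE);
        }
    }

    MemberType typeOf(String id) {
        return types.get(id);
    }

    /** Majority of voting (ACTIVE) members only. */
    int quorumSize() {
        int voters = 0;
        for (MemberType type : types.values()) {
            if (type == MemberType.ACTIVE) {
                voters++;
            }
        }
        return voters / 2 + 1;
    }
}
```

With this shape, a one-node cluster keeps a quorum of one while the second node is catching up, and the quorum grows to two only after the promotion, which is the availability property the issue asks for.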