@estellesoulard
Contributor

@estellesoulard estellesoulard commented Aug 20, 2025

Perform the consul leave on secondaries before the leader.
Ensure the cluster failure tolerance is sufficient before proceeding with the next node.

SUMMARY
  • The current rolling restart runs on hosts in inventory order, regardless of leader status. The Consul documentation recommends restarting the leader last, for a good reason: if we restart the leader first, the newly elected leader is itself restarted later in the run, so we cause 2 switchovers in quick succession.
    -> I added a check on Consul leadership and a reordering of hosts before the rolling restart. This change aims to be non-invasive: if there is no leader, the initial host order is used.

  • The current rolling restart waits for consul info to respond. However, a response does not mean that the node has properly rejoined the cluster as an active voter, and if we proceed with the next consul leave in such a state, we end up with an unbalanced cluster.
    -> I replaced the consul info check with a check on the 'Failure Tolerance' status, which should be >=1 before proceeding.
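
For reviewers, the two changes above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function names are hypothetical, and it assumes the status text contains a line of the form `Failure Tolerance: <n>`. Which command produces that text is left open here.

```python
import re
from typing import List, Optional, Sequence


def order_leader_last(hosts: Sequence[str], leader: Optional[str]) -> List[str]:
    """Return hosts in their original order, with the leader moved to the end.

    If no leader is known, or the leader is not in the list, the original
    order is returned unchanged (the non-invasive fallback).
    """
    if leader is None or leader not in hosts:
        return list(hosts)
    return [h for h in hosts if h != leader] + [leader]


# Assumed line format: "Failure Tolerance: <integer>", any spacing after the colon.
_TOLERANCE_RE = re.compile(r"^\s*Failure Tolerance:\s*(\d+)\s*$", re.MULTILINE)


def failure_tolerance(status_text: str) -> Optional[int]:
    """Extract the failure tolerance from status text, or None if absent."""
    match = _TOLERANCE_RE.search(status_text)
    return int(match.group(1)) if match else None


def safe_to_proceed(status_text: str, minimum: int = 1) -> bool:
    """True only when the failure tolerance was found and is >= minimum."""
    tolerance = failure_tolerance(status_text)
    return tolerance is not None and tolerance >= minimum
```

A missing or unparsable value is treated as "not safe", so the restart loop would keep waiting rather than move on to the next node.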

Perform the consul leave on secondaries before leader
Ensure the cluster failure tolerance is sufficient before proceeding
with the next node
@estellesoulard estellesoulard changed the title Better leave_restart_consul [DRAFT] Better leave_restart_consul Aug 22, 2025
@estellesoulard estellesoulard force-pushed the master branch 2 times, most recently from a1b187f to 4a11b2e Compare August 22, 2025 12:13
@estellesoulard estellesoulard changed the title [DRAFT] Better leave_restart_consul Better leave_restart_consul Aug 25, 2025
@bastienwirtz
Contributor

Nice improvement of the rolling restart 👍
