@@ -13,6 +13,7 @@ vtctldclient EmergencyReparentShard <keyspace/shard>
### Options

```
--allow-split-brain-promotion Allow ERS to proceed when two leading candidates have incomparable Combined GTID positions (suspected split-brain). Off by default. Operator escape hatch — accepts that the losing side's unique GTIDs will become errant.
--expected-primary string Alias of a tablet that must be the current primary in order for the reparent to be processed.
-h, --help help for EmergencyReparentShard
-i, --ignore-replicas strings Comma-separated, repeated list of replica tablet aliases to ignore during the emergency reparent.
```
@@ -82,6 +82,21 @@ This command performs the following actions:
- On the primary-elect tablet, insert a row in the `reparent_journal` table and then update the `PrimaryAlias` property of the global shard object.
- In parallel on each replica, excluding the old primary, set the new primary as the replication source and wait for the inserted row to replicate to the replica tablets.

#### Split-brain detection

Added split-brain detection and --allow-split-brain-promotion flag documentation based on PR #18707 which introduces upfront split-brain detection in filterAndCheckUniform() and the operator escape hatch flag.

Source: vitessio/vitess#18707


For GTID-based shards, ERS detects suspected split-brain by comparing the `Combined` (relay log) GTID positions of the leading candidates. When two or more candidates have incomparable GTID positions (neither tablet's position is a superset of the other), ERS aborts with a `FAILED_PRECONDITION` error that names the diverged tablets. This prevents ERS from silently promoting one side and leaving the other side's unique GTIDs to become errant.
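
The superset test described above can be sketched as follows. This is a minimal illustration under a simplifying assumption, not the Vitess implementation: the `position` type, `contains`, and `incomparable` are hypothetical names, and each source is modeled as a single gap-free range starting at 1, whereas real GTID sets can hold several intervals per source.

```go
package main

import "fmt"

// position is a deliberately simplified stand-in for a GTID set: it maps a
// source server UUID to the highest transaction number seen from that source,
// assuming each source's range starts at 1 with no gaps.
type position map[string]int64

// contains reports whether a includes every transaction in b.
func contains(a, b position) bool {
	for uuid, last := range b {
		if a[uuid] < last {
			return false
		}
	}
	return true
}

// incomparable reports whether neither position is a superset of the other,
// which is the condition the text describes as a suspected split-brain.
func incomparable(a, b position) bool {
	return !contains(a, b) && !contains(b, a)
}

func main() {
	// Hypothetical positions; the UUIDs are shortened for readability.
	x := position{"uuid-a": 100, "uuid-b": 5}
	y := position{"uuid-a": 100, "uuid-c": 3}
	z := position{"uuid-a": 90}

	fmt.Println(incomparable(x, y)) // each side has transactions the other lacks
	fmt.Println(incomparable(x, z)) // x is a superset of z, so they are comparable
}
```

In this model, a candidate that is merely behind (like `z`) is still comparable; only candidates that each hold transactions the other lacks count as diverged.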

If you encounter this error and know which side to keep, use the `--allow-split-brain-promotion` flag to proceed. The flag converts the abort into a warning and lets ERS continue; the non-promoted side's unique GTIDs become errant after promotion. When using this flag:

- Use `--new-primary` to specify which tablet to promote, ensuring deterministic side selection.
- Consider using `--ignore-replicas` to exclude tablets on the side you want to discard from the candidate pool.
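
As a rough sketch of how these flags combine, the helper below assembles the argument list for such an invocation. The function name `ersArgs`, the keyspace/shard `commerce/0`, and the tablet aliases are hypothetical; only flags named in this document are used, and the program prints the command instead of running it.

```go
package main

import (
	"fmt"
	"strings"
)

// ersArgs builds the argument list for a vtctldclient invocation that
// overrides split-brain detection, pins the tablet to promote, and
// optionally excludes tablets on the side being discarded.
func ersArgs(keyspaceShard, newPrimary string, discard []string) []string {
	args := []string{
		"EmergencyReparentShard",
		"--allow-split-brain-promotion",
		"--new-primary", newPrimary,
	}
	if len(discard) > 0 {
		// --ignore-replicas accepts a comma-separated list of aliases.
		args = append(args, "--ignore-replicas", strings.Join(discard, ","))
	}
	return append(args, keyspaceShard)
}

func main() {
	// Hypothetical keyspace/shard and tablet aliases.
	args := ersArgs("commerce/0", "zone1-0000000101",
		[]string{"zone2-0000000201", "zone2-0000000202"})
	fmt.Println("vtctldclient " + strings.Join(args, " "))
}
```

A real invocation would also carry whatever connection flags your deployment needs to reach vtctld.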

#### Partial relay-log-apply tolerance

Added partial relay-log-apply tolerance documentation based on PR #18707 which implements the waitForAllRelayLogsToApply() short-circuit behavior for GTID-based shards.

Source: vitessio/vitess#18707

For GTID-based shards, ERS tolerates partial relay-log-apply failures. As long as at least one tablet at the leading `Combined` GTID position successfully applies its relay logs, ERS proceeds. Lagging or stuck replicas no longer block the operation, making ERS more resilient in degraded scenarios.

For non-GTID flavors (FilePos, MariaDB), the previous behavior is preserved: ERS requires every candidate to successfully apply relay logs.
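
The GTID-based behavior amounts to "proceed on the first success, fail only if everyone fails". The sketch below illustrates that pattern for the tablets in the leading group; `waitForAnySuccess` and `errAllFailed` are hypothetical names, and this is not the code from `waitForAllRelayLogsToApply()`. A production version would also pass a context so that stuck work can be cancelled or timed out.

```go
package main

import (
	"errors"
	"fmt"
)

// errAllFailed is returned when no candidate applies its relay logs.
var errAllFailed = errors.New("no candidate applied its relay logs")

// waitForAnySuccess runs every apply function concurrently and returns nil as
// soon as one succeeds. It returns errAllFailed only if all of them fail.
func waitForAnySuccess(applies []func() error) error {
	// Buffered so goroutines that finish after we return do not block forever.
	results := make(chan error, len(applies))
	for _, apply := range applies {
		go func(apply func() error) { results <- apply() }(apply)
	}
	for range applies {
		if err := <-results; err == nil {
			return nil
		}
	}
	return errAllFailed
}

func main() {
	stuck := make(chan struct{})
	defer close(stuck) // release the stuck goroutine on exit

	applies := []func() error{
		func() error { <-stuck; return nil },               // stuck replica
		func() error { return errors.New("apply failed") }, // failed replica
		func() error { return nil },                        // healthy replica
	}
	// The healthy replica is enough; the stuck one does not block the result.
	fmt.Println(waitForAnySuccess(applies) == nil)
}
```

The non-GTID behavior corresponds to the opposite rule: wait for every result and fail on the first error.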

### Metrics

Metrics are available on the `/debug/vars` pages of VTOrc and VTCtld for the reparent operations that they execute. Corresponding Prometheus-compatible metrics are available at `/metrics`.
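
One way to read a single variable out of a `/debug/vars` response is sketched below, assuming the endpoint returns one JSON object, as Go's `expvar` package does. The helper name `lookupVar` is hypothetical, and the sample body (including the `commerce.0` key format and the value) is made up for illustration, not captured output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// lookupVar extracts one named variable from a /debug/vars response body.
// It returns the variable's raw JSON and whether it was present.
func lookupVar(body []byte, name string) (json.RawMessage, bool) {
	var vars map[string]json.RawMessage
	if err := json.Unmarshal(body, &vars); err != nil {
		return nil, false
	}
	raw, ok := vars[name]
	return raw, ok
}

func main() {
	// Made-up sample body: the per-keyspace/shard key format and the value
	// are assumptions for illustration only.
	body := []byte(`{"EmergencyReparentSplitBrainOverrides":{"commerce.0":0}}`)
	if raw, ok := lookupVar(body, "EmergencyReparentSplitBrainOverrides"); ok {
		fmt.Println(string(raw))
	}
}
```

In practice the body would come from an HTTP GET against the VTOrc or VTCtld `/debug/vars` page.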
@@ -91,6 +106,9 @@ Metrics are available on the `/debug/vars` pages of VTOrc and VTCtld for the rep
| `planned_reparent_counts` | Number of times PlannedReparentShard has been run. Available dimensions are keyspace, shard and the result of the operation. |
| `emergency_reparent_counts` | Number of times EmergencyReparentShard has been run. Available dimensions are keyspace, shard and the result of the operation. |
| `reparent_shard_operation_timings` | Timings of reparent shard operations indexed by the type of operation. |
| `EmergencyReparentFilteredCandidates` | Number of candidates excluded from the relay-log wait during ERS because their `Combined` position was behind the leading group. Keyed by keyspace and shard. |
| `EmergencyReparentRelayLogFailedCandidates` | Number of candidates that failed to apply relay logs during ERS. Keyed by keyspace and shard. |
| `EmergencyReparentSplitBrainOverrides` | Number of split-brain detections bypassed by `--allow-split-brain-promotion` during ERS. Keyed by keyspace and shard. Stays at zero unless an operator has deliberately invoked the escape hatch. |

Added three new metrics (EmergencyReparentFilteredCandidates, EmergencyReparentRelayLogFailedCandidates, EmergencyReparentSplitBrainOverrides) based on PR #18707 where they are defined in go/vt/vtctl/reparentutil/emergency_reparenter.go.

Source: vitessio/vitess#18707

## External Reparenting
