[mariadb-replication-and-ha] Missing: Flow control — one slow node stalls the entire cluster

## Summary

The skill does not mention Galera flow control. This is a defining and
frequently misunderstood cluster behavior that causes operators and agents
to misdiagnose performance problems and apply the wrong fix.

## The problem

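When a Galera node falls behind in applying write sets (because of disk I/O
pressure, resource contention, or a heavy query), its receive queue grows.
Once the queue exceeds the `gcs.fc_limit` threshold, the node sends flow
control pause messages to the rest of the cluster. All nodes then stop
replicating new write sets, which stalls commits cluster-wide until the slow
node's queue drains.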

The result: the cluster's write throughput is limited by the slowest node,
not the average. A single underprovisioned or struggling node silently
drags down the entire cluster.
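
The effect can be illustrated with a deliberately simplified model. This is a sketch, not Galera's implementation: it assumes a single slow node, discrete time ticks, and a pause/resume rule in which the node pauses the cluster when its queue exceeds `fc_limit` and resumes it once the queue falls to `fc_limit * fc_factor`. The names mirror the real `gcs.fc_limit` and `gcs.fc_factor` provider options, but the values used here are illustrative, and `simulate_flow_control` is a hypothetical helper.

```python
def simulate_flow_control(offered_per_tick, apply_per_tick, ticks,
                          fc_limit=16, fc_factor=0.5):
    """Toy model of one slow node throttling cluster commits.

    Returns (committed_write_sets, fraction_of_ticks_paused).
    All parameter values are illustrative assumptions.
    """
    queue = 0          # slow node's receive queue length
    paused = False     # whether the cluster is currently paused
    committed = 0
    paused_ticks = 0

    for _ in range(ticks):
        if paused:
            paused_ticks += 1
        else:
            # The cluster commits new write sets; they land in the slow node's queue.
            queue += offered_per_tick
            committed += offered_per_tick

        # The slow node applies what it can this tick.
        queue = max(0, queue - apply_per_tick)

        # Pause above the limit, resume once the queue has drained far enough.
        if queue > fc_limit:
            paused = True
        elif queue <= fc_limit * fc_factor:
            paused = False

    return committed, paused_ticks / ticks


if __name__ == "__main__":
    # Cluster offers 10 write sets per tick, slow node applies only 2.
    committed, paused_fraction = simulate_flow_control(10, 2, ticks=1000)
    print(f"committed per tick: {committed / 1000:.2f}")
    print(f"fraction of time paused: {paused_fraction:.2f}")
```

In this model the sustained commit rate settles near the slow node's apply rate regardless of how fast the other nodes are, which is the behavior described above.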

Agents have no knowledge of this behavior and consistently suggest causes
like network latency or primary overload when the actual cause is a single
lagging node elsewhere in the cluster.

## What agents should know

```sql
-- Fraction of time replication was paused by flow control since the last FLUSH STATUS:
SHOW STATUS LIKE 'wsrep_flow_control_paused';
-- 0.0 means never paused; values near 1.0 mean paused nearly 100% of the time

-- Number of flow control pause messages this node has sent:
SHOW STATUS LIKE 'wsrep_flow_control_sent';
-- A count that is non-zero and growing on a node means that node is the bottleneck

-- Average length of this node's receive queue:
SHOW STATUS LIKE 'wsrep_local_recv_queue_avg';
-- Values well above 0 indicate it cannot keep up with incoming write sets
```
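
Because each node reports only its own counters, finding the bottleneck means running the queries above on every node and comparing the results. The sketch below assumes the values have already been collected into a dictionary keyed by node name. The helper `find_flow_control_sources`, the node names, the sample numbers, and the `recv_queue_threshold` default are all hypothetical; only the status variable names come from Galera.

```python
def find_flow_control_sources(status_by_node, recv_queue_threshold=1.0):
    """Rank nodes that look like the source of flow control pressure.

    status_by_node maps a node name to its SHOW STATUS values (as strings,
    the way a client would return them). A node is a suspect if it has sent
    any flow control pauses or its average receive queue exceeds the
    (assumed) threshold. Returns (node, sent, recv_queue_avg) tuples,
    worst offender first.
    """
    suspects = []
    for node, status in status_by_node.items():
        sent = int(status.get("wsrep_flow_control_sent", 0))
        recv_avg = float(status.get("wsrep_local_recv_queue_avg", 0.0))
        if sent > 0 or recv_avg > recv_queue_threshold:
            suspects.append((node, sent, recv_avg))

    # Sort by pauses sent, then by receive queue length, descending.
    suspects.sort(key=lambda s: (s[1], s[2]), reverse=True)
    return suspects


if __name__ == "__main__":
    # Made-up sample values for a three-node cluster.
    sample = {
        "node-a": {"wsrep_flow_control_sent": "0",
                   "wsrep_local_recv_queue_avg": "0.02"},
        "node-b": {"wsrep_flow_control_sent": "0",
                   "wsrep_local_recv_queue_avg": "0.05"},
        "node-c": {"wsrep_flow_control_sent": "412",
                   "wsrep_local_recv_queue_avg": "37.6"},
    }
    for node, sent, recv_avg in find_flow_control_sources(sample):
        print(f"{node}: pauses sent={sent}, recv queue avg={recv_avg}")
```

Comparing counters across nodes matters because `wsrep_flow_control_paused` will look similar everywhere during a stall, while `wsrep_flow_control_sent` and `wsrep_local_recv_queue_avg` point at the specific node that is behind.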

## Suggested addition

A flow control subsection covering:
- What flow control is and what triggers it
- The implication that cluster throughput is bounded by the slowest node
- `wsrep_flow_control_paused` and `wsrep_flow_control_sent` as
  diagnostic metrics
- How to identify which node is the source of flow control pressure

---

Reviewed from an operations perspective by Jana Korapala, Founder and CEO,
DBaasNow. DBaasNow is a platform agnostic database lifecycle control plane
adding MariaDB to its catalog in Q3 2026.
