Commit 6429cce

add information about random vtgate load balancer
1 parent 885767d commit 6429cce

File tree

2 files changed

+71
-17
lines changed

content/en/docs/24.0/reference/features/tablet-balancer.md

Lines changed: 67 additions & 14 deletions
@@ -26,9 +26,13 @@ In many cases this approach suffices, since if there are a proportional number o
2626
satisfy the inbound traffic to the vtgates in that cell, then in general the queries will be distributed evenly to
2727
each tablet.
2828

29-
## Balancer Motivation
29+
## Balancer Modes
3030

3131
However, in some topologies, a simple affinity algorithm does not effectively balance the load.
32+
VTGate provides two additional balancer modes to address different topology and traffic patterns:
33+
**prefer-cell** and **random**.
34+
35+
### When Default Policy is Insufficient
3236

3337
As a simple example:
3438

@@ -49,41 +53,90 @@ cell will only receive 1/6 of the queries.
4953
Other topologies that can cause similar pathologies include cases where there may be cells
5054
containing replicas but no local vtgates, and/or cells that have only vtgates but no replicas.
5155

52-
For these topologies, the tabletBalancer proportionally assigns the output flow to each tablet,
53-
preferring the local cell where possible, but only as long as the global query balance is
54-
maintained.
56+
### Prefer-Cell Balancer
5557

56-
## Algorithm
58+
The prefer-cell balancer addresses topologies where tablets and vtgates are unevenly distributed,
59+
but traffic is relatively balanced across vtgate cells. It proportionally assigns the output
60+
flow to each tablet, preferring the local cell where possible, but only as long as the global
61+
query balance is maintained.
5762

58-
To accomplish this goal, the balancer is given:
63+
To accomplish this goal, the prefer-cell balancer is given:
5964

6065
* The list of cells that receive inbound traffic to vtgates (from configuration)
6166
* The local cell where the vtgate exists (from configuration)
6267
* The set of tablets and their cells (learned from discovery)
6368

64-
The model assumes there is an equal probablility of a query coming from each vtgate cell, i.e.
69+
The model assumes there is an equal probability of a query coming from each vtgate cell, i.e.
6570
traffic is effectively load balanced between the cells with vtgates.
6671

6772
Given that information, the balancer builds a simple model to determine how much query load
6873
would go to each tablet if vtgate only routed to its local cell. Then if any tablets are
6974
unbalanced, it shifts the desired allocation away from the local cell preference in order to
7075
even out the query load.
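
The first step of that model, estimating per-tablet load under strict local-cell routing, can be sketched as follows. This is an illustrative calculation only, not the Vitess implementation: the function name `local_only_load`, the cell and tablet names, and the topology (three tablets in one cell, one in another) are all hypothetical, and the sketch assumes equal traffic per vtgate cell as the model does.

```python
from fractions import Fraction


def local_only_load(vtgate_cells, tablets_by_cell):
    """Share of total queries each tablet would receive if every vtgate
    routed only to tablets in its own cell (hypothetical helper)."""
    # The model assumes each vtgate cell receives an equal share of traffic.
    per_cell_traffic = Fraction(1, len(vtgate_cells))
    load = {}
    for cell, tablets in tablets_by_cell.items():
        for tablet in tablets:
            if cell in vtgate_cells:
                # Local traffic is split evenly among the cell's tablets.
                load[tablet] = per_cell_traffic / len(tablets)
            else:
                # No local vtgates, so strict affinity sends nothing here.
                load[tablet] = Fraction(0)
    return load


# Hypothetical topology: both cells have vtgates, tablets are uneven.
topology = {"cell_a": ["a1", "a2", "a3"], "cell_b": ["b1"]}
load = local_only_load(["cell_a", "cell_b"], topology)

# A perfectly balanced allocation gives every tablet the same share.
target = Fraction(1, sum(len(tablets) for tablets in topology.values()))

for tablet, share in load.items():
    print(f"{tablet}: local-only share {share}, balanced target {target}")
```

Any tablet whose local-only share differs from the balanced target is a candidate for shifting allocation away from the local cell, which is the adjustment the paragraph above describes.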
7176

72-
Based on this global model, the vtgate then probabalistically picks a destination for each
77+
Based on this global model, the vtgate then probabilistically picks a destination for each
7378
query to be sent and uses these weights to order the available tablets accordingly.
7479

7580
Assuming each vtgate is configured with and discovers the same information about the topology,
7681
and the input flow is balanced across the vtgate cells (as mentioned above), then each vtgate
77-
should come the the same conclusion about the global flows, and cooperatively should
82+
should come to the same conclusion about the global flows, and cooperatively should
7883
converge on the desired balanced query load.
7984

85+
**When to use prefer-cell mode:**
86+
* Tablets and vtgates are distributed unevenly across cells
87+
* Traffic is relatively balanced across all vtgate cells
88+
* You want to maintain cell affinity where possible to minimize latency
89+
90+
### Random Balancer
91+
92+
The random balancer addresses a different scenario: when application traffic is concentrated
93+
in fewer cells than where database replicas exist. Unlike the prefer-cell balancer, which
94+
assumes equal traffic distribution across vtgate cells, the random balancer makes no assumptions
95+
about traffic patterns.
96+
97+
The random balancer selects tablets with uniform probability (1/N for N available tablets),
98+
completely ignoring cell affinity. This trades off potential latency optimization for a
99+
guaranteed even load distribution across all tablets, regardless of where traffic originates.
100+
101+
**When to use random mode:**
102+
* Application traffic is highly concentrated in specific cells (e.g., 90% in one cell, 10% in another)
103+
* Cross-cell/cross-zone latency is acceptable for your workload
104+
* Avoiding tablet hotspots is more important than minimizing query latency
105+
* You have a single-AZ application deployment with multi-AZ database replicas
106+
107+
**Example scenario:**
108+
109+
```
110+
Cell A: 90% --> vtgates --> randomly select from all tablets (1/4 each)
111+
Cell B: 10% --> vtgates --> randomly select from all tablets (1/4 each)
112+
113+
Result: All 4 tablets receive ~25% of total load, regardless of cell
114+
```
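
The arithmetic behind this scenario can be checked with a short calculation. This is a sketch under stated assumptions rather than Vitess code: the function names are hypothetical, and the split of two tablets per cell is assumed purely for the comparison with cell affinity, since the scenario above only specifies four tablets in total.

```python
from fractions import Fraction


def random_mode_load(traffic_by_cell, tablets):
    """Expected share per tablet when every vtgate picks uniformly
    among all tablets, ignoring cells (hypothetical helper)."""
    n = len(tablets)
    # Each cell's traffic is spread 1/N over all N tablets.
    return {
        tablet: sum(share * Fraction(1, n) for share in traffic_by_cell.values())
        for tablet in tablets
    }


def cell_affinity_load(traffic_by_cell, tablets_by_cell):
    """Expected share per tablet when each vtgate only uses tablets
    in its own cell (hypothetical helper, for comparison)."""
    return {
        tablet: traffic_by_cell[cell] / len(tablets)
        for cell, tablets in tablets_by_cell.items()
        for tablet in tablets
    }


traffic = {"cell_a": Fraction(9, 10), "cell_b": Fraction(1, 10)}
# Assumed layout: two tablets in each cell.
tablets_by_cell = {"cell_a": ["a1", "a2"], "cell_b": ["b1", "b2"]}
all_tablets = [t for tablets in tablets_by_cell.values() for t in tablets]

random_load = random_mode_load(traffic, all_tablets)
affinity_load = cell_affinity_load(traffic, tablets_by_cell)

for tablet in all_tablets:
    print(f"{tablet}: random {random_load[tablet]}, cell affinity {affinity_load[tablet]}")
```

Under these assumptions, uniform selection gives every tablet the same expected share, while strict cell affinity would concentrate most of the load on the tablets in the busier cell.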
115+
116+
With the random balancer, you can optionally use `--balancer-vtgate-cells` to restrict the
117+
tablet pool to specific cells, but it's not required.
118+
80119
## Configuration
81120

82-
To enable the balancer requires the following configuration:
121+
VTGate provides three balancer modes, controlled by the `--vtgate-balancer-mode` flag:
122+
123+
### Balancer Mode Selection
124+
125+
* **`--vtgate-balancer-mode=cell`** (default): Chooses a tablet at random with local cell affinity (the default policy described above)
126+
* **`--vtgate-balancer-mode=prefer-cell`**: Uses the prefer-cell balancer algorithm
127+
* **`--vtgate-balancer-mode=random`**: Uses uniform random selection across all tablets
128+
129+
### Configuration Flags
130+
131+
* **`--vtgate-balancer-mode`**: Specifies which balancer mode to use (cell, prefer-cell, or random). Defaults to `cell`.
132+
133+
* **`--balancer-vtgate-cells`**: Comma-separated list of cells that contain vtgates.
134+
* **Required** for `prefer-cell` mode
135+
* **Optional** for `random` mode (filters tablets to specified cells if provided)
136+
* Ignored for `cell` mode
83137

84-
* `--enable-balancer`: Enables the balancer. **Not enabled by default**
85-
* `--balancer-vtgate-cells`: Specifies the set of cells that contain vtgates
138+
* **`--balancer-keyspaces`**: Comma-separated list of keyspaces for which to use the configured balancer mode. If empty, applies to all keyspaces. This allows gradual rollout of balancer modes.
86139

87-
Optionally this behavior can be restricted only when routing to certain keyspaces as a means of controlling rollout:
140+
### Deprecated Flag
88141

89-
* `--balancer-keyspaces`: Specifies the set of keyspaces for which the balancer should be enabled.
142+
* **`--enable-balancer`**: **(DEPRECATED)** This flag has been replaced by `--vtgate-balancer-mode=prefer-cell`. While still accepted for backwards compatibility, it will be removed in a future release. If you are currently using `--enable-balancer`, migrate to using `--vtgate-balancer-mode=prefer-cell` instead.

content/en/docs/24.0/reference/programs/vtgate/_index.md

Lines changed: 4 additions & 3 deletions
@@ -44,8 +44,8 @@ vtgate \
4444
--allow-kill-statement Allows the execution of kill statement
4545
--allowed-tablet-types strings Specifies the tablet types this vtgate is allowed to route queries to. Should be provided as a comma-separated set of tablet types.
4646
--alsologtostderr log to standard error as well as files
47-
--balancer-keyspaces strings When in balanced mode, a comma-separated list of keyspaces for which to use the balancer (optional)
48-
--balancer-vtgate-cells strings When in balanced mode, a comma-separated list of cells that contain vtgates (required)
47+
--balancer-keyspaces strings Comma-separated list of keyspaces for which to use the balancer (optional). If empty, applies to all keyspaces.
48+
--balancer-vtgate-cells strings Comma-separated list of cells that contain vttablets. For 'prefer-cell' mode, this is required. For 'random' mode, this is optional and filters tablets to those cells.
4949
--bind-address string Bind address for the server. If empty, the server will listen on all available unicast and anycast IP addresses of the local system.
5050
--buffer-drain-concurrency int Maximum number of requests retried simultaneously. More concurrency will increase the load on the PRIMARY vttablet when draining the buffer. (default 1)
5151
--buffer-keyspace-shards string If not empty, limit buffering to these entries (comma separated). Entry format: keyspace or keyspace/shard. Requires --enable-buffer=true.
@@ -73,7 +73,7 @@ vtgate \
7373
--discovery-high-replication-lag-minimum-serving duration Threshold above which replication lag is considered too high when applying the min_number_serving_vttablets flag. (default 2h0m0s)
7474
--discovery-low-replication-lag duration Threshold below which replication lag is considered low enough to be healthy. (default 30s)
7575
--emit-stats If set, emit stats to push-based monitoring and stats backends
76-
--enable-balancer Enable the tablet balancer to evenly spread query load for a given tablet type
76+
--enable-balancer (DEPRECATED: use --vtgate-balancer-mode instead) Enable the tablet balancer to evenly spread query load for a given tablet type
7777
--enable-buffer Enable buffering (stalling) of primary traffic during failovers.
7878
--enable-buffer-dry-run Detect and log failover events, but do not actually buffer requests.
7979
--enable-direct-ddl Allow users to submit direct DDL statements (default true)
@@ -263,6 +263,7 @@ vtgate \
263263
-v, --version print binary version
264264
--vmodule vModuleFlag comma-separated list of pattern=N settings for file-filtered logging
265265
--vschema-ddl-authorized-users string List of users authorized to execute vschema ddl operations, or '%' to allow all users.
266+
--vtgate-balancer-mode string Tablet balancer mode (options: cell, prefer-cell, random). Defaults to 'cell' which shuffles tablets in the local cell.
266267
--vtgate-config-terse-errors prevent bind vars from escaping in returned errors
267268
--warming-reads-concurrency int Number of concurrent warming reads allowed (default 500)
268269
--warming-reads-percent int Percentage of reads on the primary to forward to replicas. Useful for keeping buffer pools warm
