Commit 62f5028

A113: pick_first: Weighted Random Shuffling
1 parent d926ac4 commit 62f5028

1 file changed

+80
-0
lines changed

@@ -0,0 +1,80 @@
1+
A113: pick_first: Weighted Random Shuffling
2+
----
3+
* Author(s): Alex Polcyn (@apolcyn)
4+
* Approver: Mark Roth (@markdroth), Eric Anderson (@ejona86), Doug Fawley (@dfawley), Easwar Swaminathan (@easwars)
5+
* Status: Draft
6+
* Implemented in: <language, ...>
7+
* Last updated: Jan 26, 2026
8+
* Discussion at: <google group thread> (filled after thread exists)
9+
10+
## Abstract
11+
12+
Support weighted random shuffling in the pick first LB policy.
13+
14+
## Background
15+
16+
The pick first LB policy currently supports random shuffling. A primary intention of the feature
17+
is load balancing; however, it does not take (possibly present) locality or endpoint weights
18+
into account. This can lead to skewed load distribution and hotspots when the load
19+
balancing control plane delivers varied weights and expects them to be followed.
20+
21+
22+
### Related Proposals:
23+
* [A62](https://github.com/grpc/proposal/blob/master/A62-pick-first.md): pick_first: sticky TRANSIENT_FAILURE and address order randomization
24+
* [A42](https://github.com/grpc/proposal/blob/master/A42-xds-ring-hash-lb-policy.md): xDS Ring Hash LB Policy
25+
26+
## Proposal
27+
28+
### Changes within Pick First
29+
30+
Modify behavior of pick_first when the `shuffle_address_list` option is set, and
31+
perform a weighted random sort *based on per-endpoint weights*:
32+
* Use the [Weighted Random Sampling](https://utopia.duth.gr/~pefraimi/research/data/2007EncOfAlg.pdf) algorithm
33+
proposed by Efraimidis and Spirakis.
34+
* Assign each endpoint a sort key of `u ^ (1 / weight)`, where `u` is a uniform random number in `(0, 1)` and `weight`
35+
is the weight of the endpoint (as present in a weight attribute), defaulting to 1 if no weight attribute is present. Order endpoints by descending key.
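
The key-based ordering described in the list above can be sketched as follows. This is a minimal illustration, not the gRPC implementation: the `Endpoint` struct, the `WeightedShuffle` name, and the convention that a weight of 0 stands for "no weight attribute" are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <random>
#include <utility>
#include <vector>

// Hypothetical endpoint record. A weight of 0 is used here (as an assumption)
// to mean "no weight attribute present".
struct Endpoint {
  int id;
  uint32_t weight;
};

// Returns the endpoints ordered by descending key u^(1/weight), where u is
// drawn uniformly from (0, 1) independently for each endpoint.
template <typename Rng>
std::vector<Endpoint> WeightedShuffle(const std::vector<Endpoint>& endpoints,
                                      Rng& rng) {
  std::uniform_real_distribution<double> uniform(0.0, 1.0);
  std::vector<std::pair<double, Endpoint>> keyed;
  keyed.reserve(endpoints.size());
  for (const Endpoint& ep : endpoints) {
    // Default to a weight of 1 when no weight is present.
    const double w = ep.weight == 0 ? 1.0 : static_cast<double>(ep.weight);
    double u;
    do {
      u = uniform(rng);  // the distribution yields [0, 1); reject 0
    } while (u <= 0.0);
    keyed.emplace_back(std::pow(u, 1.0 / w), ep);
  }
  // Larger keys come first, so heavier endpoints tend to sort earlier.
  std::sort(keyed.begin(), keyed.end(),
            [](const auto& a, const auto& b) { return a.first > b.first; });
  std::vector<Endpoint> result;
  result.reserve(keyed.size());
  for (const auto& entry : keyed) result.push_back(entry.second);
  return result;
}
```

With two endpoints of weights 1 and 99, the heavier endpoint should land first in roughly 99% of shuffles. Because the logarithm is monotonic, sorting by `log(u) / weight` yields the same order and may behave better numerically when weights are very large.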
36+
37+
### CDS LB Policy changes: Computing Endpoint Weights
38+
39+
In xDS, we have a notion of both locality and endpoint weights. The expectation of the load balancing
40+
control plane is to *first* pick locality and *second* pick endpoint. The total probability distribution
41+
reflected by per-endpoint weights must reflect this. As such, we need to normalize locality weights within
42+
each priority and endpoint weights within each locality; the final weight provided to `pick_first` should be a
43+
product of the two normalized weights (i.e. a logical AND of the two selection events).
44+
45+
The CDS LB policy currently calculates per-endpoint weight attributes. It will continue to do so; however,
46+
we need to fix the mechanics: an endpoint's final weight should be a product of its *normalized* locality
47+
weight and *normalized* endpoint weight, rather than the product of the raw weights. Note: as a side effect, this
48+
will fix per-endpoint weights in Ring Hash LB, which
49+
are [currently](https://github.com/grpc/proposal/blob/master/A42-xds-ring-hash-lb-policy.md) the product of
50+
*raw* locality and endpoint weights.
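
To see why normalization matters, consider a purely hypothetical configuration (the numbers are illustrative only): one priority with two localities of equal locality weight 50. Locality A has a single endpoint $a$ with endpoint weight 1; locality B has two endpoints $b_1, b_2$ with endpoint weight 100 each. Multiplying normalized weights gives

$$
w(a) = \frac{50}{100} \cdot \frac{1}{1} = 0.5, \qquad
w(b_1) = w(b_2) = \frac{50}{100} \cdot \frac{100}{200} = 0.25,
$$

which matches "pick the locality first, then the endpoint". Multiplying raw weights instead gives

$$
w_{\text{raw}}(a) = 50 \cdot 1 = 50, \qquad
w_{\text{raw}}(b_1) = w_{\text{raw}}(b_2) = 50 \cdot 100 = 5000,
$$

so endpoint $a$ would receive only $50 / 10050 \approx 0.5\%$ of the selection probability rather than the intended $50\%$.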
51+
52+
We can continue to represent weights as integers if we represent their normalized values in
53+
fixed-point Q31 format. The math is as follows (citation due for @ejona):
54+
55+
```
56+
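// To normalize (weight and weight_sum are raw uint32_t values, weight_sum > 0):
57+
const uint32_t ONE = UINT32_C(1) << 31;  // 1.0 in Q31; unsigned to avoid signed overflow
58+
weight = (uint32_t) ((uint64_t) weight * ONE / weight_sum);
59+
60+
// To multiply the normalized locality and endpoint weights for an endpoint:
61+
weight = (uint32_t) (((uint64_t) locality_weight * weight) >> 31);
62+
if (weight == 0) weight = 1;  // avoid a zero weight after truncation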
63+
```
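
The two steps above could be wrapped into small helpers, as in the following sketch. The names `kQ31One`, `NormalizeQ31`, `MultiplyQ31`, and `FinalEndpointWeight` are hypothetical and are not existing gRPC APIs; the sketch assumes each weight sum is non-zero and that each weight does not exceed the sum of its grouping.

```cpp
#include <cstdint>

// 1.0 in Q31 fixed point.
constexpr uint32_t kQ31One = UINT32_C(1) << 31;

// Normalizes a raw weight against the sum of weights in its grouping.
// Assumes weight_sum > 0 and weight <= weight_sum, so the result is <= 1.0.
inline uint32_t NormalizeQ31(uint32_t weight, uint32_t weight_sum) {
  return static_cast<uint32_t>(static_cast<uint64_t>(weight) * kQ31One /
                               weight_sum);
}

// Multiplies two Q31 values; a product that truncates to 0 is clamped to 1.
inline uint32_t MultiplyQ31(uint32_t a, uint32_t b) {
  const uint32_t product =
      static_cast<uint32_t>((static_cast<uint64_t>(a) * b) >> 31);
  return product == 0 ? 1 : product;
}

// Final per-endpoint weight: normalized locality weight times normalized
// endpoint weight, both in Q31.
inline uint32_t FinalEndpointWeight(uint32_t locality_weight,
                                    uint32_t locality_weight_sum,
                                    uint32_t endpoint_weight,
                                    uint32_t endpoint_weight_sum) {
  return MultiplyQ31(NormalizeQ31(locality_weight, locality_weight_sum),
                     NormalizeQ31(endpoint_weight, endpoint_weight_sum));
}
```

For instance, a locality holding half of the locality weight whose endpoint holds half of that locality's endpoint weight ends up with a final weight of one quarter, that is, `1 << 29` in Q31.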
64+
65+
### Temporary environment variable protection
66+
67+
CDS LB policy and Pick First LB policy behavior changes will be guarded by the `GRPC_EXPERIMENTAL_PF_WEIGHTED_SHUFFLING` environment variable.
68+
69+
## Rationale
70+
71+
* CDS LB policy changes are needed to generate correct weight distributions, not only for Pick First but
72+
also for Ring Hash.
73+
* Fixed-point Q31 format has predictable bounds on precision and allows us to continue representing
74+
weights as integers. Note that our math assumes the sum of weights within a grouping does not exceed the maximum uint32 value,
75+
which is mandated by the xDS protocol.
76+
77+
## Implementation
78+
79+
TBD
80+
