Commit facc4d3

A113: pick_first: Weighted Random Shuffling
1 parent d926ac4 commit facc4d3

1 file changed, 70 additions and 0 deletions

@@ -0,0 +1,70 @@
A113: pick_first: Weighted Random Shuffling
----
* Author(s): Alex Polcyn (@apolcyn)
* Approver: Mark Roth (@markdroth), Eric Anderson (@ejona86), Doug Fawley (@dfawley), Easwar Swaminathan (@easwars)
* Status: Draft
* Implemented in: <language, ...>
* Last updated: Jan 26, 2026
* Discussion at: <google group thread> (filled after thread exists)

## Abstract

Support weighted random shuffling in the pick first LB policy.

## Background

The pick first LB policy currently supports random shuffling of the address list. A primary purpose of
that feature is load balancing; however, the shuffle does not take locality or endpoint weights into
account, even when they are present. This can lead to skewed load distribution and hotspots when the
load balancing control plane delivers varied weights and expects them to be followed.

### Related Proposals:
* [A62](https://github.com/grpc/proposal/blob/master/A62-pick-first.md): pick_first: sticky TRANSIENT_FAILURE and address order randomization
* [A42](https://github.com/grpc/proposal/blob/master/A42-xds-ring-hash-lb-policy.md): xDS Ring Hash LB Policy

## Proposal

Modify the behavior of `pick_first` when the `shuffle_address_list` option is set so that it
performs a weighted random sort *based on per-endpoint weights*:
* Use the [Weighted Random Sampling](https://utopia.duth.gr/~pefraimi/research/data/2007EncOfAlg.pdf) algorithm
  proposed by Efraimidis and Spirakis.
* Assign each endpoint a sort key of `u ^ (1 / weight)`, where `u` is a uniform random number in `(0, 1)` and `weight`
  is the weight of the endpoint (as present in a weight attribute). Default to a weight of 1 if no weight attribute
  is present. Endpoints are then ordered by descending key.

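The sorting step can be sketched as follows. This is a minimal illustration in Python, not any gRPC implementation: the function name `weighted_shuffle` and the `(address, weight)` tuple shape are hypothetical, and weights are assumed to be positive.

```python
import random


def weighted_shuffle(endpoints, rng=random):
    """Return addresses in weighted random order (Efraimidis-Spirakis keys).

    `endpoints` is a list of (address, weight) pairs; a weight of None stands
    in for a missing weight attribute and is treated as 1. Weights are assumed
    to be positive.
    """
    keyed = []
    for address, weight in endpoints:
        w = 1 if weight is None else weight
        u = rng.random()
        while u == 0.0:  # random() returns [0.0, 1.0); keep u inside (0, 1)
            u = rng.random()
        keyed.append((u ** (1.0 / w), address))
    # Largest key first: heavier endpoints tend to land earlier in the list.
    keyed.sort(key=lambda pair: pair[0], reverse=True)
    return [address for _, address in keyed]
```

With two endpoints weighted 9 and 1, the heavier one should come out first in roughly 90% of shuffles. Sorting by `log(u) / weight` yields the same order, since the logarithm is monotonic, and may behave better numerically for very large weights.
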
Note that in xDS, we have a notion of both locality and endpoint weights. The load balancing control plane
expects a client to *first* pick a locality and *second* pick an endpoint within it. The overall probability
distribution expressed by the per-endpoint weights must reflect this. As such, we need to normalize locality
weights within each priority and endpoint weights within each locality; the final weight provided to
`pick_first` should be the product of the two normalized weights (i.e. a logical AND of the two selection events).

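Written out, with notation that is introduced here for illustration and is not part of the proposal: let $W_\ell$ be the weight of locality $\ell$ within priority $P$, and $w_e$ the weight of endpoint $e$ within locality $\ell$. The intended selection probability of $e$, given that priority $P$ is in use, is then:

$$
\Pr[e] = \Pr[\ell] \cdot \Pr[e \mid \ell] = \frac{W_\ell}{\sum_{\ell' \in P} W_{\ell'}} \cdot \frac{w_e}{\sum_{e' \in \ell} w_{e'}}
$$

Summing this over every endpoint in the priority gives 1, so the per-endpoint values form a probability distribution on their own.
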
The CDS LB policy currently calculates per-endpoint weight attributes. We can continue with this; however,
we need to modify the CDS LB policy to compute the final per-endpoint weight as the product of the *normalized*
locality and endpoint weights rather than the product of the raw weights. Note: as a side effect, this will
fix per-endpoint weights in the Ring Hash LB policy.

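A small worked comparison may help show why the raw product is not enough. The sketch below uses exact fractions and a hypothetical input shape (locality name mapped to its weight and its endpoints' weights) within a single priority; none of these names come from the CDS LB policy itself.

```python
from fractions import Fraction

# Hypothetical input: locality name -> (locality weight, {endpoint: endpoint weight}).
LOCALITIES = {
    "X": (1, {"a": 1, "b": 1}),
    "Y": (1, {"c": 1}),
}


def raw_product_shares(localities):
    """Share of each endpoint when its weight is locality_weight * endpoint_weight."""
    raw = {ep: lw * ew for lw, eps in localities.values() for ep, ew in eps.items()}
    total = sum(raw.values())
    return {ep: Fraction(w, total) for ep, w in raw.items()}


def normalized_product_shares(localities):
    """Share of each endpoint when both weights are normalized before multiplying."""
    locality_sum = sum(lw for lw, _ in localities.values())
    shares = {}
    for lw, eps in localities.values():
        endpoint_sum = sum(eps.values())
        for ep, ew in eps.items():
            shares[ep] = Fraction(lw, locality_sum) * Fraction(ew, endpoint_sum)
    return shares
```

Here the two localities have equal weight, so endpoint `c` should receive half of the load. The raw product gives every endpoint 1/3, while the normalized product gives `a` and `b` 1/4 each and `c` 1/2.
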
We can continue to represent weights as integers if we represent their normalized values in
fixed point Q31 format (citation due for @ejona):

1) Normalize a weight within a `weight_sum` as follows: `uint32_t normalized = ((uint64_t)weight << 31) / weight_sum`
   (the shift multiplies by 2 to the power 31).

2) Multiply two normalized weights as follows: `weight = ((uint64_t) weight1 * weight2) >> 31`

3) Zero weights should be rounded up to 1.

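The three steps can be sketched in Python as below. Python integers are unbounded, so the `uint32_t`/`uint64_t` widths are only checked, not enforced; the function names are hypothetical, and applying the round-up to the final product only is an assumption, since the text does not say at which step it happens.

```python
Q31_SHIFT = 31
UINT32_MAX = (1 << 32) - 1


def q31_normalize(weight, weight_sum):
    """Step 1: weight / weight_sum as a Q31 value in [0, 2**31]."""
    assert weight_sum > 0 and 0 <= weight <= weight_sum <= UINT32_MAX
    return (weight << Q31_SHIFT) // weight_sum


def q31_multiply(a, b):
    """Step 2: product of two Q31 values, itself in Q31."""
    return (a * b) >> Q31_SHIFT


def final_endpoint_weight(locality_weight, locality_sum, endpoint_weight, endpoint_sum):
    """Combine steps 1 to 3 for a single endpoint."""
    product = q31_multiply(
        q31_normalize(locality_weight, locality_sum),
        q31_normalize(endpoint_weight, endpoint_sum),
    )
    # Step 3, applied here to the final result only (an assumption).
    return max(product, 1)
```

For example, a locality holding 1/4 of the locality weight and an endpoint holding 3/4 of its locality's endpoint weight yield `3 << 27`, which is 3/16 in Q31.
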
### Temporary environment variable protection

The CDS LB policy and Pick First LB policy behavior changes will be guarded by the
`GRPC_EXPERIMENTAL_PF_WEIGHTED_SHUFFLING` environment variable.

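A guard of this kind might look like the sketch below. The proposal does not state which values enable the feature; treating only `true` (case-insensitive) as enabled is an assumption made here for illustration, and the helper name is hypothetical.

```python
import os

ENV_VAR = "GRPC_EXPERIMENTAL_PF_WEIGHTED_SHUFFLING"


def weighted_shuffling_enabled(environ=None):
    """Return True when the experiment is switched on.

    Assumption: only the value "true" (case-insensitive) enables it.
    """
    env = os.environ if environ is None else environ
    return env.get(ENV_VAR, "").strip().lower() == "true"
```

Both the CDS weight computation and the `pick_first` shuffle would consult the same check, so the two changes are enabled or disabled together.
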
## Rationale

* CDS LB policy changes are needed to generate correct weight distributions, not only for Pick First but
  also for Ring Hash.
* Using fixed point Q31 format gives predictable bounds on precision and allows us to continue representing
  weights as integers. Note that our math assumes the sum of weights within a grouping does not exceed max uint32,
  which is mandated by the xDS protocol.

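One way to spell out these bounds, as a sketch derived from the arithmetic steps in the Proposal section rather than from any stated analysis: let $S \le 2^{32} - 1$ be the weight sum of a grouping, $w \le S$ a weight in it, and $n = \lfloor w \cdot 2^{31} / S \rfloor$ its normalized value. Then:

$$
w \cdot 2^{31} \le (2^{32} - 1) \cdot 2^{31} < 2^{63}, \qquad 0 \le n \le 2^{31}, \qquad n_1 n_2 \le 2^{62}
$$

so every intermediate value fits in a `uint64_t` and every result fits in a `uint32_t`. Truncation loses less than $2^{-31}$ per step, which bounds the error of the final product (before zero results are rounded up to 1):

$$
0 \le \frac{w}{S} - \frac{n}{2^{31}} < 2^{-31}, \qquad 0 \le \frac{w_1}{S_1} \cdot \frac{w_2}{S_2} - \frac{\lfloor n_1 n_2 / 2^{31} \rfloor}{2^{31}} < 3 \cdot 2^{-31}
$$
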
## Implementation

TBD
