Commit eb0e61c

A113: pick_first: Weighted Random Shuffling
1 parent d926ac4 commit eb0e61c

1 file changed

+76
-0
lines changed

@@ -0,0 +1,76 @@
1+
A113: pick_first: Weighted Random Shuffling
2+
----
3+
* Author(s): Alex Polcyn (@apolcyn)
4+
* Approver: Mark Roth (@markdroth), Eric Anderson (@ejona86), Doug Fawley (@dfawley), Easwar Swaminathan (@easwars)
5+
* Status: Draft
6+
* Implemented in: <language, ...>
7+
* Last updated: Jan 26, 2026
8+
* Discussion at: <google group thread> (filled after thread exists)
9+
10+
## Abstract
11+
12+
Support weighted random shuffling in the `pick_first` LB policy.
13+
14+
## Background
15+
16+
The pick first LB policy currently supports random shuffling. A primary intention of the feature
17+
is load balancing; however, it does not take (possibly present) locality or endpoint weights
18+
into account. This can lead to skewed load distribution and hotspots when the load
19+
balancing control plane delivers varied weights and expects them to be followed.
20+
21+
22+
### Related Proposals:
23+
* [A62](https://github.com/grpc/proposal/blob/master/A62-pick-first.md): pick_first: sticky TRANSIENT_FAILURE and address order randomization
24+
* [A42](https://github.com/grpc/proposal/blob/master/A42-xds-ring-hash-lb-policy.md): xDS Ring Hash LB Policy
25+
26+
## Proposal
27+
28+
### Changes within Pick First
29+
30+
Modify the behavior of `pick_first` when the `shuffle_address_list` option is set, and
31+
perform a weighted random sort *based on per-endpoint weights*:
32+
* Use the [Weighted Random Sampling](https://utopia.duth.gr/~pefraimi/research/data/2007EncOfAlg.pdf) algorithm
33+
proposed by Efraimidis and Spirakis.
34+
* Assign each endpoint a sort key of `u ^ (1 / weight)`, where `u` is a uniform random number in `(0, 1)` and `weight`
34+
is the weight of the endpoint (as present in a weight attribute). Default to 1 if no weight attribute is present. Then order the endpoints by descending key.
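
The shuffle described above can be sketched as follows. This is a minimal illustration, not the gRPC implementation: the `Endpoint` struct and `WeightedShuffle` function are hypothetical names, and treating a weight of 0 as "no weight attribute present" is an assumption made only for this example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>
#include <utility>
#include <vector>

// Hypothetical endpoint type; a real implementation would read the weight
// from an endpoint attribute.
struct Endpoint {
  std::string address;
  uint32_t weight;  // Assumption: 0 means "no weight attribute", treated as 1.
};

// Reorders `endpoints` in place using Efraimidis-Spirakis keys:
// key = u^(1/weight), with endpoints sorted by descending key.
template <typename Rng>
void WeightedShuffle(std::vector<Endpoint>& endpoints, Rng& rng) {
  std::uniform_real_distribution<double> dist(0.0, 1.0);
  std::vector<std::pair<double, std::size_t>> keys;
  keys.reserve(endpoints.size());
  for (std::size_t i = 0; i < endpoints.size(); ++i) {
    // The distribution yields values in [0, 1); redraw on 0 to stay in (0, 1).
    double u = dist(rng);
    while (u <= 0.0) u = dist(rng);
    const double weight = endpoints[i].weight == 0
                              ? 1.0
                              : static_cast<double>(endpoints[i].weight);
    keys.emplace_back(std::pow(u, 1.0 / weight), i);
  }
  // Larger keys come first.
  std::sort(keys.begin(), keys.end(),
            [](const std::pair<double, std::size_t>& a,
               const std::pair<double, std::size_t>& b) {
              return a.first > b.first;
            });
  std::vector<Endpoint> shuffled;
  shuffled.reserve(endpoints.size());
  for (const auto& key : keys) {
    shuffled.push_back(std::move(endpoints[key.second]));
  }
  endpoints = std::move(shuffled);
}
```

With two endpoints of weights 9 and 1, the heavier endpoint should land first in roughly 90% of shuffles. Because the logarithm is monotonic, ordering by `log(u) / weight` gives the same permutation and may behave better numerically when weights are very large.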
36+
37+
### CDS LB Policy changes: Computing Endpoint Weights
38+
39+
In xDS, we have a notion of both locality and endpoint weights. The expectation of the load balancing
40+
control plane is to *first* pick a locality and *second* pick an endpoint within it. The total probability distribution
41+
expressed by per-endpoint weights must reflect this. As such, we need to normalize locality weights within
42+
each priority and endpoint weights within locality; the final weight provided to `pick_first` should be a
43+
product of the two normalized weights (i.e. a logical AND of the two selection events).
44+
45+
The CDS LB policy currently calculates per-endpoint weight attributes. It will continue to do so; however,
46+
we need to fix the mechanics: an endpoint's final weight should be a product of its *normalized* locality
47+
weight and *normalized* endpoint weight, rather than the product of the raw weights. Note: as a side effect, this
48+
will fix per-endpoint weights in Ring Hash LB, which
49+
[currently](https://github.com/grpc/proposal/blob/master/A42-xds-ring-hash-lb-policy.md) multiplies
50+
*raw* locality and endpoint weights.
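
To illustrate why the raw product skews the distribution, the sketch below compares both approaches in floating point for a single priority. The `Locality` struct, both function names, and the example weights are hypothetical and chosen only for illustration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical locality description: one locality weight plus the weights
// of the endpoints it contains.
struct Locality {
  uint32_t weight;
  std::vector<uint32_t> endpoint_weights;
};

// Per-endpoint selection share when raw locality and endpoint weights are
// multiplied directly.
std::vector<double> RawProductShares(const std::vector<Locality>& localities) {
  std::vector<double> shares;
  double total = 0.0;
  for (const Locality& locality : localities) {
    for (uint32_t endpoint_weight : locality.endpoint_weights) {
      shares.push_back(static_cast<double>(locality.weight) * endpoint_weight);
      total += shares.back();
    }
  }
  for (double& share : shares) share /= total;
  return shares;
}

// Per-endpoint selection share when each weight is normalized within its
// group before multiplying.
std::vector<double> NormalizedProductShares(
    const std::vector<Locality>& localities) {
  double locality_sum = 0.0;
  for (const Locality& locality : localities) locality_sum += locality.weight;
  std::vector<double> shares;
  for (const Locality& locality : localities) {
    double endpoint_sum = 0.0;
    for (uint32_t endpoint_weight : locality.endpoint_weights) {
      endpoint_sum += endpoint_weight;
    }
    for (uint32_t endpoint_weight : locality.endpoint_weights) {
      shares.push_back((locality.weight / locality_sum) *
                       (endpoint_weight / endpoint_sum));
    }
  }
  return shares;
}
```

Take locality A (weight 1) with one endpoint of weight 100, and locality B (weight 1) with two endpoints of weight 1 each. The raw product gives A's endpoint about 98% of the selection probability, even though the two localities are weighted equally. The normalized product gives 50% to A's endpoint and 25% to each of B's endpoints.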
51+
52+
We can continue to represent weights as integers if we represent their normalized values in
53+
fixed-point Q31 format. The math is as follows (credit to @ejona86):
54+
55+
1) Normalize a weight within a `weight_sum` as follows: `uint32_t normalized = ((uint64_t)weight << 31) / weight_sum`.
56+
57+
2) Multiply two normalized weights as follows: `uint32_t weight = ((uint64_t)weight1 * weight2) >> 31`.
58+
59+
3) Zero weights should be rounded up to 1.
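
The three steps above can be sketched together as shown here. The function names are hypothetical, and applying the round-up-to-1 rule to the final product (rather than to each intermediate value) is an assumption of this sketch. Callers are assumed to pass a non-zero `weight_sum` that is at least as large as `weight`.

```cpp
#include <cstdint>

// Step 1: normalize `weight` within `weight_sum` into Q31, so that a weight
// equal to the whole sum maps to 2^31.
// Precondition (assumed): 0 < weight_sum and weight <= weight_sum.
inline uint32_t NormalizeQ31(uint32_t weight, uint64_t weight_sum) {
  return static_cast<uint32_t>((static_cast<uint64_t>(weight) << 31) /
                               weight_sum);
}

// Step 2: multiply two Q31 values. Both inputs are at most 2^31, so the
// 64-bit intermediate product is at most 2^62 and cannot overflow.
inline uint32_t MultiplyQ31(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>((static_cast<uint64_t>(a) * b) >> 31);
}

// Steps 1-3 combined: the final per-endpoint weight is the product of the
// normalized locality weight and the normalized endpoint weight, with a
// zero result rounded up to 1 (assumption: rounding applies to the product).
inline uint32_t EndpointWeightQ31(uint32_t locality_weight,
                                  uint64_t locality_weight_sum,
                                  uint32_t endpoint_weight,
                                  uint64_t endpoint_weight_sum) {
  const uint32_t locality =
      NormalizeQ31(locality_weight, locality_weight_sum);
  const uint32_t endpoint =
      NormalizeQ31(endpoint_weight, endpoint_weight_sum);
  const uint32_t product = MultiplyQ31(locality, endpoint);
  return product == 0 ? 1 : product;
}
```

For example, a locality holding half of the locality weight and an endpoint holding half of that locality's endpoint weight yield `EndpointWeightQ31(1, 2, 1, 2) == 1u << 29`, which is one quarter of `2^31`.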
60+
61+
### Temporary environment variable protection
62+
63+
CDS LB policy and Pick First LB policy behavior changes will be guarded by `GRPC_EXPERIMENTAL_PF_WEIGHTED_SHUFFLING`.
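
A guard of this kind might look like the sketch below. Only the environment variable name comes from this proposal; the helper names and the set of accepted values (`true` or `1`, case-insensitive) are assumptions, and each gRPC implementation has its own conventions for parsing experimental flags.

```cpp
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical helper: interprets an environment variable value as a
// boolean. Assumption: "true" (any case) and "1" enable the feature; an
// unset variable or any other value leaves it disabled.
inline bool ParseBooleanEnvValue(const char* value) {
  if (value == nullptr) return false;
  std::string lowered(value);
  for (char& c : lowered) {
    c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  }
  return lowered == "true" || lowered == "1";
}

// Hypothetical helper: returns true when the weighted shuffling behavior
// should be enabled.
inline bool IsPfWeightedShufflingEnabled() {
  return ParseBooleanEnvValue(
      std::getenv("GRPC_EXPERIMENTAL_PF_WEIGHTED_SHUFFLING"));
}
```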
64+
65+
## Rationale
66+
67+
* CDS LB policy changes are needed to generate correct weight distributions, not only for Pick First but
68+
also for Ring Hash.
69+
* The fixed-point Q31 format has predictable bounds on precision and allows us to continue representing
70+
weights as integers. Note that our math assumes the sum of weights within a grouping does not exceed the maximum uint32 value,
71+
which is mandated by the xDS protocol.
72+
73+
## Implementation
74+
75+
TBD
76+
