Add locality support to the xds control plane. #1

vadimberezniker wants to merge 1 commit into `master`
Conversation
Kubernetes services may be annotated with `xds.lmwn.com/locality-preference` to indicate how traffic to that service should be load balanced, with the following options:

- `zone`: clients are matched to backends using only the zone (`topology.kubernetes.io/zone`) label from the parent node. Zone-local endpoints are returned with a higher priority. If no zone-local endpoints are available, the client falls back to endpoints in other zones.
- `sub_zone`: clients are matched to backends using both sub_zone (`topology.kubernetes.io/rack` by default) and zone (`topology.kubernetes.io/zone`). Endpoints are returned with three priority levels: rack-local endpoints with the highest priority, zone-local endpoints with the next highest priority, and then all other endpoints.

Services that are not annotated load balance across all backends.

endpoints.go is the service that processes incoming k8s endpoint information and translates it into xDS resources for use by clients. Prior to this change, it would do a simple conversion and hand it off to the go-control-plane code to do the rest. With the introduction of locality, different clients may receive different assignments.

To that end, endpointsCache (in endpointscache.go) becomes the new "entry point" for clients subscribing to endpoint information. endpoints.go groups endpoints by locality (depending on how the service is configured) and stores the data in the endpointsCache. endpointsCache then takes the stored data and generates the endpoint assignments tailored to the locality information requested by the client.

nodeLocalityStore (nodes.go) subscribes to node information from k8s and stores zone and rack information for each node. If node locality metadata changes, we re-process the endpoints via endpoints.go in case the endpoint locality has been affected.
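The `sub_zone` priority scheme described above can be sketched as follows. This is an illustrative stand-alone example, not the PR's actual code; the `Locality` type and `priority` function are hypothetical stand-ins for the real xDS types.

```go
package main

import "fmt"

// Locality is a simplified stand-in for the xDS locality message.
type Locality struct{ Zone, SubZone string }

// priority implements the three-level scheme from the description:
// 0 = rack-local (zone and sub_zone match), 1 = zone-local, 2 = everything else.
func priority(client, backend Locality) int {
	switch {
	case client.Zone == backend.Zone && client.SubZone == backend.SubZone:
		return 0
	case client.Zone == backend.Zone:
		return 1
	default:
		return 2
	}
}

func main() {
	client := Locality{Zone: "us-east-1a", SubZone: "rack-7"}
	for _, b := range []Locality{
		{"us-east-1a", "rack-7"},
		{"us-east-1a", "rack-9"},
		{"us-east-1b", "rack-7"},
	} {
		fmt.Printf("%v -> priority %d\n", b, priority(client, b))
	}
}
```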
```go
func nodeLocalityFromXds(node *corev3.Node) (zone, subZone string) {
	if node != nil && node.GetLocality() != nil {
```
nit: since `node` is a proto, `node.GetLocality()` returns nil if `node` is nil. Also, `node.GetLocality().GetZone()` returns `""` if `node.GetLocality()` is nil.
So the whole function can be:

```go
return node.GetLocality().GetZone(), node.GetLocality().GetSubZone()
```
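For illustration, here is a minimal stdlib-only sketch of the nil-receiver getter pattern that generated Go protobuf code uses; the `Node` and `Locality` types below are hand-written stand-ins, not the real envoy protos.

```go
package main

import "fmt"

type Locality struct{ Zone, SubZone string }
type Node struct{ Locality *Locality }

// Generated proto getters are safe to call on nil receivers.
func (n *Node) GetLocality() *Locality {
	if n == nil {
		return nil
	}
	return n.Locality
}

func (l *Locality) GetZone() string {
	if l == nil {
		return ""
	}
	return l.Zone
}

func (l *Locality) GetSubZone() string {
	if l == nil {
		return ""
	}
	return l.SubZone
}

func nodeLocalityFromXds(node *Node) (zone, subZone string) {
	// No nil checks needed: every getter in the chain tolerates nil.
	return node.GetLocality().GetZone(), node.GetLocality().GetSubZone()
}

func main() {
	z, s := nodeLocalityFromXds(nil)
	fmt.Printf("zone=%q subZone=%q\n", z, s) // both empty, no panic
	z, s = nodeLocalityFromXds(&Node{Locality: &Locality{Zone: "us-east-1a", SubZone: "rack-7"}})
	fmt.Println(z, s)
}
```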
```go
func splitLocalityKey(k string) (zone, subZone string) {
	if idx := strings.Index(k, localityKeySep); idx >= 0 {
```
nit: this can be

```go
before, after, _ := strings.Cut(k, localityKeySep)
return before, after
```
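As a quick, self-contained illustration of the suggested `strings.Cut` form (assuming the same `"|"` separator the PR uses): `Cut` returns the text before and after the first occurrence of the separator, and when the separator is absent, "before" is the whole string and "after" is empty, which matches the zone-only fallback.

```go
package main

import (
	"fmt"
	"strings"
)

const localityKeySep = "|"

// splitLocalityKey rewritten with strings.Cut, as the review suggests.
func splitLocalityKey(k string) (zone, subZone string) {
	before, after, _ := strings.Cut(k, localityKeySep)
	return before, after
}

func main() {
	z, s := splitLocalityKey("us-east-1a|rack-7")
	fmt.Printf("%q %q\n", z, s)
	z, s = splitLocalityKey("us-east-1a") // no separator: sub_zone is empty
	fmt.Printf("%q %q\n", z, s)
}
```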
```go
// score -> groups at that score
bySort := map[int][]*endpointv3.LocalityLbEndpoints{}
scores := make([]int, 0, 3)
```
nit: `bySort` can be `make([][]*endpointv3.LocalityLbEndpoints, 3)`, or even `[3][]*endpointv3.LocalityLbEndpoints`.
Then you don't need `scores`, because you can iterate over `bySort` in reverse and not increment the priority when a list is empty. If you reverse the meaning of the scores, you don't need to iterate in reverse either.
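A sketch of the suggested fixed-size-array variant (with the score meaning reversed so 0 is best and iteration runs forward). The `group` type and `assignPriorities` function are illustrative stand-ins, not the PR's code.

```go
package main

import "fmt"

// group is a stand-in for *endpointv3.LocalityLbEndpoints.
type group struct {
	name     string
	priority int
}

// assignPriorities uses a fixed-size array indexed by score (0 = best)
// instead of a map plus a sorted key slice. The priority counter only
// advances when a score bucket is non-empty, so priorities stay contiguous.
func assignPriorities(byScore [3][]*group) []*group {
	var out []*group
	priority := 0
	for _, bucket := range byScore {
		if len(bucket) == 0 {
			continue // don't burn a priority level on an empty bucket
		}
		for _, g := range bucket {
			g.priority = priority
			out = append(out, g)
		}
		priority++
	}
	return out
}

func main() {
	var byScore [3][]*group
	byScore[0] = []*group{{name: "rack-local"}}
	// score 1 (zone-local) is empty in this example
	byScore[2] = []*group{{name: "other"}}
	for _, g := range assignPriorities(byScore) {
		fmt.Println(g.name, g.priority)
	}
}
```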
```go
	endpointv3 "github.com/envoyproxy/go-control-plane/envoy/config/endpoint/v3"
)

const localityKeySep = "|"
```
document why this is safe / can't appear in the values.
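For context on why `"|"` is likely safe here: Kubernetes label values are restricted to alphanumerics plus `-`, `_`, and `.` (63 characters max), so a zone or rack label value can never contain the separator. A quick stand-alone check against that pattern (the `validLabelValue` helper is illustrative):

```go
package main

import (
	"fmt"
	"regexp"
)

// Pattern for valid Kubernetes label values (the empty value is also allowed).
var labelValueRE = regexp.MustCompile(`^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$`)

func validLabelValue(v string) bool {
	return len(v) <= 63 && labelValueRE.MatchString(v)
}

func main() {
	fmt.Println(validLabelValue("us-east-1a"))     // true
	fmt.Println(validLabelValue("zone|with-pipe")) // false: "|" is not allowed
}
```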
```go
	keys = append(keys, "")
}
for _, k := range keys {
	c.setSnapshotForKey(ctx, k, version, resourcesByType)
```
It seems that there is a race here if `setResources` and `ensureSnapshotForNode` are running concurrently, but I'm not sure if it's important:

1. `ensureSnapshotForNode` reads version 1
2. `setResources` is called with version 2
3. `setResources` calls `setSnapshotForKey` with version 2
4. `ensureSnapshotForNode` calls `setSnapshotForKey` with version 1
5. The watch gives version 1 to the client.
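One way the stale write could be ruled out (a sketch under assumed semantics, not the PR's code): have the key-level setter reject versions older than the one already stored, under a mutex. The `versionedCache` type here is hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// versionedCache sketches a newest-version-wins guard: a snapshot write
// is dropped if a newer version has already been stored for that key.
type versionedCache struct {
	mu       sync.Mutex
	versions map[string]int
	data     map[string]string
}

func newVersionedCache() *versionedCache {
	return &versionedCache{versions: map[string]int{}, data: map[string]string{}}
}

// setSnapshotForKey reports whether the write was applied.
func (c *versionedCache) setSnapshotForKey(key string, version int, snapshot string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if version < c.versions[key] {
		return false // stale write from a concurrent caller; ignore it
	}
	c.versions[key] = version
	c.data[key] = snapshot
	return true
}

func main() {
	c := newVersionedCache()
	c.setSnapshotForKey("zone-a", 2, "v2-resources") // the newer writer lands first
	ok := c.setSnapshotForKey("zone-a", 1, "v1-resources")
	fmt.Println(ok, c.data["zone-a"]) // false v2-resources
}
```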
```go
out = append(out, cla)

sortedAddresses := subset.Addresses
sort.SliceStable(sortedAddresses, func(i, j int) bool {
```
nit: might as well use slices.SortStable if you're changing this code.
Also, not sure if this was a problem before, but both of these sort the slice in place. They don't make a copy.
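A sketch covering both points: `slices.SortStableFunc` from the Go 1.21+ stdlib, plus `slices.Clone` so the caller's slice is not mutated. The `Address` type and `sortedCopy` helper are illustrative stand-ins.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

type Address struct{ IP string }

// sortedCopy sorts a clone, leaving the caller's slice untouched.
func sortedCopy(addrs []Address) []Address {
	out := slices.Clone(addrs)
	slices.SortStableFunc(out, func(a, b Address) int {
		return strings.Compare(a.IP, b.IP)
	})
	return out
}

func main() {
	orig := []Address{{"10.0.0.3"}, {"10.0.0.1"}, {"10.0.0.2"}}
	sorted := sortedCopy(orig)
	fmt.Println(orig[0].IP, sorted[0].IP) // 10.0.0.3 10.0.0.1
}
```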
```go
	return true
}

func nodeMapsEqual(a, b map[string]nodeLocality) bool {
```
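If `nodeLocality` is a comparable struct, the stdlib `maps.Equal` (Go 1.21+) can likely replace a hand-rolled comparison. A self-contained sketch with a stand-in type:

```go
package main

import (
	"fmt"
	"maps"
)

// nodeLocality is a stand-in for the PR's type; being comparable,
// it satisfies maps.Equal's constraint on map values.
type nodeLocality struct{ Zone, SubZone string }

func nodeMapsEqual(a, b map[string]nodeLocality) bool {
	return maps.Equal(a, b)
}

func main() {
	a := map[string]nodeLocality{"node-1": {"us-east-1a", "rack-7"}}
	b := map[string]nodeLocality{"node-1": {"us-east-1a", "rack-7"}}
	fmt.Println(nodeMapsEqual(a, b)) // true
	b["node-2"] = nodeLocality{"us-east-1b", ""}
	fmt.Println(nodeMapsEqual(a, b)) // false
}
```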
```go
	}
	groups[key] = g
}
g.LbEndpoints = append(g.LbEndpoints, &endpointv3.LbEndpoint{
```
Do we want to do this even if `!ok` (i.e. when `g` is the zero value)?
Maybe there should be tests for changing the `xds.lmwn.com/locality-preference` value.