Add locality support to the xds control plane.#1

Open: vadimberezniker wants to merge 1 commit into master from locality

@vadimberezniker vadimberezniker commented Apr 21, 2026

Kubernetes services may be annotated with xds.lmwn.com/locality-preference to indicate how traffic to that service should be load balanced. The options are:

  • zone: clients are matched to backends using only the zone (topology.kubernetes.io/zone) label from the parent node. Zone-local endpoints are returned with a higher priority. If no zone-local endpoints are available, the client will fall back to endpoints in other zones.
  • sub_zone: clients are matched to backends using both sub_zone (topology.kubernetes.io/rack by default) and zone (topology.kubernetes.io/zone). Endpoints are returned with three priority levels: rack-local endpoints with the highest priority, zone-local endpoints with the next highest priority, and then all other endpoints.

Services that are not annotated load balance across all backends.
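
To make the priority levels concrete, here is a minimal sketch of how a backend's priority could be derived from the client's locality. The names Preference, Locality and priorityFor are hypothetical and are not taken from this change; treating an empty client zone or rack as "no match" is also an assumption.

```go
package main

import "fmt"

// Preference mirrors the values of the xds.lmwn.com/locality-preference
// annotation. The type itself is hypothetical.
type Preference string

const (
	PreferenceNone    Preference = ""
	PreferenceZone    Preference = "zone"
	PreferenceSubZone Preference = "sub_zone"
)

// Locality is a hypothetical stand-in for the zone/rack pair of a node.
type Locality struct {
	Zone    string
	SubZone string
}

// priorityFor returns the priority of a backend as seen from a client,
// where 0 is the most preferred level.
func priorityFor(pref Preference, client, backend Locality) uint32 {
	// Assumption: an unknown (empty) client zone or rack never matches.
	sameZone := client.Zone != "" && client.Zone == backend.Zone
	sameRack := sameZone && client.SubZone != "" && client.SubZone == backend.SubZone

	switch pref {
	case PreferenceZone:
		if sameZone {
			return 0
		}
		return 1
	case PreferenceSubZone:
		switch {
		case sameRack:
			return 0
		case sameZone:
			return 1
		default:
			return 2
		}
	default:
		// Not annotated: every backend shares a single priority.
		return 0
	}
}

func main() {
	client := Locality{Zone: "zone-a", SubZone: "rack-1"}
	backends := []Locality{
		{Zone: "zone-a", SubZone: "rack-1"},
		{Zone: "zone-a", SubZone: "rack-2"},
		{Zone: "zone-b", SubZone: "rack-1"},
	}
	for _, b := range backends {
		fmt.Printf("%+v -> priority %d\n", b, priorityFor(PreferenceSubZone, client, b))
	}
}
```

This sketch ignores one detail: if a level ends up with no endpoints, the remaining levels would likely need to be renumbered so that the priorities stay contiguous.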

endpoints.go is the service that processes incoming k8s endpoint information and translates it into xDS resources for use by clients. Prior to this change, it would do a simple conversion and hand it off to the go-control-plane code to do the rest. With the introduction of locality, different clients may receive different assignments.

To that end, endpointsCache (in endpointscache.go) becomes the new "entry point" for clients subscribing to endpoint information. endpoints.go groups endpoints by locality (depending on how the service is configured) and stores the data in the endpointsCache. endpointsCache then takes the stored data and generates endpoint assignments tailored to the locality information supplied by each client.

nodeLocalityStore (nodes.go) subscribes to node information from k8s and stores zone and rack information for each node. If node locality metadata changes, we re-process the endpoints via endpoints.go in case the endpoint locality has been affected.

@vadimberezniker vadimberezniker requested a review from vanja-p April 21, 2026 23:07
}

func nodeLocalityFromXds(node *corev3.Node) (zone, subZone string) {
if node != nil && node.GetLocality() != nil {

nit: since node is a proto, node.GetLocality() returns nil if node is nil. Also, node.GetLocality().GetZone() returns "" if node.GetLocality() is nil.

So the whole function can be return node.GetLocality().GetZone(), node.GetLocality().GetSubZone()
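
A small sketch of why the chained getters are nil-safe. The Node and Locality types below are simplified stand-ins written for this example, not the generated corev3 types; they only imitate the nil-receiver checks that generated Go protobuf getters perform.

```go
package main

import "fmt"

// Locality is a simplified stand-in for the generated proto message.
type Locality struct {
	Zone    string
	SubZone string
}

// GetZone tolerates a nil receiver, as generated proto getters do.
func (l *Locality) GetZone() string {
	if l == nil {
		return ""
	}
	return l.Zone
}

// GetSubZone tolerates a nil receiver, as generated proto getters do.
func (l *Locality) GetSubZone() string {
	if l == nil {
		return ""
	}
	return l.SubZone
}

// Node is a simplified stand-in for the generated proto message.
type Node struct {
	Locality *Locality
}

// GetLocality returns nil when the receiver is nil.
func (n *Node) GetLocality() *Locality {
	if n == nil {
		return nil
	}
	return n.Locality
}

// nodeLocalityFromXds needs no explicit nil checks because every getter
// in the chain handles a nil receiver.
func nodeLocalityFromXds(node *Node) (zone, subZone string) {
	return node.GetLocality().GetZone(), node.GetLocality().GetSubZone()
}

func main() {
	fmt.Println(nodeLocalityFromXds(nil))
	fmt.Println(nodeLocalityFromXds(&Node{}))
	fmt.Println(nodeLocalityFromXds(&Node{Locality: &Locality{Zone: "zone-a", SubZone: "rack-1"}}))
}
```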

}

func splitLocalityKey(k string) (zone, subZone string) {
if idx := strings.Index(k, localityKeySep); idx >= 0 {

nit: this can be

before, after, _ := strings.Cut(k, localityKeySep)
return before, after
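
Spelled out as a complete function, the suggestion might look like the sketch below. One thing to check against the original: when the separator is absent, strings.Cut returns the whole input as the first value and "" as the second. Whether that matches the existing fallback branch is not visible in this excerpt.

```go
package main

import (
	"fmt"
	"strings"
)

const localityKeySep = "|"

// splitLocalityKey splits "zone|subZone" at the first separator.
// If the separator is missing, the whole key is returned as the zone.
func splitLocalityKey(k string) (zone, subZone string) {
	zone, subZone, _ = strings.Cut(k, localityKeySep)
	return zone, subZone
}

func main() {
	for _, k := range []string{"zone-a|rack-1", "zone-a|", "zone-a", ""} {
		zone, subZone := splitLocalityKey(k)
		fmt.Printf("%q -> zone=%q subZone=%q\n", k, zone, subZone)
	}
}
```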

}
// score -> groups at that score
bySort := map[int][]*endpointv3.LocalityLbEndpoints{}
scores := make([]int, 0, 3)

nit:

bySort can be make([][]*endpointv3.LocalityLbEndpoints, 3), or even [3][]*endpointv3.LocalityLbEndpoints

Then you don't need scores, because you can iterate over bySort in reverse, and not increment priority when the list is empty. If you reverse the meaning of scores, then you don't need to iterate in reverse either.
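
Roughly what that could look like, with the level meaning already reversed so that index 0 is the best match. The group type is a hypothetical stand-in for *endpointv3.LocalityLbEndpoints, and assignPriorities is a name made up for this sketch.

```go
package main

import "fmt"

// group is a hypothetical stand-in for *endpointv3.LocalityLbEndpoints.
type group struct {
	Name     string
	Priority uint32
}

// numLevels covers rack-local, zone-local and everything else.
const numLevels = 3

// assignPriorities walks the levels from best (index 0) to worst and
// hands out consecutive priorities, skipping empty levels so that the
// resulting priorities have no gaps.
func assignPriorities(byLevel [numLevels][]*group) []*group {
	var out []*group
	var priority uint32
	for _, groups := range byLevel {
		if len(groups) == 0 {
			continue // empty level: do not consume a priority value
		}
		for _, g := range groups {
			g.Priority = priority
			out = append(out, g)
		}
		priority++
	}
	return out
}

func main() {
	var byLevel [numLevels][]*group
	// No rack-local groups in this example, so level 0 stays empty.
	byLevel[1] = []*group{{Name: "zone-a/rack-2"}}
	byLevel[2] = []*group{{Name: "zone-b/rack-1"}, {Name: "zone-c/rack-9"}}

	for _, g := range assignPriorities(byLevel) {
		fmt.Printf("%s -> priority %d\n", g.Name, g.Priority)
	}
}
```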

endpointv3 "github.com/envoyproxy/go-control-plane/envoy/config/endpoint/v3"
)

const localityKeySep = "|"

document why this is safe / can't appear in the values.
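
One possible shape for that documentation, plus a defensive check. Kubernetes label values may only contain alphanumerics, '-', '_' and '.', so a value read from a node label cannot contain "|". Locality strings reported by xDS clients are free-form, though, so the sketch below rejects them explicitly. makeLocalityKey and errSepInValue are hypothetical names, not part of this change.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// localityKeySep separates zone and sub-zone in a locality key.
//
// Values that come from Kubernetes node labels cannot contain '|',
// because label values are limited to alphanumerics, '-', '_' and '.'.
// Values reported by clients carry no such guarantee and are checked
// in makeLocalityKey.
const localityKeySep = "|"

var errSepInValue = errors.New("locality value contains the key separator")

// makeLocalityKey joins zone and sub-zone into a single key and refuses
// inputs that would make the key ambiguous to split again.
func makeLocalityKey(zone, subZone string) (string, error) {
	if strings.Contains(zone, localityKeySep) || strings.Contains(subZone, localityKeySep) {
		return "", errSepInValue
	}
	return zone + localityKeySep + subZone, nil
}

func main() {
	fmt.Println(makeLocalityKey("zone-a", "rack-1"))
	fmt.Println(makeLocalityKey("zone|a", "rack-1"))
}
```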

keys = append(keys, "")
}
for _, k := range keys {
c.setSnapshotForKey(ctx, k, version, resourcesByType)

It seems that there is a race here if setResources and ensureSnapshotForNode are running concurrently, but I'm not sure if it's important:

  1. ensureSnapshotForNode reads version 1
  2. setResources is called with version 2
  3. setResources calls setSnapshotForKey with version 2
  4. ensureSnapshotForNode calls setSnapshotForKey with version 1
  5. The watch gives version 1 to the client.
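
If the race does turn out to matter, one option is to make the write conditional on the version, so that a stale writer in step 4 becomes a no-op. The sketch below is only an illustration: snapshotStore and setIfNewer are hypothetical, and it assumes versions can be compared as monotonically increasing integers, which may not hold for the real version type. Holding one lock across both the version read and the write would be another way to close the gap.

```go
package main

import (
	"fmt"
	"sync"
)

// snapshotStore is a hypothetical stand-in that tracks only the version
// stored per key; the real cache would hold the snapshot as well.
type snapshotStore struct {
	mu       sync.Mutex
	versions map[string]uint64
}

func newSnapshotStore() *snapshotStore {
	return &snapshotStore{versions: make(map[string]uint64)}
}

// setIfNewer stores version for key unless the key already holds the
// same or a newer version. It reports whether the write was applied.
func (s *snapshotStore) setIfNewer(key string, version uint64) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if cur, ok := s.versions[key]; ok && cur >= version {
		return false // stale or duplicate write: keep what is stored
	}
	s.versions[key] = version
	return true
}

func main() {
	s := newSnapshotStore()
	// Replays the interleaving from the list above: version 2 lands
	// first, then the late write of version 1 arrives and is dropped.
	fmt.Println(s.setIfNewer("zone-a|rack-1", 2))
	fmt.Println(s.setIfNewer("zone-a|rack-1", 1))
	fmt.Println(s.versions["zone-a|rack-1"])
}
```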

Comment thread snapshot/endpoints.go
out = append(out, cla)

sortedAddresses := subset.Addresses
sort.SliceStable(sortedAddresses, func(i, j int) bool {

nit: might as well use slices.SortStableFunc if you're changing this code.

Also, not sure if this was a problem before, but both of these sort the slice in place. They don't make a copy.
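
A sketch of sorting a copy instead, using slices.Clone and slices.SortStableFunc from the standard library (Go 1.21+). The address type and the choice of IP as the sort key are assumptions; the original comparison function is not shown in this excerpt.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// address is a hypothetical stand-in for the Kubernetes endpoint address.
type address struct {
	IP string
}

// sortedCopy returns a stably sorted copy and leaves the input untouched.
// Assigning the slice to a new variable would not be enough, because both
// variables would still share the same backing array.
func sortedCopy(addrs []address) []address {
	out := slices.Clone(addrs)
	slices.SortStableFunc(out, func(a, b address) int {
		return cmp.Compare(a.IP, b.IP)
	})
	return out
}

func main() {
	in := []address{{IP: "10.0.0.3"}, {IP: "10.0.0.1"}, {IP: "10.0.0.2"}}
	out := sortedCopy(in)
	fmt.Println("input: ", in)
	fmt.Println("sorted:", out)
}
```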

Comment thread snapshot/nodes.go
return true
}

func nodeMapsEqual(a, b map[string]nodeLocality) bool {

does maps.Equal work here?
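
It should, provided nodeLocality is a comparable type (for example a struct of strings), since maps.Equal requires comparable values. A sketch with assumed field names; if the struct ever gains a slice or map field, maps.EqualFunc would be needed instead.

```go
package main

import (
	"fmt"
	"maps"
)

// nodeLocality is sketched with assumed fields; the real struct is not
// shown in this excerpt.
type nodeLocality struct {
	zone    string
	subZone string
}

// nodeMapsEqual reports whether both maps hold the same keys with equal
// values. A nil map and an empty map compare as equal.
func nodeMapsEqual(a, b map[string]nodeLocality) bool {
	return maps.Equal(a, b)
}

func main() {
	a := map[string]nodeLocality{"node-1": {zone: "zone-a", subZone: "rack-1"}}
	b := map[string]nodeLocality{"node-1": {zone: "zone-a", subZone: "rack-1"}}
	c := map[string]nodeLocality{"node-1": {zone: "zone-a", subZone: "rack-2"}}
	fmt.Println(nodeMapsEqual(a, b), nodeMapsEqual(a, c))
}
```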

Comment thread snapshot/endpoints.go
}
groups[key] = g
}
g.LbEndpoints = append(g.LbEndpoints, &endpointv3.LbEndpoint{

we want to do this even if !ok (g is the zero value)?
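
For context, the visible lines look like a get-or-create pattern. Assuming the map holds pointers (the full diff is not shown here, so this is a guess), g is never the zero value by the time the append runs, and the append is meant to happen on both paths. The group type and addEndpoint are hypothetical names used only for this sketch.

```go
package main

import "fmt"

// group is a hypothetical stand-in for *endpointv3.LocalityLbEndpoints.
type group struct {
	LbEndpoints []string
}

// addEndpoint appends ep to the group stored under key, creating and
// registering the group first if the key has not been seen yet.
func addEndpoint(groups map[string]*group, key, ep string) {
	g, ok := groups[key]
	if !ok {
		g = &group{}
		groups[key] = g
	}
	// g is non-nil on both paths. Because the map stores a pointer, the
	// append is visible through the map without writing g back.
	g.LbEndpoints = append(g.LbEndpoints, ep)
}

func main() {
	groups := map[string]*group{}
	addEndpoint(groups, "zone-a|rack-1", "10.0.0.1")
	addEndpoint(groups, "zone-a|rack-1", "10.0.0.2")
	addEndpoint(groups, "zone-b|rack-1", "10.0.1.1")
	for k, g := range groups {
		fmt.Println(k, g.LbEndpoints)
	}
}
```

If the map stored struct values instead of pointers, the append would have to be followed by writing g back into the map.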

Comment thread test/integration_test.go

maybe there should be tests for changing the xds.lmwn.com/locality-preference value.
