Description:
We are using KubeRay to create Ray clusters and expose the built-in dashboard on port 8265 through an HTTPRoute, and port 8266 through a GRPCRoute.
The KubeRay operator creates the Kubernetes Services bound to those ports from the RayCluster CRD. We create the routes ourselves and attach them to a Gateway resource dedicated to Ray clusters, with merged gateways enabled.
When I dump the addresses that the Gateway pod sees, the Ray service has this destination:
"destination": {
  "name": "httproute/ray-clusters/simple-dashboard/rule/0",
  "settings": [
    {
      "weight": 1,
      "protocol": "HTTP",
      "endpoints": [{ "host": "None", "port": 8265 }],
      "ipFamily": "IPv4"
    }
  ]
},
The "None" host fails validation of the xDS IR, and as a result all further xDS updates are skipped.
We are not manually creating any Backend resources, or any other resources beyond the routes and Gateway described above.
It would be nice if this were logged and a metric emitted indicating that some resources are not being updated, rather than failing the whole xDS send. As it stands, it takes a lot of introspection to discover that xDS updates have stopped going out.
Environment:
Using Envoy Gateway 1.3.2 with Envoy 1.33.2
Logs:
2025-04-10T10:28:47.291Z INFO infrastructure runner/runner.go:92 received an update {"runner": "infrastructure"}
2025-04-10T10:28:47.291Z ERROR gateway-api runner/runner.go:190 unable to validate xds ir, skipped sending it {"runner": "gateway-api", "error": "field Address must be a valid IP or FQDN address\nfield Address must be a valid IP or FQDN address"}
2025-04-10T10:28:47.292Z ERROR watchable message/watchutil.go:80 observed an error {"runner": "gateway-api", "error": "field Address must be a valid IP or FQDN address\nfield Address must be a valid IP or FQDN address"}