fix(awssd): use namespace-aware parsing for dotted service names#6365
Conversation
|
Welcome @am-ltk! |
|
Hi @am-ltk. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Regular contributors should join the org to skip this step. Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
cbb4482 to
96098fa
Compare
96098fa to
1716ee6
Compare
parseHostname splits at the first dot, which breaks when Cloud Map service names contain dots (e.g. my-app.elb in namespace dev.local). The naive split produces namespace=elb.dev.local instead of namespace=dev.local, causing the record to be silently skipped. Change parseHostname to accept known namespaces and use longest-suffix matching, falling back to the original first-dot split when no namespace matches for backward compatibility. Signed-off-by: Andrew Moes <andrew@moes.dev> Made-with: Cursor
1716ee6 to
30e3a19
Compare
|
/easycla |
ivankatliarchuk
left a comment
There was a problem hiding this comment.
Not sure about the fix. Looks like a patch work, and method parseHostname now returns multiple values, which could be a sing of design that was made when the provider was implemented.
each call site only uses one of the two return values, and the work is duplicated...
┌──────────────────────┬────────────┐
│ Call site │ Uses │
├──────────────────────┼────────────┤
│ changesByNamespaceID │ nsName, _ │
├──────────────────────┼────────────┤
│ submitCreates │ _, srvName │
├──────────────────────┼────────────┤
│ submitDeletes │ _, srvName │
└──────────────────────┴────────────┘
And submitCreates/submitDeletes re-parse hostnames that changesByNamespaceID already parsed to group them - so the namespace matching runs twice per endpoint.
The optimization: since submitCreates/submitDeletes iterate over changesByNamespaceID's result, they already know the nsID. They can look up the namespace name from it and derive the service name with a plain TrimSuffix - no re-parsing needed:
Something like
// build once, before the loop
nsIDToName := make(map[string]string, len(namespaces))
for _, ns := range namespaces {
nsIDToName[aws.ToString(ns.Id)] = aws.ToString(ns.Name)
}
for nsID, changeList := range changesByNamespaceID {
nsName := nsIDToName[nsID]
for _, ch := range changeList {
hostname := strings.TrimSuffix(ch.DNSName, ".")
srvName := strings.TrimSuffix(hostname, "."+nsName)
...
}
}
This eliminates two of the three parseHostname call sites - only changesByNamespaceID keeps it. At that point parseHostname only ever returns the namespace name, so it could be renamed to parseNamespace and simplified back to return a single string.
I played with this PR
|
/ok-to-test |
Coverage Report for CI Build 24352432864Coverage increased (+0.02%) to 80.544%Details
Uncovered ChangesNo uncovered changes found. Coverage Regressions32 previously-covered lines in 1 file lost coverage.
Coverage Stats
💛 - Coveralls |
…dundant calls - Rename parseHostname to parseNamespace returning only the namespace name (single string) since that is the only value changesByNamespaceID needs. - Extract namespaceIDToName helper to build namespace ID-to-name map. - Refactor submitCreates/submitDeletes to derive service names via TrimSuffix using the known namespace name from the map, eliminating two redundant parseHostname calls per endpoint. - Add edge-case tests: trailing dot, hostname-equals-namespace, and single-label hostname. Signed-off-by: Andrew Moes <andrew@moes.dev> Made-with: Cursor
ivankatliarchuk
left a comment
There was a problem hiding this comment.
Could you share similar results for this PR #5085 (comment). Need to make sure it works before we merge
- Remove the AWSSDProvider receiver from parseNamespace since it has no dependency on the provider struct. - Reorder file so all receiver methods precede package-level helpers. - Update call site and test accordingly. Signed-off-by: Andrew Moes <andrew@moes.dev> Made-with: Cursor
external-dns arguments Cloud Map private namespace Create a Service with a dotted hostname annotationkubectl apply -f - <<'EOF'
apiVersion: v1
kind: Namespace
metadata:
name: external-dns-pr-test
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: dummy-app
namespace: external-dns-pr-test
spec:
replicas: 1
selector:
matchLabels: { app: dummy-app }
template:
metadata:
labels: { app: dummy-app }
spec:
containers:
- name: busybox
image: busybox
command: ["sh", "-c", "while true; do sleep 3600; done"]
ports: [{ containerPort: 80 }]
---
apiVersion: v1
kind: Service
metadata:
name: dummy-svc
namespace: external-dns-pr-test
annotations:
external-dns.alpha.kubernetes.io/hostname: "pr-test.dotfix.dev.local"
spec:
type: ClusterIP
clusterIP: None
selector: { app: dummy-app }
ports: [{ port: 80 }]
EOFexternal-dns logsService created with the dotted name Cloud Map — service createdaws servicediscovery list-services \
--filters Name=NAMESPACE_ID,Values=<dev.local-ns-id> \
--query 'Services[?Name==`pr-test.dotfix`].{Name:Name,RoutingPolicy:DnsConfig.RoutingPolicy,RecordType:DnsConfig.DnsRecords[0].Type,TTL:DnsConfig.DnsRecords[0].TTL}'[{ "Name": "pr-test.dotfix", "RoutingPolicy": "MULTIVALUE", "RecordType": "A", "TTL": 300 }]Cloud Map — instance registeredaws servicediscovery discover-instances \
--namespace-name dev.local --service-name pr-test.dotfix{
"Instances": [
{
"InstanceId": "10.x.x.x",
"NamespaceName": "dev.local",
"ServiceName": "pr-test.dotfix",
"HealthStatus": "UNKNOWN",
"Attributes": { "AWS_INSTANCE_IPV4": "10.x.x.x" }
}
]
} |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ivankatliarchuk The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Callers already trim trailing dots before invoking parseNamespace, but the function itself silently returns the wrong namespace when given a FQDN with a trailing dot (strings.HasSuffix misses the match). Add a defensive TrimSuffix at the top so parseNamespace is self-contained, and update the test to assert correct behavior. Made-with: Cursor
|
3 weeks since approval /lgtm |
What does it do ?
Changes
parseHostnamein the aws-sd provider to accept known Cloud Map namespaces and use longest-suffix matching instead of naive first-dot splitting. This correctly handles service names containing dots (e.g.my-app.elbin namespacedev.local).Falls back to the original first-dot split when no namespace suffix matches, preserving full backward compatibility for all existing configurations.
Motivation
parseHostnamesplits hostnames at the first dot with no awareness of actual Cloud Map namespace boundaries. Formy-app.elb.dev.localwith namespacedev.local, it produces:my-appmy-app.elbelb.dev.localdev.localmatchingNamespacesthen does an exact-match lookup and finds no namespace namedelb.dev.local, so the record is silently dropped with:This affects any hostname where the Cloud Map service name contains a dot, including SRV records (#5714), where hostnames like
_backend._tcp.backend.mynet.svc.internalare parsed with namespace_tcp.backend.mynet.svc.internalinstead ofmynet.svc.internal.No
--domain-filterworkaround is possible because the issue is structural inparseHostname.Fixes #6364
Ref #5714
More
No documentation update is needed — this is an internal bug fix to hostname parsing logic with no user-facing configuration changes.