Skip to content

feat(auth): enrich invoke-level telemetry with transport detail#14264

Open
westarle wants to merge 4 commits intogoogleapis:mainfrom
westarle:westarle/tracing-auth-enrichment
Open

feat(auth): enrich invoke-level telemetry with transport detail#14264
westarle wants to merge 4 commits intogoogleapis:mainfrom
westarle:westarle/tracing-auth-enrichment

Conversation

@westarle
Copy link
Copy Markdown
Contributor

@westarle westarle commented Mar 26, 2026

This PR has auth code fill in TelemetryTransportData in order to send details from the transport back up to higher layers. See googleapis/gax-go#496. This in turn allows code in those higher layers to report telemetry information for the last request that is only known by the transport layer.

Changes Made

  • Logical Hostname Extraction: Updated extraction logic to parse logical hostnames and ports directly from req.URL (HTTP) and cc.Target() (gRPC) prior to invoking the transport
    layer.
  • Implicit Port Fallback: Added scheme-based port fallbacks (80 for HTTP, 443 for HTTPS) when handling URLs with implicit ports.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enhances OpenTelemetry integration by adding metrics support for both gRPC and HTTP transports. For gRPC, a new unary client interceptor is introduced to extract server address and port from the target and populate gax.TransportTelemetryData, utilizing a new extractHostPort helper function. For HTTP, the otelAttributeTransport is updated to similarly extract server address, port, and response status code. New unit tests cover these telemetry functionalities and the extractHostPort helper. Review feedback indicates an improvement opportunity in extractHostPort's error handling when the port is invalid, suggesting it should return the parsed host instead of the full target string, along with a corresponding test case update.

@westarle westarle force-pushed the westarle/tracing-auth-enrichment branch 4 times, most recently from cf67087 to 643756c Compare March 27, 2026 03:05
@westarle westarle marked this pull request as ready for review March 27, 2026 04:17
@westarle westarle requested review from a team as code owners March 27, 2026 04:17
@westarle westarle force-pushed the westarle/tracing-auth-enrichment branch from 099729a to 03340f3 Compare March 29, 2026 19:09
@westarle westarle requested review from a team as code owners March 29, 2026 19:09
@westarle westarle force-pushed the westarle/tracing-auth-enrichment branch from 03340f3 to ce631e8 Compare March 29, 2026 19:13
- Avoid expensive net.SplitHostPort per-RPC in gRPC interceptor
- Extract HTTP host and port directly from req.URL instead of physical connection
- Add scheme-based fallback for implicit HTTP ports
- Add implicit port extraction tests
}
if gax.IsFeatureEnabled("METRICS") {
host, port := extractHostPort(endpoint)
dialOpts = append(dialOpts, grpc.WithChainUnaryInterceptor(openTelemetryUnaryClientInterceptor(host, port)))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@westarle Not supporting Streaming is intended?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for now, yes.

if idx := strings.Index(target, "://"); idx != -1 {
target = target[idx+3:]
// Ensure any leading slashes from the scheme suffix are stripped
for strings.HasPrefix(target, "/") {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this loop be replaced with strings.TrimLeft(target, "/")

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done thank you!

if idx := strings.Index(target, "://"); idx != -1 {
target = target[idx+3:]
// Ensure any leading slashes from the scheme suffix are stripped
target = strings.TrimLeft(target, "/")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use net/url.Parse here to parse and separate scheme, authority, and path more reliably?

If a target is dns://8.8.8.8/lb.example.com:443, this manual parsing might inadvertently extract 8.8.8.8/lb.example.com as the address, leaking internal DNS resolver details into public telemetry. It's kind of a contrived example tbh, but still...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I liked the idea of using net/url.Parse but what I found is that it behaves badly on endpoints that aren't well formatted (like it treats bare "googleapis.com/xyz" as the scheme and explodes on eg. "[::1]:443"

It is a great catch on dns://... ; I fixed it.

}
if urlTemplate, ok := callctx.TelemetryFromContext(req.Context(), "url_template"); ok {
attrs = append(attrs, attribute.String("url.template", urlTemplate))
span.SetName(fmt.Sprintf("%s %s", req.Method, urlTemplate))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

urlTemplate is guaranteed to be low-cardinality here, right? (Just checking...)

Copy link
Copy Markdown
Contributor Author

@westarle westarle Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes template needs to be low-cardinality like "/{name=projects/*/things/*}"

transportData := gax.ExtractTransportTelemetry(ctx)
if transportData != nil {
if host != "" {
transportData.SetServerAddress(host)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The *TransportTelemetryData pointer usage here and in the HTTP equivalent is not completely thread-safe.

Assuming this code is called via gax.Invoke (which scopes context per call using InjectTransportTelemetry), this is safe. However, if a user uses the gRPC or HTTP wrapper directly for requests in parallel while sharing a base context, it could trigger a data race.

Typical practice (outside of gax.Invoke) is often to reuse a base context (e.g., context.Background()) across multiple concurrent requests, thus with this code possibly writing to the shared data.SetServerAddress at virtually the same time. This risks overwriting request-specific values when the hosts differ. Even worse, concurrent writes to a non-thread-safe struct field can trigger the race detector.

While I don't think that we absolutely need to fix this for usage via gax.Invoke, here's the possible solution:

// 1. Pull the pointer
orig := gax.ExtractTransportTelemetry(req.Context())

// 2. Clone it (Shallow copy is fine for primitive fields)
var clone gax.TransportTelemetryData
if orig != nil {
    clone = *orig 
}

// 3. Set host for THIS request only
clone.SetServerAddress(req.URL.Hostname())

// 4. Scope it to THIS request's context
req = req.WithContext(gax.InjectTransportTelemetry(req.Context(), &clone))

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the solution, but it breaks protocol with how invoke works today (today we allocate the struct in Invoke to avoid an allocation in the RPC path.)

I think you're saying that to experience the breakage a customer would have to

  1. grab a raw Clients
  2. inject a TransportTelemetryData into the context used for those clients
  3. make parallel requests on the client

Maybe I can put a comment on TransportTelemetryData to say it's only for requests managed by Invoke?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants