-
Notifications
You must be signed in to change notification settings - Fork 106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
W3C implementation generates a new parentID for each request #308
Comments
I think the rationale for this was an attempt to follow the W3C specification as described here: https://www.w3.org/TR/trace-context/#a-traceparent-is-received However gorouter should be following this part of the spec: https://www.w3.org/TR/trace-context/#alternative-processing Since it is specifically acting as a proxy in this case and not reporting its own processing spans.
This is the approach that should be followed to ensure end-to-end traces are propagated correctly. This is how it currently works in the B3 implementation in the gorouter. |
I have not worked on anything CF related for some time, and have forgotten most of the context If I remember correctly, my logic for this was twofold:
Depending on your use-case, the behaviour of gorouter should/could be different, and I have no strong opinion to changing behaviour |
👋 Hi @tlwr! It's good to see you around again. Thank you for opening this @braunsonm, apologizes for the delayed response. Your explanations about why ParentIDs should not be recreated makes sense to me. But, as always, I am worried about breaking people who rely on previous functionality. Though perhaps it could be considered a bug? @ebroberson has been investigating what it would take to introduce zipkin headers for component logs in CF. ❓ @ebroberson - what is your thought on this? In your ideal situation do you agree that gorouter should not be recreating the ParentID? |
Hi @braunsonm! Thanks for bringing this up! The current implementation is in line with the W3C spec, since "proxies MAY choose not to modify it". To me, this implies that the default behavior would be to modify the I can definitely see your use case, though, and support this being a configurable property. I'd love to see this get PR'd. |
I disagree @ebroberson. The line MAY is contextually for cases where the proxy is also reporting its own processing, it is then optional for that component to modify the If you want to read it that way, then the spec says:
The "current operation" in this case is never reported to the tracer.. so I don't think this mode of operation is relevant. I'd love to know the reason you think the gorouter should behave this way, it serves no purpose, and even worse it disconnects spans from any apps using W3C properly. This is definitely a bug. |
My memory is a bit hazy, but I think I was configuring gorouter access logs to log the headers and then shipped the logs off to a tracing system so I could see what gorouter was doing. With the idea to visualise within the gorouter span a route service and the upstream app
For use cases where the gorouter span is not being shipped then breaking the trace is bad
Another feature would be shipping the traces directly without needing a log watcher
Met vriendelijke groet,
Toby Lorne
Sent from Proton Mail for iOS
Op vr 27 jan. , Braunson ***@***.***> schreef:
… I disagree ***@***.***(https://github.com/ebroberson). The line MAY is specifically for cases where the proxy is also reporting its own processing, it is then optional for that component to modify the traceparent. Since the gorouter does not report spans, it should not be modifying traceparent
Please tell me the reason you think the gorouter should behave this way, it serves no purpose, and even worse it disconnects spans from any apps using W3C properly. This is definitely a bug.
—
Reply to this email directly, [view it on GitHub](#308 (comment)), or [unsubscribe](https://github.com/notifications/unsubscribe-auth/AALJ7RDWJAFDCFYJQHRKORTWUQZ4NANCNFSM6AAAAAAT7ULJQQ).
You are receiving this because you were mentioned.Message ID: ***@***.***>
|
References to upcoming work subject to change. In some upcoming work for zipkin headers, gorouter will be assigning its own span-id, so the keeping current W3C implementation was for consistency purposes and to keep from potentially breaking projects that may be relying on this functionality. W3C headers unfortunately don't have the pair of |
Again I challenge this @ebroberson.. What user is relying on the fact that gorouter is breaking the connection of spans? Can you actually point to any use case where this would be desired behaviour?
I'm not sure what your point is here... W3C does have the parent span ID in the I'm not intentionally being combative on this issue, but I cannot follow the logic that this would ever be desired behavior or adherent to the specification and this is not the same behaviour as the gorouters own B3 implementation. In the end though I suppose if it's an option that's fine, as long as it is added 😄 |
I'm not saying that it's desired behavior, just existing behavior that must be taken into account. My point there was that zipkin headers do have a Upon further reading of the spec, it looks like the |
…ns (cloudfoundry#308) Golang 1.17 will no longer accept them when url.Parse() is called. Gorouter does not look at/modify the request params, however we're adding warnings to the gorouter log + updating the X-Cf-RouterError header so app devs + operators will be able to trace back requests if they have an app performing strangely Signed-off-by: Geoff Franks <[email protected]> Co-authored-by: Chris Selzo <[email protected]>
Adds the span ID and trace ID for each request to the corresponding HTTPStartStop envelopes as tags, if they're present. Should help operators and developers to track their traces through gorouter (see cloudfoundry/routing-release#308). Also aids corresponding work within the OTel Collector to export HTTPStartStop envelopes as traces. Signed-off-by: Carson Long <[email protected]>
We're working on being able to export traces from gorouter using the otel-collector release. cloudfoundry/gorouter#407 is the first piece of this work. @braunsonm If you're still interested we'd love it if you were able to deploy this after we've got the otel-collector changes done and see if the traces can show up in your tracing system without gaps |
Would definitely be interested in trying it when it's released @mkocher Do you have a roadmap? In order to use it on our side we would want a way to provide basic auth and a tenancy header for Tempo. |
Adds the span ID and trace ID for each request to the corresponding HTTPStartStop envelopes as tags, if they're present. Should help operators and developers to track their traces through gorouter (see cloudfoundry/routing-release#308). Also aids corresponding work within the OTel Collector to export HTTPStartStop envelopes as traces. Signed-off-by: Carson Long <[email protected]>
Issue
As part of this PR: cloudfoundry/gorouter#261
W3C support was added which is great! However it seems like the functionality was decided to make new parentIDs when you receive a new request. This doesn't seem like the correct behaviour for the gorouter since it is only proxying the request and is not a member of the trace. As far as I know, gorouter does not have support for reporting to OTEL or Zipkin tracers and only acts as a helpful proxy with this header support.
https://github.com/cloudfoundry/gorouter/pull/261/files#diff-4c0749740a74100927f204e1e56dfa5334d5a43ae2a57f3d6dfafccfa7c02897R135-R156
The effect of this is traces using W3C are always disconnected when routed through the gorouter because the next hop receives and records a child span that is not connected to any known parent spans.
Affected Versions
All since this PR was merged: cloudfoundry/gorouter#261
Traffic Diagram
Steps to Reproduce
Expected result
ParentIDs should not be recreated, they should be transparently passed to the next hop.
Current result
ParentIDs are always recreated, disconnecting your spans from their children.
Possible Fix
See linked lines above. Next() should not be called or generate a new parentID. cc @tlwr not sure if you still help out with this project or if you could provide some rationale.
The text was updated successfully, but these errors were encountered: