Contributor

@artem-tkachuk commented Oct 30, 2025

Add OpenTelemetry tracing to Spegel

Summary

  • Add OpenTelemetry tracing to registry, metrics, and OCI HTTP flows
  • Centralize HTTP instrumentation in httpx using otelhttp
  • Respect existing OTEL providers/propagators when already configured
  • Expose OTEL config in the Helm chart and wire flags in the DaemonSet
  • Add unit tests covering OTEL setup, sampler behavior, and HTTP propagation

Why

Provide request‑level tracing for debugging and observability without breaking embedders that already configure OTEL.

Details

  • Instrumentation:
    • HTTP handlers (registry, metrics) via httpx.WrapHandler
    • HTTP client transport (OCI requests) via httpx.WrapTransport
    • P2P operations (lookup/bootstrap)
  • Helm: values.yaml exposes .Values.spegel.otel.{endpoint,insecure,serviceName,sampler} and flags are wired into the DaemonSet.
  • OTEL setup avoids overriding global tracer provider/propagator when already set.
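The handler and transport wrapping listed above can be sketched with the standard library alone. This is an illustration of the propagation pattern, not the code in this PR: the PR relies on otelhttp (whose entry points are otelhttp.NewHandler and otelhttp.NewTransport), while injectTransport, wrapHandler, handlerTransport, and roundTrip below are hypothetical names, the X-Seen-Traceparent response header exists only to make the result observable, and the traceparent value is the example ID from the W3C Trace Context specification.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// traceHeader is the W3C Trace Context request header.
const traceHeader = "traceparent"

// injectTransport is a hypothetical stand-in for an instrumented client
// transport: it sets a fixed traceparent on every outgoing request.
type injectTransport struct {
	base        http.RoundTripper
	traceparent string
}

func (t injectTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	// A RoundTripper must not mutate the caller's request, so clone it first.
	out := req.Clone(req.Context())
	out.Header.Set(traceHeader, t.traceparent)
	return t.base.RoundTrip(out)
}

// wrapHandler is a hypothetical stand-in for an instrumented server handler:
// it reads the incoming traceparent and echoes it back for demonstration.
func wrapHandler(name string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("X-Seen-Traceparent", r.Header.Get(traceHeader))
		w.Header().Set("X-Operation", name)
		next.ServeHTTP(w, r)
	})
}

// handlerTransport serves requests in memory so the sketch needs no network.
type handlerTransport struct{ h http.Handler }

func (t handlerTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	rec := httptest.NewRecorder()
	t.h.ServeHTTP(rec, req)
	return rec.Result(), nil
}

// roundTrip sends one request through both wrappers and returns the
// traceparent value observed on the server side.
func roundTrip(traceparent string) (string, error) {
	server := wrapHandler("registry", http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) { w.WriteHeader(http.StatusOK) },
	))
	client := &http.Client{Transport: injectTransport{
		base:        handlerTransport{h: server},
		traceparent: traceparent,
	}}
	resp, err := client.Get("http://registry.invalid/v2/")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	return resp.Header.Get("X-Seen-Traceparent"), nil
}

func main() {
	seen, err := roundTrip("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	if err != nil {
		panic(err)
	}
	fmt.Println("server saw traceparent:", seen)
}
```

Keeping both wrappers in one package means callers such as the registry server and the OCI client only choose which side to wrap, which is presumably the motivation for centralizing them in httpx.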

Test plan

  • go test ./...
  • golangci-lint run ./...

Impact

  • OTEL is always built in; if a collector is not reachable, exporting may log warnings but does not fail requests.

Screenshots

Screenshot 2026-01-22 at 19 25 57 Screenshot 2026-01-22 at 19 25 32

Follow-up

  • I am working on a separate PR to add the k8s itest tracing setup and document the repro steps.

codecov bot commented Oct 30, 2025

Codecov Report

❌ Patch coverage is 70.06803% with 44 lines in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
internal/otelx/otelx_otel.go | 75.00% | 14 Missing and 7 partials ⚠️
main.go | 10.52% | 16 Missing and 1 partial ⚠️
internal/otelx/trace.go | 71.42% | 2 Missing and 2 partials ⚠️
pkg/registry/registry.go | 85.71% | 2 Missing ⚠️

Flag | Coverage Δ
integration-containerd | 18.36% <0.00%> (-0.20%) ⬇️
unit | 58.01% <70.06%> (+1.12%) ⬆️

Flags with carried forward coverage won't be shown.

Files with missing lines | Coverage Δ
pkg/httpx/otel.go | 100.00% <100.00%> (ø)
pkg/oci/client.go | 52.09% <100.00%> (ø)
pkg/routing/p2p.go | 60.42% <100.00%> (+0.37%) ⬆️
pkg/registry/registry.go | 82.91% <85.71%> (+0.28%) ⬆️
internal/otelx/trace.go | 71.42% <71.42%> (ø)
main.go | 10.21% <10.52%> (+10.21%) ⬆️
internal/otelx/otelx_otel.go | 75.00% <75.00%> (ø)

... and 2 files with indirect coverage changes

@artem-tkachuk force-pushed the atkachuk/add-otel-tracing branch from 54b0fc2 to 8ff640a on October 31, 2025 16:33
Member

@phillebaba left a comment

@artem-tkachuk as we discussed during last week's community meeting, we should remove the optional build tags and also the docker compose tests. Let's keep the tracing addition slim, and then we can add more features as needed. Removing the things that are not needed will reduce the noise and make this easier to review.

A good first milestone would be setting up otelhttp so that trace headers are both forwarded by the clients and registered for incoming requests. Does that sound like a good idea?

main.go Outdated
  regSrv := &http.Server{
  	Addr: args.RegistryAddr,
- 	Handler: reg.Handler(log),
+ 	Handler: otelx.WrapHandler("registry", reg.Handler(log)),
Member

We should aim to use go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp and integrate it into the httpx package.

Contributor

Also keep in mind that embedding projects may have their own otel config that this will need to play nice with.

Contributor Author

HTTP instrumentation now uses otelhttp and lives in httpx (handlers + transport), and OTEL setup no longer overrides tracer providers/propagators that are already configured (15df91f).
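For embedders wondering what "does not override" means in practice, the intent can be sketched as a set-only-if-unset guard. This is a simplified illustration under stated assumptions, not the logic in the commit: TracerProvider, namedProvider, globals, and setup are hypothetical names, and a nil field stands in for "nothing configured". The real OpenTelemetry Go API returns a non-nil placeholder provider by default, so the actual detection necessarily looks different.

```go
package main

import "fmt"

// TracerProvider is a hypothetical minimal interface used only in this sketch.
type TracerProvider interface{ Name() string }

// namedProvider is a hypothetical provider identified by a plain string.
type namedProvider string

func (n namedProvider) Name() string { return string(n) }

// globals is a hypothetical stand-in for process-wide tracing state.
// A nil provider means that nothing has been configured yet.
type globals struct {
	provider TracerProvider
}

// setup installs ours only when no provider is present and reports
// whether it was installed.
func (g *globals) setup(ours TracerProvider) bool {
	if g.provider != nil {
		// An embedding project configured tracing first: leave it alone.
		return false
	}
	g.provider = ours
	return true
}

func main() {
	standalone := &globals{}
	fmt.Println(standalone.setup(namedProvider("spegel")), standalone.provider.Name())

	embedded := &globals{provider: namedProvider("embedder")}
	fmt.Println(embedded.setup(namedProvider("spegel")), embedded.provider.Name())
}
```

Returning a boolean lets the caller decide whether it also owns shutdown of the provider, which only makes sense when it installed one itself.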

@artem-tkachuk
Contributor Author

@phillebaba thank you very much for reviewing! Will work on it today and tomorrow now that I have caught up after coming back from KubeCon :)

@artem-tkachuk force-pushed the atkachuk/add-otel-tracing branch 4 times, most recently from bb3575f to dbe1524 on January 21, 2026 04:54
@artem-tkachuk
Contributor Author

Finished addressing feedback. Remaining items: adding more tests for codecov, providing an OTEL endpoint for the k8s itest, cleaning up commits, and updating the description.

@artem-tkachuk
Contributor Author

@phillebaba, could you add a label to the PR since they're now required?

@artem-tkachuk force-pushed the atkachuk/add-otel-tracing branch 2 times, most recently from 03fa395 to 9119446 on January 23, 2026 03:49
@artem-tkachuk
Contributor Author

Added more testing to improve codecov coverage, cleaned up commits, and updated the description. Ready for re-review!

  • Add OpenTelemetry module requirements and refresh integration test module files while keeping golangci-lint import aliasing consistent.
  • Initialize OTEL tracing with env-backed defaults and provide helper functions for span creation and log enrichment.
  • Centralize otelhttp wrapping for handlers and transports, then route registry and OCI requests through the new helpers.
  • Plumb OTEL config from flags into setup and add spans around P2P bootstrap and lookup operations.
  • Cover override vs. reuse behavior, propagator handling, and sampling defaults for the OTEL setup flow.
  • Add OTEL chart values and render them into the daemonset args with a guard for empty endpoints, then refresh helm docs.
  • Align documentation and changelog entries with the OTEL integration behavior and references.

@artem-tkachuk force-pushed the atkachuk/add-otel-tracing branch from f06b549 to 4b22ae3 on January 23, 2026 21:23

3 participants