Skip to content

[LFXv2-608] Add OpenTelemetry tracing support#22

Merged
bramwelt merged 14 commits intomainfrom
bramwelt/otel-implementation
Feb 9, 2026
Merged

[LFXv2-608] Add OpenTelemetry tracing support#22
bramwelt merged 14 commits intomainfrom
bramwelt/otel-implementation

Conversation

@bramwelt
Copy link
Contributor

This pull request adds comprehensive OpenTelemetry (OTel) tracing, metrics, and logging support to the LFX V2 Auth Service. It introduces new configuration options for OTel in the Helm chart, updates the deployment to inject OTel-related environment variables, instruments the HTTP server for tracing, and enhances logging to include trace context. Additionally, it adds and updates dependencies required for OTel support.

OpenTelemetry Integration:

  • Added OTel SDK initialization in main.go, reading configuration from environment variables and ensuring proper shutdown on exit. (cmd/server/main.go cmd/server/main.goR51-R65)
  • Instrumented the HTTP server with OTel middleware to automatically generate traces for incoming requests. (cmd/server/http.go [1] [2]

Helm Chart and Configuration Enhancements:

Logging Improvements:

  • Wrapped the default logger with slog-otel to automatically include trace_id and span_id in logs, and added a utility for extracting trace context from context.Context. (pkg/log/log.go [1] [2]; pkg/logging/context.go [3]

Dependency Updates:

  • Added all required OpenTelemetry Go modules and the slog-otel package to go.mod. (go.mod [1] [2] [3] [4]

These changes lay the foundation for distributed tracing, metrics, and context-rich logging, making it easier to observe, debug, and monitor the authentication service in production.

Integrate slog-otel to automatically include trace_id and span_id
from OpenTelemetry context in log output, enabling correlation
between logs and distributed traces.

Changes:
- pkg/log/log.go: Wrap JSON handler with slogotel.OtelHandler
- pkg/utils/otel_test.go: Fix TestNewPropagator to pass OTelConfig
- charts/: Bump version to 0.3.5
- go.mod: Add github.com/remychantenay/slog-otel v1.3.4

Issue: LFXV2-608

Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
@bramwelt bramwelt requested a review from a team as a code owner January 14, 2026 15:16
Copilot AI review requested due to automatic review settings January 14, 2026 15:16
@coderabbitai
Copy link

coderabbitai bot commented Jan 14, 2026

Walkthrough

Adds OpenTelemetry: new OTEL setup and tests, initializes OTEL at startup with graceful shutdown, instruments server and client HTTP paths and logging for trace context, exposes OTEL Helm values, embeds build metadata via ldflags, and updates dependencies and CI build steps.

Changes

Cohort / File(s) Summary
OTEL core & tests
pkg/utils/otel.go, pkg/utils/otel_test.go
New comprehensive OTEL implementation: OTelConfig, env parsing, SetupOTelSDK/SetupOTelSDKWithConfig, providers for traces/metrics/logs (OTLP HTTP/gRPC), propagator handling, shutdown aggregation, and extensive unit tests.
Startup & HTTP instrumentation
cmd/server/main.go, cmd/server/http.go
Initialize OTEL at startup (adds Version/BuildTime/GitCommit vars), call SetupOTelSDKWithConfig with deferred shutdown, and wrap HTTP handlers with otelhttp for tracing.
Logging integration & tests
pkg/log/log.go, pkg/log/log_test.go, internal/infrastructure/authelia/user.go
Integrates slog-otel into logging pipeline to inject trace/span ids, updates logger initialization and tests, and replaces one slog.Warn call with slog.WarnContext.
HTTP client instrumentation
pkg/httpclient/client.go
Wraps outbound HTTP transport with otelhttp.NewTransport to instrument client requests.
Helm chart & values
charts/lfx-v2-auth-service/Chart.yaml, charts/lfx-v2-auth-service/templates/deployment.yaml, charts/lfx-v2-auth-service/values.yaml
Bumps chart version; adds app.extraEnv and app.otel.* values; conditionally injects OTEL env vars into Deployment template.
Build metadata & ko config
.ko.yaml, .github/workflows/ko-build-branch.yaml, .github/workflows/ko-build-main.yaml, .github/workflows/ko-build-tag.yaml
Adds .ko.yaml for ldflags to embed VERSION/BUILD_TIME/GIT_COMMIT; updates GitHub Actions build steps to compute/export build metadata and pass tags to ko.
Dependencies
go.mod
Adds/updates OpenTelemetry modules (otel core, OTLP exporters, otelhttp, slog-otel, Jaeger propagator) and several indirect dependency upgrades.

Sequence Diagram(s)

sequenceDiagram
    participant App as Auth Service (main)
    participant OTEL as OpenTelemetry SDK
    participant Exporters as OTLP Exporters
    participant HTTP as HTTP Handler / Client
    participant Logging as Logging Pipeline

    Note over App: Startup
    App->>OTEL: OTelConfigFromEnv()
    App->>OTEL: SetupOTelSDKWithConfig(ctx, cfg)
    activate OTEL
    OTEL->>Exporters: Init traces/metrics/logs providers
    OTEL-->>App: shutdown func
    deactivate OTEL

    Note over App: Incoming request
    HTTP->>OTEL: Extract trace context (server)
    HTTP->>Logging: Emit logs (slog-otel adds trace_id/span_id)
    Logging->>Exporters: Export logs/traces
    HTTP-->>App: Response

    Note over App: Outbound request
    App->>HTTP: Outbound HTTP client request
    HTTP->>OTEL: Inject trace context (otelhttp transport)
    HTTP->>Exporters: Export spans

    Note over App: Shutdown
    App->>OTEL: shutdown(ctx)
    OTEL->>Exporters: Flush & close
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 77.78% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: adding OpenTelemetry tracing support to the auth service.
Description check ✅ Passed The description provides comprehensive details about OpenTelemetry integration, Helm chart enhancements, logging improvements, and dependency updates, directly relating to the changeset.
Linked Issues check ✅ Passed The PR successfully implements all primary coding objectives from LFXV2-608: adds OTel dependencies (go.mod), initializes OTel SDK with environment-driven configuration (main.go), instruments HTTP server with OTel middleware (http.go), enriches logging with trace context (log.go), and provides trace context propagation utilities.
Out of Scope Changes check ✅ Passed All changes align with the stated objective of adding OpenTelemetry tracing support. Helm chart updates, logging instrumentation, build metadata handling, and test coverage are all in scope for this OTel integration.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch bramwelt/otel-implementation

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@go.mod`:
- Around line 14-29: The OpenTelemetry versions in go.mod are inconsistent:
align go.opentelemetry.io/otel, go.opentelemetry.io/otel/sdk,
go.opentelemetry.io/otel/sdk/metric and go.opentelemetry.io/otel/trace to the
same released SDK version (e.g., change all v1.39.0 entries to the supported
v1.38.0) and ensure exporter modules
go.opentelemetry.io/otel/exporters/otlp/otlptrace/* and
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/* match that SDK release;
keep otlplog (go.opentelemetry.io/otel/exporters/otlp/otlplog/*) and
go.opentelemetry.io/otel/log or go.opentelemetry.io/otel/sdk/log at their
compatible log-module version (e.g., v0.15.0) if that is required, or
bump/downgrade them to a set of consistent, tested versions—update the go.mod
entries for the modules listed (e.g., go.opentelemetry.io/otel,
go.opentelemetry.io/otel/sdk,
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc,
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc,
go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc,
go.opentelemetry.io/otel/log, go.opentelemetry.io/otel/sdk/log) so they are
consistent, then run go mod tidy to verify.
🧹 Nitpick comments (6)
pkg/utils/otel.go (1)

282-319: Consider explicit protocol validation.

The else branch (line 296) treats any non-HTTP protocol value as gRPC. If an invalid protocol value is configured (e.g., OTEL_EXPORTER_OTLP_PROTOCOL=invalid), the code will silently use gRPC instead of warning or failing.

💡 Optional: Add explicit protocol validation
 func newTraceProvider(ctx context.Context, cfg OTelConfig, res *resource.Resource) (*trace.TracerProvider, error) {
 	var exporter trace.SpanExporter
 	var err error

-	if cfg.Protocol == OTelProtocolHTTP {
+	switch cfg.Protocol {
+	case OTelProtocolHTTP:
 		opts := []otlptracehttp.Option{}
 		if cfg.Endpoint != "" {
 			opts = append(opts, otlptracehttp.WithEndpoint(cfg.Endpoint))
 		}
 		if cfg.Insecure {
 			opts = append(opts, otlptracehttp.WithInsecure())
 		}
 		exporter, err = otlptracehttp.New(ctx, opts...)
-	} else {
+	case OTelProtocolGRPC:
 		opts := []otlptracegrpc.Option{}
 		if cfg.Endpoint != "" {
 			opts = append(opts, otlptracegrpc.WithEndpoint(cfg.Endpoint))
 		}
 		if cfg.Insecure {
 			opts = append(opts, otlptracegrpc.WithInsecure())
 		}
 		exporter, err = otlptracegrpc.New(ctx, opts...)
+	default:
+		return nil, fmt.Errorf("unsupported OTLP protocol: %s", cfg.Protocol)
 	}
pkg/utils/otel_test.go (2)

120-132: Test for empty string may not work as intended.

In the table-driven test, when tt.envValue is empty string, the t.Setenv is skipped (line 122-124). However, t.Setenv from a previous subtest in the same parent test may still have set OTEL_TRACES_SAMPLE_RATIO. Since t.Setenv cleanup happens at the end of each subtest, and subtests run sequentially, this should be fine. But the logic is confusing - the empty string case actually tests "no env var set" rather than "env var set to empty string".

Consider clarifying the test name or explicitly testing both scenarios:

Suggested clarification
-		{"empty string", "", 1.0},              // defaults to 1.0
+		{"unset env var", "", 1.0},             // defaults to 1.0 when not set

If you also want to test the behavior when the env var is explicitly set to an empty string, add a separate test case that always calls t.Setenv.


152-164: Same empty string test concern as above.

The empty string case (line 146) won't call t.Setenv, so it tests the "unset" scenario rather than "set to empty string". Consider renaming for clarity:

-		{"empty", "", false},
+		{"unset", "", false},
charts/lfx-v2-auth-service/templates/deployment.yaml (1)

57-60: Conditional check may have unintended behavior for insecure field.

The check {{- if .Values.app.otel.insecure }} evaluates the string value. Since insecure defaults to "false" in values.yaml (a non-empty string), this condition is always true, and OTEL_EXPORTER_OTLP_INSECURE will always be set.

This is likely the intended behavior (explicitly setting false is clearer than omitting the variable), but consider using a more explicit pattern for boolean-like fields:

Alternative: Use `ne ""` for explicit empty check
-          {{- if .Values.app.otel.insecure }}
+          {{- if ne .Values.app.otel.insecure "" }}
           - name: OTEL_EXPORTER_OTLP_INSECURE
             value: {{ .Values.app.otel.insecure | quote }}
           {{- end }}

Or document that the field should be set to empty string "" to omit the env var entirely.

pkg/logging/context.go (2)

13-37: Potentially redundant with slog-otel handler in pkg/log/log.go.

The slogotel.OtelHandler in pkg/log/log.go already automatically injects trace_id and span_id from the context into all log records. These helper functions provide the same functionality but require explicit usage.

Consider documenting when to use these helpers vs relying on the automatic injection:

  • Automatic (via OtelHandler): Default behavior when using slog.Default() with a context-aware logging call
  • Manual (via these helpers): When creating child loggers or using loggers that bypass the default handler

If these helpers are intended for specific use cases not covered by the automatic injection, consider adding a comment explaining when to use them:

// LogWithContext returns a logger with trace context attributes added.
// Use this when you need to create a child logger that persists trace context
// across multiple log calls, or when using a logger that doesn't have
// the OtelHandler wrapper.

28-37: Consider nil logger handling.

LogWithContext will panic if logger is nil. While callers should ensure a valid logger, a defensive nil check could prevent runtime errors:

Defensive nil check
 func LogWithContext(ctx context.Context, logger *slog.Logger) *slog.Logger {
+	if logger == nil {
+		logger = slog.Default()
+	}
 	spanCtx := trace.SpanContextFromContext(ctx)
 	if !spanCtx.IsValid() {
 		return logger
 	}
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Jira integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 7926978 and 25e8164.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (10)
  • charts/lfx-v2-auth-service/Chart.yaml
  • charts/lfx-v2-auth-service/templates/deployment.yaml
  • charts/lfx-v2-auth-service/values.yaml
  • cmd/server/http.go
  • cmd/server/main.go
  • go.mod
  • pkg/log/log.go
  • pkg/logging/context.go
  • pkg/utils/otel.go
  • pkg/utils/otel_test.go
🧰 Additional context used
🧬 Code graph analysis (1)
cmd/server/main.go (1)
pkg/utils/otel.go (2)
  • OTelConfigFromEnv (84-161)
  • SetupOTelSDKWithConfig (171-239)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: CodeQL analysis (go)
  • GitHub Check: Agent
  • GitHub Check: MegaLinter
  • GitHub Check: Build and Test
  • GitHub Check: Publish Container
🔇 Additional comments (14)
charts/lfx-v2-auth-service/Chart.yaml (1)

8-8: LGTM!

Version bump to 0.3.5 appropriately reflects the OpenTelemetry configuration additions to the chart.

pkg/utils/otel.go (5)

32-80: LGTM!

Well-documented constants and configuration struct with clear environment variable mappings for each field.


82-161: LGTM!

The environment variable parsing is well-implemented with sensible defaults. The sample ratio validation (lines 116-129) properly handles invalid values with warnings.


171-239: LGTM!

Solid implementation of the SDK setup with proper:

  • Error aggregation via errors.Join for shutdown
  • Cleanup on initialization failure via handleErr
  • Conditional provider initialization based on exporter configuration
  • Global provider registration

253-280: LGTM!

Good defensive implementation with fallback to default propagators if none are valid, and appropriate warning logs for unknown propagator values.


321-393: LGTM!

Metrics and logger provider implementations follow consistent patterns. The 30-second periodic reader interval for metrics is a reasonable default for production use.

cmd/server/http.go (1)

64-65: LGTM!

The otelhttp.NewHandler wrapper is correctly placed after the debug middleware, ensuring all requests are traced. The operation name "auth-service" serves as a reasonable top-level span name, with individual route details captured via standard HTTP semantic conventions (method, route, status code).

cmd/server/main.go (1)

51-64: LGTM!

OpenTelemetry initialization is correctly positioned early in the startup sequence, ensuring traces are available for subsequent operations. Key observations:

  • Using context.Background() for shutdown (line 61) is correct since the original context may be cancelled at that point.
  • The deferred shutdown will execute after the graceful shutdown sequence completes, ensuring all spans are flushed.
pkg/utils/otel_test.go (2)

353-372: Good documentation of zero-value behavior.

This test effectively documents an important API contract: zero-value OTelConfig does not disable exporters. Users must explicitly set exporters to OTelExporterNone. This is valuable for preventing silent misconfiguration.


1-399: Comprehensive test coverage for OTel configuration.

The test suite provides thorough coverage of:

  • Default configuration values
  • Environment variable parsing (including edge cases)
  • SDK initialization and shutdown (including idempotency)
  • Resource and propagator creation
  • API constant stability
charts/lfx-v2-auth-service/templates/deployment.yaml (1)

38-80: OTEL environment variable injection looks well-structured.

The conditional pattern ensures env vars are only set when values are provided, keeping the container environment clean when OTEL is disabled. The extraEnv block (lines 38-40) allows additional flexibility for custom configurations.

charts/lfx-v2-auth-service/values.yaml (2)

144-178: Well-documented OTEL configuration with secure defaults.

The configuration block provides:

  • Clear inline documentation for each field
  • Sensible defaults (exporters disabled, gRPC protocol, full sampling)
  • Proper typing with quoted strings for Helm templating

The default propagators: "tracecontext,baggage,jaeger" includes Jaeger for compatibility with legacy tracing systems, which is a good choice for gradual migration.


141-143: Flexible extraEnv injection point.

The extraEnv array allows operators to inject additional environment variables without modifying the chart, which is useful for environment-specific configurations or secrets from external sources.

pkg/log/log.go (1)

106-112: Handler wrapping order is correct.

The logging pipeline correctly chains: JSONHandlerOtelHandlercontextHandler. This ensures:

  1. Base JSON formatting
  2. OTel trace_id/span_id injection from span context
  3. Application-specific context attributes via AppendCtx

slogotel.OtelHandler is correctly used as a value type; all required methods (Handle, WithAttrs, WithGroup) use value receivers in the slog-otel library.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds comprehensive OpenTelemetry (OTel) instrumentation to the LFX V2 Auth Service, enabling distributed tracing, metrics collection, and context-rich logging. The implementation includes flexible configuration through environment variables and Helm chart values, with all telemetry exporters disabled by default for safe deployment.

Changes:

  • Added OpenTelemetry SDK initialization with support for OTLP exporters (traces, metrics, logs) via both gRPC and HTTP protocols
  • Instrumented HTTP server with automatic trace generation and added trace context (trace_id, span_id) to structured logs
  • Introduced Helm chart configuration for OTel settings with environment variable injection in the deployment template

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
pkg/utils/otel.go Core OTel SDK setup with configuration parsing, exporter initialization, and resource/propagator management
pkg/utils/otel_test.go Comprehensive test coverage for OTel configuration and SDK initialization
pkg/logging/context.go Utility functions for extracting and adding trace context to logs
pkg/log/log.go Wrapped logger with slog-otel handler to automatically include trace context
cmd/server/main.go Added OTel SDK initialization and graceful shutdown in service startup
cmd/server/http.go Instrumented HTTP handler with otelhttp middleware for automatic tracing
charts/lfx-v2-auth-service/values.yaml Added OTel configuration section with defaults for service name, exporters, and sampling
charts/lfx-v2-auth-service/templates/deployment.yaml Template updates to inject OTel environment variables from Helm values
charts/lfx-v2-auth-service/Chart.yaml Bumped chart version to 0.3.5
go.mod Added OpenTelemetry dependencies and slog-otel package
go.sum Updated checksums for new dependencies

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +149 to +152
# serviceVersion is the service version for OpenTelemetry resource
# identification
# (default: "1.0.0")
serviceVersion: ""
Copy link

Copilot AI Jan 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment for serviceVersion says "(default: "1.0.0")" but the actual default in the Go code (pkg/utils/otel.go line 90) is an empty string, not "1.0.0". The comment is misleading and should either be removed or corrected to match the actual implementation.

Copilot uses AI. Check for mistakes.
Comment on lines +177 to +184
shutdown = func(ctx context.Context) error {
var err error
for _, fn := range shutdownFuncs {
err = errors.Join(err, fn(ctx))
}
shutdownFuncs = nil
return err
}
Copy link

Copilot AI Jan 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The shutdown function captures the shutdownFuncs slice and sets it to nil after calling all shutdown functions (line 182). However, if shutdown is called concurrently from multiple goroutines, this could lead to a race condition where one goroutine is iterating over shutdownFuncs while another sets it to nil. While the defer in main.go (line 60-64) suggests single-threaded shutdown, the public API doesn't document or enforce this constraint. Consider adding synchronization (e.g., sync.Once) to ensure thread-safe shutdown behavior.

Copilot uses AI. Check for mistakes.
Include jaeger propagator in default propagators alongside tracecontext and
baggage for better trace context propagation compatibility. Update helm
deployment template to only set OTEL env vars when endpoint is configured.
Also updates OpenTelemetry and related dependencies to latest versions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Issue: LFXV2-608
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
- Add Version, BuildTime, GitCommit build-time variables
- Update OTEL shutdown to use context with timeout
- Add ldflags to ko-build workflows for version injection
- ServiceVersion falls back to build-time Version if env var empty

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Issue: LFXV2-608
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
@bramwelt bramwelt force-pushed the bramwelt/otel-implementation branch from 6eed861 to de5d5b0 Compare February 3, 2026 17:39
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @.github/workflows/ko-build-tag.yaml:
- Around line 56-63: The GOFLAGS definition repeats -ldflags which causes only
the last -ldflags to take effect; change the GOFLAGS export so all -X
assignments (main.Version, main.BuildTime, main.GitCommit) are combined into a
single -ldflags string (i.e. one -ldflags with all three -X=... entries) before
running ko build for github.com/linuxfoundation/lfx-v2-auth-service/cmd/server
so Version, BuildTime and GitCommit are all embedded.

bramwelt and others added 2 commits February 3, 2026 10:06
GOFLAGS parses values as space-separated flags. Use separate
-ldflags entries with = syntax for each -X flag.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
Test coverage for AppendCtx, Priority, PriorityCritical,
contextHandler.Handle, and InitStructureLogConfig functions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Issue: LFXV2-608
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
@bramwelt bramwelt force-pushed the bramwelt/otel-implementation branch from de5d5b0 to b4ae39d Compare February 3, 2026 18:10
bramwelt and others added 4 commits February 3, 2026 10:16
Declare and assign separately to avoid masking return values.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Issue: LFXV2-608
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Issue: LFXV2-608
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
- Delete pkg/logging/context.go (slog-otel handles this)
- Use WarnContext in authelia/user.go for trace correlation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Issue: LFXV2-608
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
Wrap HTTP client transport with otelhttp.NewTransport to
enable distributed tracing for outgoing API calls.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Issue: LFXV2-608
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
andrest50
andrest50 previously approved these changes Feb 4, 2026
bramwelt and others added 2 commits February 5, 2026 13:07
Run go mod tidy to move otel/trace to indirect.
Replace empty slice literals with var declarations
and trim propagator string before switch statement.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Issue: LFXV2-608
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Issue: LFXV2-608
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@pkg/log/log_test.go`:
- Around line 287-342: TestInitStructureLogConfig mutates global slog state via
InitStructureLogConfig and leaks into other tests; fix by capturing the current
default logger at the start of each subtest (using slog.Default()) and restoring
it after the subtest (using slog.SetDefault(saved))—i.e., inside each t.Run
before calling InitStructureLogConfig save prev := slog.Default() and defer
slog.SetDefault(prev) so the global logger state is restored after the subtest.

bramwelt and others added 3 commits February 7, 2026 11:20
The otel SDK v1.40.0 uses schema v1.39.0 internally.
Using semconv v1.37.0 caused a conflicting Schema URL
error in tests.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Issue: LFXV2-608
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
The version is set at build-time via ldflags, not
hardcoded to "1.0.0".

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Issue: LFXV2-608
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
Save and restore slog.Default() in each subtest of
TestInitStructureLogConfig to prevent InitStructureLogConfig
from leaking global logger state into other tests.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Issue: LFXV2-608
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@pkg/utils/otel.go`:
- Around line 82-161: OTelConfigFromEnv currently accepts any
OTEL_EXPORTER_OTLP_PROTOCOL value and silently defaults to OTelProtocolGRPC;
update the handling of the local variable protocol to normalize (strings.ToLower
and trim spaces) and validate against supported values ("grpc", "http/protobuf",
"http/json") (also treat "http" as alias for "http/protobuf"), set protocol to
the matching canonical constant (e.g. OTelProtocolGRPC or OTelProtocolHTTP with
subtype) and emit a slog.Warn when an unsupported value is provided before
falling back to OTelProtocolGRPC; make the changes where protocol is read and
defaulted in OTelConfigFromEnv so the returned OTelConfig.Protocol is always one
of the supported canonical values.

Comment on lines +82 to +161
// OTelConfigFromEnv creates an OTelConfig from environment variables.
// See OTelConfig struct fields for supported environment variables.
func OTelConfigFromEnv() OTelConfig {
serviceName := os.Getenv("OTEL_SERVICE_NAME")
if serviceName == "" {
serviceName = "lfx-v2-auth-service"
}

serviceVersion := os.Getenv("OTEL_SERVICE_VERSION")

protocol := os.Getenv("OTEL_EXPORTER_OTLP_PROTOCOL")
if protocol == "" {
protocol = OTelProtocolGRPC
}

endpoint := os.Getenv("OTEL_EXPORTER_OTLP_ENDPOINT")

insecure := os.Getenv("OTEL_EXPORTER_OTLP_INSECURE") == "true"

tracesExporter := os.Getenv("OTEL_TRACES_EXPORTER")
if tracesExporter == "" {
tracesExporter = OTelExporterNone
}

metricsExporter := os.Getenv("OTEL_METRICS_EXPORTER")
if metricsExporter == "" {
metricsExporter = OTelExporterNone
}

logsExporter := os.Getenv("OTEL_LOGS_EXPORTER")
if logsExporter == "" {
logsExporter = OTelExporterNone
}

tracesSampleRatio := 1.0
if ratio := os.Getenv("OTEL_TRACES_SAMPLE_RATIO"); ratio != "" {
if parsed, err := strconv.ParseFloat(ratio, 64); err == nil {
if parsed >= 0.0 && parsed <= 1.0 {
tracesSampleRatio = parsed
} else {
slog.Warn("OTEL_TRACES_SAMPLE_RATIO must be between 0.0 and 1.0, using default 1.0",
"provided-value", ratio)
}
} else {
slog.Warn("invalid OTEL_TRACES_SAMPLE_RATIO value, using default 1.0",
"provided-value", ratio, "error", err)
}
}

propagators := os.Getenv("OTEL_PROPAGATORS")
if propagators == "" {
propagators = "tracecontext,baggage,jaeger"
}

slog.With(
"service-name", serviceName,
"version", serviceVersion,
"protocol", protocol,
"endpoint", endpoint,
"insecure", insecure,
"traces-exporter", tracesExporter,
"traces-sample-ratio", tracesSampleRatio,
"metrics-exporter", metricsExporter,
"logs-exporter", logsExporter,
"propagators", propagators,
).Debug("OTelConfig")

return OTelConfig{
ServiceName: serviceName,
ServiceVersion: serviceVersion,
Protocol: protocol,
Endpoint: endpoint,
Insecure: insecure,
TracesExporter: tracesExporter,
TracesSampleRatio: tracesSampleRatio,
MetricsExporter: metricsExporter,
LogsExporter: logsExporter,
Propagators: propagators,
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the full otel.go file to understand the constants and usage
fd -t f "otel.go" --exec wc -l {} \;

Repository: linuxfoundation/lfx-v2-auth-service

Length of output: 102


🏁 Script executed:

# Read the otel.go file
cat -n pkg/utils/otel.go | head -200

Repository: linuxfoundation/lfx-v2-auth-service

Length of output: 8092


🌐 Web query:

OpenTelemetry OTEL_EXPORTER_OTLP_PROTOCOL specification standard values

💡 Result:

OTEL_EXPORTER_OTLP_PROTOCOL (and the signal-specific variants OTEL_EXPORTER_OTLP_{TRACES|METRICS|LOGS}_PROTOCOL, plus ..._PROFILES_PROTOCOL where supported) is an enum with these standard values:

  • grpc — OTLP over gRPC (protobuf over HTTP/2)
  • http/protobuf — OTLP over HTTP with protobuf payloads
  • http/json — OTLP over HTTP with JSON payloads (optional for SDKs) [1][2]

Default: the specification says the default transport SHOULD be http/protobuf unless an SDK has strong reasons to default to grpc (e.g., backward compatibility), so the effective default can be SDK-dependent. [1][2]


🏁 Script executed:

# Search for how the protocol is used in the setup function
cat -n pkg/utils/otel.go | sed -n '171,250p'

Repository: linuxfoundation/lfx-v2-auth-service

Length of output: 2908


🏁 Script executed:

# Continue reading to see full protocol usage
cat -n pkg/utils/otel.go | sed -n '200,300p'

Repository: linuxfoundation/lfx-v2-auth-service

Length of output: 3834


🏁 Script executed:

# Check the rest of protocol usage for metrics and logs providers
cat -n pkg/utils/otel.go | sed -n '300,395p'

Repository: linuxfoundation/lfx-v2-auth-service

Length of output: 3367


Normalize and validate OTEL_EXPORTER_OTLP_PROTOCOL to handle standard values like http/protobuf.

The current code accepts any protocol value without validation. When set to standard values like http/protobuf or http/json, the equality checks against OTelProtocolHTTP ("http") fail, silently falling back to gRPC. This causes telemetry to be sent via the wrong protocol, leading to connection failures or dropped data. Normalize protocol values, validate against supported types (grpc, http/protobuf, http/json), and warn on unsupported values.

🔧 Proposed fix (normalize and validate protocol)
-	protocol := os.Getenv("OTEL_EXPORTER_OTLP_PROTOCOL")
-	if protocol == "" {
-		protocol = OTelProtocolGRPC
-	}
+	protocol := strings.ToLower(strings.TrimSpace(os.Getenv("OTEL_EXPORTER_OTLP_PROTOCOL")))
+	switch protocol {
+	case "", "grpc":
+		protocol = OTelProtocolGRPC
+	case "http", "http/protobuf", "http/json":
+		protocol = OTelProtocolHTTP
+	default:
+		slog.Warn("unsupported OTEL_EXPORTER_OTLP_PROTOCOL, defaulting to grpc",
+			"protocol", protocol)
+		protocol = OTelProtocolGRPC
+	}
🤖 Prompt for AI Agents
In `@pkg/utils/otel.go` around lines 82 - 161, OTelConfigFromEnv currently accepts
any OTEL_EXPORTER_OTLP_PROTOCOL value and silently defaults to OTelProtocolGRPC;
update the handling of the local variable protocol to normalize (strings.ToLower
and trim spaces) and validate against supported values ("grpc", "http/protobuf",
"http/json") (also treat "http" as alias for "http/protobuf"), set protocol to
the matching canonical constant (e.g. OTelProtocolGRPC or OTelProtocolHTTP with
subtype) and emit a slog.Warn when an unsupported value is provided before
falling back to OTelProtocolGRPC; make the changes where protocol is read and
defaulted in OTelConfigFromEnv so the returned OTelConfig.Protocol is always one
of the supported canonical values.

@bramwelt bramwelt merged commit 25c359c into main Feb 9, 2026
6 checks passed
@bramwelt bramwelt deleted the bramwelt/otel-implementation branch February 9, 2026 20:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants