Distributed tracing: delivering correlation IDs to the user #1253

@ekarankow

Name and Version

ai-dial-core 0.39.0-rc

What is the problem this feature will solve?

Currently there is no consistent way to trace an individual request across distributed services, whether the request fails or succeeds. This makes it difficult to correlate logs, investigate issues, and follow a request through the various components. Because responses carry no trace identifier, developers and support teams lack the context needed to debug problems efficiently, especially when working with tools like Kibana or browser developer tools.

What is the feature you are proposing to solve the problem?

The proposed implementation provides distributed tracing support using OpenTelemetry and the W3C Trace Context standard. Every API response carries trace context information whenever an OpenTelemetry trace context is available, so a request can be followed across distributed systems.

Implementation Details:

1. Response Headers: All HTTP responses include the traceparent header (W3C Trace Context standard) when an OpenTelemetry trace context is available. Format: 00-{trace-id}-{span-id}-{trace-flags} (e.g., 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01).

2. Error Responses: Error responses include a traceparent field directly in the error JSON body:

    {
      "path": "/api/v1/applications",
      "method": "GET",
      "status": 404,
      "error": "Not Found",
      "message": "Application not found",
      "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    }

3. Client Integration: Clients can propagate trace context by sending the traceparent header in requests. The service automatically extracts and propagates the trace context. If no trace context is provided, the service generates a new trace.

4. OpenTelemetry-Only Approach: The implementation relies exclusively on OpenTelemetry for trace ID generation, with no custom correlation ID logic. Trace IDs are extracted from the OpenTelemetry span context.
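
To make the header format and the "continue or start a trace" behaviour concrete, here is a minimal Python sketch using only the standard library. It is not the ai-dial-core implementation, which delegates all of this to OpenTelemetry. The names `TraceParent`, `parse_traceparent`, and `continue_or_start` are hypothetical, the sketch handles only version 00 of the header, and marking a freshly started trace as sampled (`01`) is an assumption made for illustration.

```python
import re
import secrets
from typing import NamedTuple, Optional

# Version 00 of the W3C traceparent header:
# 00-{32 hex trace-id}-{16 hex span-id}-{2 hex trace-flags}, lowercase hex only.
_TRACEPARENT_RE = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class TraceParent(NamedTuple):
    """Hypothetical container for the fields of a traceparent value."""
    trace_id: str
    span_id: str
    trace_flags: str

    def header(self) -> str:
        """Serialize back to the header format shown in the issue."""
        return f"00-{self.trace_id}-{self.span_id}-{self.trace_flags}"

    @property
    def sampled(self) -> bool:
        """The lowest bit of trace-flags is the 'sampled' flag."""
        return int(self.trace_flags, 16) & 0x01 == 0x01


def parse_traceparent(value: Optional[str]) -> Optional[TraceParent]:
    """Return the parsed header, or None if it is missing or malformed."""
    if not value:
        return None
    match = _TRACEPARENT_RE.fullmatch(value.strip())
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    # All-zero trace and span IDs are invalid per the W3C specification.
    if trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return TraceParent(trace_id, span_id, flags)


def continue_or_start(incoming: Optional[str]) -> TraceParent:
    """Keep the caller's trace ID if one was sent, otherwise start a new trace.

    Either way the service gets its own span ID.
    """
    new_span_id = secrets.token_hex(8)  # 8 random bytes -> 16 hex characters
    parent = parse_traceparent(incoming)
    if parent is None:
        # Assumption: a newly started trace is marked as sampled.
        return TraceParent(secrets.token_hex(16), new_span_id, "01")
    return parent._replace(span_id=new_span_id)
```

With the example value from this issue, `parse_traceparent` yields trace ID `4bf92f3577b34da6a3ce929d0e0e4736` and span ID `00f067aa0ba902b7`, and `continue_or_start` returns a value with the same trace ID and a fresh span ID.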

Benefits:

- Standards-compliant: Uses W3C Trace Context standard for interoperability
- Browser-accessible: Trace IDs visible in browser developer tools (Network tab)
- Observability-ready: Enables correlation in Kibana, OpenTelemetry collectors, and other observability tools
- Automatic propagation: Works seamlessly with OpenTelemetry-compatible clients
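
As an illustration of the correlation workflow, a client or support script could pull the trace ID out of a response and use it as a search term in a log tool. The sketch below is a hypothetical helper (`trace_id_from_response` is not part of any library) that checks the traceparent response header first and falls back to the traceparent field of the error JSON body described above. The log field to search on depends on the logging setup and is not assumed here.

```python
import json
from typing import Mapping, Optional


def trace_id_from_response(headers: Mapping[str, str], body: str) -> Optional[str]:
    """Return the 32-character trace ID from a response, or None if absent."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    traceparent = next(
        (value for name, value in headers.items() if name.lower() == "traceparent"),
        None,
    )
    if traceparent is None:
        # Fall back to the "traceparent" field of the error JSON body.
        try:
            payload = json.loads(body)
        except ValueError:
            payload = None
        if isinstance(payload, dict):
            traceparent = payload.get("traceparent")
    if not isinstance(traceparent, str):
        return None
    # Format: 00-{trace-id}-{span-id}-{trace-flags}
    parts = traceparent.split("-")
    if len(parts) != 4 or len(parts[1]) != 32:
        return None
    return parts[1]


# Error body taken from the example in this issue.
error_body = """{
  "path": "/api/v1/applications",
  "method": "GET",
  "status": 404,
  "error": "Not Found",
  "message": "Application not found",
  "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
}"""

trace_id = trace_id_from_response({}, error_body)
print(trace_id)  # prints 4bf92f3577b34da6a3ce929d0e0e4736
```

Reading the header first means the same helper also works for successful responses, which have no error body to inspect.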

What alternatives have you considered?

Custom correlation ID generation: Rejected in favor of OpenTelemetry-native approach for better standards compliance

Labels: enhancement (New feature or request)