Description
Is your feature request related to a problem? Please describe.
When running the agent with --write-job-logs-to-stdout and --log-format=json, the agent emits structured ("rich") log lines that already include contextual fields like org, pipeline, build_id, job_id, etc. However, when OpenTelemetry tracing is enabled, there is no way to correlate these log lines back to an active trace or span. This makes it difficult to cross-reference logs with distributed traces in observability backends (e.g. Datadog, Honeycomb, Jaeger).
Describe the solution you'd like
Add trace_id and span_id fields to the jobLogger constructed in NewJobRunner (agent/job_runner.go, around lines 301–319), populated from the active OpenTelemetry span on the context at the time the logger is created. For example, something like:
    if tp := r.conf.Job.TraceParent; tp != "" {
        // W3C traceparent format: {version}-{trace_id}-{span_id}-{flags}
        if parts := strings.SplitN(tp, "-", 4); len(parts) == 4 {
            log = log.WithFields(
                logger.StringField("trace_id", parts[1]),
                logger.StringField("span_id", parts[2]),
            )
        }
    }

This would allow operators to filter structured logs by trace_id in their log aggregation tool and jump directly to the matching trace.
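The parsing step in the snippet above could also be pulled into a small standalone helper, so that malformed values never end up in the log fields. The following is only a sketch: parseTraceParent and isLowerHex are hypothetical names, not existing agent functions, and the checks (32 lowercase hex characters for the trace ID, 16 for the span ID, neither all zeros) follow the W3C Trace Context format rather than anything the agent does today.

```go
package main

import (
	"fmt"
	"strings"
)

// parseTraceParent is a hypothetical helper (not part of the agent).
// It extracts the trace ID and span ID from a W3C traceparent value of
// the form {version}-{trace_id}-{span_id}-{flags}, and reports ok=false
// if the value does not look well formed.
func parseTraceParent(tp string) (traceID, spanID string, ok bool) {
	parts := strings.SplitN(tp, "-", 4)
	if len(parts) != 4 {
		return "", "", false
	}
	traceID, spanID = parts[1], parts[2]

	// The trace ID is 32 lowercase hex characters, the span ID is 16.
	if !isLowerHex(traceID, 32) || !isLowerHex(spanID, 16) {
		return "", "", false
	}
	// All-zero IDs are defined as invalid by the W3C format.
	if strings.Trim(traceID, "0") == "" || strings.Trim(spanID, "0") == "" {
		return "", "", false
	}
	return traceID, spanID, true
}

// isLowerHex reports whether s is exactly n lowercase hexadecimal characters.
func isLowerHex(s string, n int) bool {
	if len(s) != n {
		return false
	}
	for _, c := range s {
		isDigit := c >= '0' && c <= '9'
		isHexLetter := c >= 'a' && c <= 'f'
		if !isDigit && !isHexLetter {
			return false
		}
	}
	return true
}

func main() {
	// Example value taken from the W3C Trace Context specification.
	tp := "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
	if traceID, spanID, ok := parseTraceParent(tp); ok {
		fmt.Printf("trace_id=%s span_id=%s\n", traceID, spanID)
	}
}
```

In NewJobRunner, the two returned values would then be passed to log.WithFields only when ok is true, leaving the logger unchanged otherwise.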
Describe alternatives you've considered
I haven't come up with any real alternatives. The BUILDKITE_TRACING_TRACEPARENT environment variable already exists, but it is only available inside the agent, not outside it. Parsing the logs and injecting the fields from outside the agent therefore isn't possible.
Additional context