[DOC] Add documentation for agent / gen-ai trace enrichment in existing otel_traces processor #11976

Description

@kylehounslow

Add documentation for the GenAI agent trace enrichment behavior built into the otel_traces processor in RFC #6542 / PR #6548.

The otel_traces processor now automatically normalizes vendor-specific GenAI span attributes to OTel semantic conventions, propagates key gen_ai.* attributes from child spans to root spans, aggregates token usage counts, and strips flattened sub-keys that conflict with parent string values (e.g. llm.input_messages.0.message.content when llm.input_messages is already a string). This enrichment enables the agent trace visualizations in OpenSearch Dashboards (OpenSearch-Dashboards#11387).

What to document

Always-on enrichment (requires output_format: otel)

  • GenAI enrichment runs automatically on every batch. It is a no-op for traces without gen_ai.* attributes, so there is no impact on existing non-GenAI pipelines.
  • The source must output spans with the original OTel attribute key names (e.g. gen_ai.operation.name, not span.attributes.gen_ai@operation@name). This requires one of the following:
    • otel_trace_source with output_format: otel (the default, opensearch, transforms keys and breaks enrichment).
    • otlp source, which defaults to output_format: otel and works out of the box.
  • The OpenSearch sink must use index_type: trace-analytics-plain-raw to match the otel output format. The default trace-analytics-raw expects the opensearch key format.
  • Example pipeline configuration using otlp source (recommended):
    otlp-pipeline:
      source:
        otlp:
          ssl: false
      route:
        - traces: 'getEventType() == "TRACE"'
      sink:
        - pipeline:
            name: "traces-raw-pipeline"
            routes:
              - "traces"
    
    traces-raw-pipeline:
      source:
        pipeline:
          name: "otlp-pipeline"
      processor:
        - otel_traces:
            # Time in seconds to buffer child spans waiting for root span (default: 180)
            trace_flush_interval: 180
      sink:
        - opensearch:
            index_type: trace-analytics-plain-raw
            # ...

Vendor attribute normalization

The processor automatically maps vendor-specific attributes to the OTel GenAI Semantic Conventions (v1.39.0). Original attributes are preserved, and the normalized attributes are added alongside them. Normalization of an attribute is skipped if the target attribute already exists on the span.

For the current ground truth, see genai-attribute-mappings.yaml.

OpenInference mappings

| Source attribute | Target attribute | Notes |
|---|---|---|
| llm.token_count.prompt | gen_ai.usage.input_tokens | |
| llm.token_count.completion | gen_ai.usage.output_tokens | |
| llm.model_name | gen_ai.request.model | |
| llm.provider | gen_ai.provider.name | |
| llm.input_messages | gen_ai.input.messages | |
| llm.output_messages | gen_ai.output.messages | |
| embedding.model_name | gen_ai.request.model | |
| tool.name | gen_ai.tool.name | |
| tool.description | gen_ai.tool.description | |
| tool_call.function.arguments | gen_ai.tool.call.arguments | |
| tool_call.id | gen_ai.tool.call.id | |
| reranker.model_name | gen_ai.request.model | |
| agent.name | gen_ai.agent.name | |
| session.id | gen_ai.conversation.id | |
| openinference.span.kind | gen_ai.operation.name | Value mapped (see below) |

OpenLLMetry mappings

| Source attribute | Target attribute | Notes |
|---|---|---|
| llm.usage.prompt_tokens | gen_ai.usage.input_tokens | |
| llm.usage.completion_tokens | gen_ai.usage.output_tokens | |
| llm.request.model | gen_ai.request.model | |
| llm.response.model | gen_ai.response.model | |
| llm.request.max_tokens | gen_ai.request.max_tokens | |
| llm.request.temperature | gen_ai.request.temperature | |
| llm.request.top_p | gen_ai.request.top_p | |
| llm.top_k | gen_ai.request.top_k | |
| llm.frequency_penalty | gen_ai.request.frequency_penalty | |
| llm.presence_penalty | gen_ai.request.presence_penalty | |
| llm.chat.stop_sequences | gen_ai.request.stop_sequences | |
| llm.request.functions | gen_ai.tool.definitions | |
| llm.response.finish_reason | gen_ai.response.finish_reasons | Wrapped as JSON array |
| llm.response.stop_reason | gen_ai.response.finish_reasons | Wrapped as JSON array |
| llm.request.type | gen_ai.operation.name | Value mapped (see below) |
| traceloop.span.kind | gen_ai.operation.name | Value mapped (see below) |
| traceloop.entity.name | gen_ai.agent.name | |
| traceloop.entity.input | gen_ai.input.messages | |
| traceloop.entity.output | gen_ai.output.messages | |
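
The normalization rules described in this section (originals preserved, targets added alongside, skip if the target already exists) could be sketched roughly as follows. This is an illustration only, not the processor's actual implementation: the function name normalize_attributes, the ATTRIBUTE_MAPPINGS subset, and the choice to represent "wrapped as JSON array" as a JSON-encoded string are all assumptions made for the example.

```python
import json

# Illustrative subset of the mappings listed in the tables; the full,
# authoritative set lives in genai-attribute-mappings.yaml.
ATTRIBUTE_MAPPINGS = {
    "llm.token_count.prompt": "gen_ai.usage.input_tokens",
    "llm.model_name": "gen_ai.request.model",
    "llm.request.model": "gen_ai.request.model",
    "llm.response.finish_reason": "gen_ai.response.finish_reasons",
}

# Targets whose value is wrapped as a JSON array (assumed here to be a
# JSON-encoded string such as '["stop"]').
WRAP_AS_JSON_ARRAY = {"gen_ai.response.finish_reasons"}


def normalize_attributes(attributes):
    """Return a copy with normalized gen_ai.* keys added next to the originals."""
    result = dict(attributes)  # original attributes are preserved
    for source_key, target_key in ATTRIBUTE_MAPPINGS.items():
        if source_key not in attributes:
            continue
        if target_key in result:
            continue  # skip if the target attribute already exists
        value = attributes[source_key]
        if target_key in WRAP_AS_JSON_ARRAY:
            value = json.dumps([value])
        result[target_key] = value
    return result
```

For example, a span carrying only llm.model_name would gain gen_ai.request.model with the same value, while a span that already has gen_ai.request.model would keep its existing value.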

gen_ai.operation.name value mappings (case-insensitive)

| Source value | Mapped value |
|---|---|
| LLM | chat |
| EMBEDDING / embedding | embeddings |
| CHAIN / workflow / task / AGENT / agent | invoke_agent |
| RETRIEVER / RERANKER / rerank | retrieval |
| TOOL / tool | execute_tool |
| PROMPT / completion | text_completion |
| chat | chat |
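
The case-insensitive value mapping above could be expressed as a lookup on the lowercased source value. This is a minimal sketch: the name map_operation_name is hypothetical, and returning None for values not in the table is an assumption, since the behavior for unmapped values is not specified here.

```python
# Lowercased source values from the table above, mapped to
# gen_ai.operation.name values.
OPERATION_NAME_VALUES = {
    "llm": "chat",
    "chat": "chat",
    "embedding": "embeddings",
    "chain": "invoke_agent",
    "workflow": "invoke_agent",
    "task": "invoke_agent",
    "agent": "invoke_agent",
    "retriever": "retrieval",
    "reranker": "retrieval",
    "rerank": "retrieval",
    "tool": "execute_tool",
    "prompt": "text_completion",
    "completion": "text_completion",
}


def map_operation_name(source_value):
    """Map a vendor span-kind or request-type value, ignoring case.

    Returns None for values not in the table (an assumption of this sketch).
    """
    return OPERATION_NAME_VALUES.get(source_value.lower())
```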

Root span enrichment

  • Propagates gen_ai.system, gen_ai.provider.name, gen_ai.agent.name, gen_ai.request.model, and gen_ai.operation.name from child spans to the root span. When several children carry the same attribute, the first child's value wins.
  • Aggregates gen_ai.usage.input_tokens and gen_ai.usage.output_tokens by summing them across all child spans and writing the totals to the root span. Aggregation is skipped if the root already has token counts.
  • Skip-if-present semantics apply to all attributes: existing root values are never overwritten.
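
The root span enrichment rules above can be sketched on plain dictionaries. The sketch rests on several assumptions: enrich_root is a hypothetical name, spans are modeled as flat attribute dicts, the order of the children list stands in for the processor's processing order, and the token-count skip is applied per attribute.

```python
PROPAGATED_KEYS = (
    "gen_ai.system",
    "gen_ai.provider.name",
    "gen_ai.agent.name",
    "gen_ai.request.model",
    "gen_ai.operation.name",
)
TOKEN_KEYS = ("gen_ai.usage.input_tokens", "gen_ai.usage.output_tokens")


def enrich_root(root, children):
    """Return a copy of the root attributes enriched from its child spans."""
    enriched = dict(root)

    # First-child-wins propagation; existing root values are never overwritten.
    for key in PROPAGATED_KEYS:
        if key in enriched:
            continue
        for child in children:
            if key in child:
                enriched[key] = child[key]
                break

    # Sum token usage across children unless the root already has a count.
    for key in TOKEN_KEYS:
        if key in enriched:
            continue
        values = [child[key] for child in children if key in child]
        if values:
            enriched[key] = sum(values)

    return enriched
```

With two children reporting 10 and 7 input tokens, the root would receive gen_ai.usage.input_tokens = 17, and gen_ai.provider.name would come from whichever child is processed first.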

Flattened sub-key stripping

  • Strips flattened sub-keys that conflict with parent string values, preventing OpenSearch mapping failures. This applies only to the following four parent keys:
    • llm.input_messages, llm.output_messages, gen_ai.prompt, gen_ai.completion
  • For example, llm.input_messages.0.message.content is stripped when llm.input_messages exists as a string value. If only the sub-keys exist (no parent string), they are preserved.
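
The stripping rule above amounts to a prefix filter that only applies when the parent key holds a string. A minimal sketch, assuming attributes are a flat dict and using the hypothetical name strip_conflicting_sub_keys:

```python
# The four parent keys named above.
PARENT_KEYS = (
    "llm.input_messages",
    "llm.output_messages",
    "gen_ai.prompt",
    "gen_ai.completion",
)


def strip_conflicting_sub_keys(attributes):
    """Drop flattened sub-keys whose parent key exists as a string value."""
    # Only parents that are present AND hold a string trigger stripping.
    prefixes = tuple(
        parent + "."
        for parent in PARENT_KEYS
        if isinstance(attributes.get(parent), str)
    )
    # str.startswith accepts a tuple; an empty tuple never matches,
    # so sub-keys without a string parent are preserved.
    return {
        key: value
        for key, value in attributes.items()
        if not key.startswith(prefixes)
    }
```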

Known limitations

  • In-memory batch only: the processor buffers child spans until the root span arrives (up to trace_flush_interval, default 180s), then flushes them together, and enrichment runs on this combined batch. If the root span arrives before its children, however, it is flushed immediately without enrichment. Children that arrive later in separate batches cannot retroactively enrich an already-flushed root.
  • First-child-wins for string attributes: in a trace that uses multiple LLM providers, the root span gets only the first child's value.
