|
| 1 | +# OpenTelemetry Compatibility |
| 2 | + |
| 3 | +agentevals consumes OpenTelemetry traces to evaluate AI agents. This document covers which OTel conventions we support, how we handle the ongoing migration from span events to log-based events, and guidance for instrumenting your own agents. |
| 4 | + |
| 5 | +## Supported Semantic Conventions |
| 6 | + |
| 7 | +### OTel GenAI Semantic Conventions (recommended) |
| 8 | + |
| 9 | +The [GenAI semantic conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/) define standard span attributes for LLM interactions. agentevals auto-detects this format when spans contain `gen_ai.request.model` or `gen_ai.input.messages`. |
| 10 | + |
| 11 | +Supported attributes: |
| 12 | + |
| 13 | +| Attribute | Description | |
| 14 | +|-----------|-------------| |
| 15 | +| `gen_ai.request.model` | Model name (e.g. `gpt-4o`, `claude-sonnet-4-6`) | |
| 16 | +| `gen_ai.input.messages` | JSON array of input messages | |
| 17 | +| `gen_ai.output.messages` | JSON array of output messages | |
| 18 | +| `gen_ai.response.finish_reasons` | Why the model stopped generating | |
| 19 | +| `gen_ai.usage.input_tokens` | Input token count | |
| 20 | +| `gen_ai.usage.output_tokens` | Output token count | |
| 21 | +| `gen_ai.system` | AI system identifier (e.g. `openai`, `anthropic`) | |
| 22 | + |
| 23 | +This format works with LangChain, Strands, OpenAI instrumentation, Anthropic instrumentation, and any framework that follows the GenAI semantic conventions. |
| 24 | + |
| 25 | +### Google ADK (framework-native) |
| 26 | + |
| 27 | +Google ADK emits spans under the `gcp.vertex.agent` OTel scope with proprietary attributes (`gcp.vertex.agent.llm_request`, `gcp.vertex.agent.llm_response`, etc.). agentevals has a dedicated converter that auto-detects this format. No GenAI semconv configuration is needed. |
| 28 | + |
| 29 | +### Format Detection |
| 30 | + |
| 31 | +Format detection is automatic. When a trace contains both ADK and GenAI attributes, ADK takes priority because it provides richer structured data. The detection logic lives in `src/agentevals/converter.py` (`get_extractor()`). |
| 32 | + |
| 33 | +## Message Content Delivery |
| 34 | + |
| 35 | +GenAI message content (`gen_ai.input.messages`, `gen_ai.output.messages`) can arrive through three mechanisms. agentevals supports all of them: |
| 36 | + |
| 37 | +### 1. Span attributes (simplest) |
| 38 | + |
| 39 | +Message content is stored directly as span attributes. This is the most straightforward approach and requires no special handling. |
| 40 | + |
| 41 | +### 2. Log records (recommended for new instrumentation) |
| 42 | + |
| 43 | +Message content is emitted as OTel log records correlated with spans via trace context. This is the pattern used by `opentelemetry-instrumentation-openai-v2` and LangChain's GenAI instrumentation. |
| 44 | + |
| 45 | +Requires both `OTLPSpanExporter` and `OTLPLogExporter` (or their streaming equivalents). Set `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=true` to enable content capture. |
| 46 | + |
| 47 | +### 3. Span events (deprecated, supported for backward compatibility) |
| 48 | + |
| 49 | +Message content is emitted as attributes on span events. agentevals promotes these to span-level attributes during normalization so downstream processing sees a uniform shape. |
| 50 | + |
| 51 | +This promotion happens in three processing layers: |
| 52 | +- `streaming/processor.py` for live WebSocket spans |
| 53 | +- `api/otlp_routes.py` for OTLP HTTP reception |
| 54 | +- `loader/otlp.py` for loading OTLP JSON files |
| 55 | + |
| 56 | +## Span Events Deprecation |
| 57 | + |
| 58 | +The OTel community is [deprecating the Span Event API](https://opentelemetry.io/blog/2026/deprecating-span-events/) (`Span.AddEvent`, `Span.RecordException`) in favor of emitting events as log records via the Logs API. The core idea: "events are logs with names," correlated with traces through context. |
| 59 | + |
| 60 | +### What this means for agentevals users |
| 61 | + |
| 62 | +**No immediate action required.** Existing instrumentation continues to work. The deprecation is about providing a single recommended path for new code, not about removing support for existing span event data. |
| 63 | + |
| 64 | +**For new instrumentation**, prefer the logs-based pattern. Configure both `OTLPSpanExporter` and `OTLPLogExporter`, and use instrumentation libraries that emit message content as log records. |
| 65 | + |
| 66 | +**For existing span-event instrumentation** (e.g. Strands with `OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental`), everything continues to work. When your framework releases a version that migrates to log-based events, update your exporter configuration to include `OTLPLogExporter` and follow the logs-based pattern. |
| 67 | + |
| 68 | +### What this means for agentevals internals |
| 69 | + |
| 70 | +agentevals already supports both content delivery mechanisms. The span event promotion logic will remain for backward compatibility with older instrumentation versions. As frameworks migrate, the log-based path (already fully supported) will become the primary path. |
| 71 | + |
| 72 | +### Migration checklist for framework authors |
| 73 | + |
| 74 | +If you maintain an OTel-instrumented agent framework and want to align with the deprecation: |
| 75 | + |
| 76 | +1. Emit `gen_ai.input.messages` and `gen_ai.output.messages` as log records instead of span events |
| 77 | +2. Correlate logs with spans via trace context (the OTel SDK handles this automatically) |
| 78 | +3. Document that users need both `OTLPSpanExporter` and `OTLPLogExporter` |
| 79 | +4. Consider an opt-in flag (similar to `OTEL_SEMCONV_EXCEPTION_SIGNAL_OPT_IN`) during the transition |
| 80 | + |
| 81 | +## OTLP Receiver |
| 82 | + |
| 83 | +agentevals runs an OTLP HTTP receiver on port 4318 (the standard OTLP HTTP port) that accepts: |
| 84 | + |
| 85 | +| Endpoint | Content Types | |
| 86 | +|----------|--------------| |
| 87 | +| `/v1/traces` | `application/json`, `application/x-protobuf` | |
| 88 | +| `/v1/logs` | `application/json`, `application/x-protobuf` | |
| 89 | + |
| 90 | +Point your standard OTel exporters at `http://localhost:4318` and traces will stream into agentevals automatically. See [examples/README.md](../examples/README.md) for zero-code setup instructions. |
0 commit comments