Description
Which component is this feature for?
OpenAI Instrumentation
🔖 Feature description
Add event emission support to the OpenAI Agents instrumentation following OpenTelemetry GenAI semantic conventions. This enables the instrumentation to emit structured events (gen_ai.user.message, gen_ai.assistant.message, gen_ai.choice, gen_ai.tool.start, gen_ai.tool.end) instead of (or in addition to) storing prompts and completions as span attributes.
The implementation should:
- Emit events that comply with OpenTelemetry GenAI semantic conventions
- Support both legacy mode (span attributes) and event mode via a configuration flag
- Respect the TRACELOOP_TRACE_CONTENT setting for content redaction
- Handle all agent interaction types: user messages, assistant responses, choices, and tool calls
- Maintain backward compatibility with existing implementations
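As a rough illustration of the naming and redaction requirements above, here is a minimal sketch of how an event payload could be built. The helper names `content_tracing_enabled` and `build_message_event` are hypothetical, the dict layout is only a stand-in for a real OpenTelemetry event, and the assumption that content is traced unless TRACELOOP_TRACE_CONTENT is set to something other than "true" should be checked against the existing instrumentations.

```python
import os


def content_tracing_enabled() -> bool:
    # Assumption: content is traced by default and disabled only when
    # TRACELOOP_TRACE_CONTENT is set to a value other than "true".
    return (os.getenv("TRACELOOP_TRACE_CONTENT") or "true").lower() == "true"


def build_message_event(role: str, content: str) -> dict:
    """Build a hypothetical payload for a gen_ai.{role}.message event."""
    # When content tracing is off, the event is still emitted,
    # but its body carries no message content.
    body = {"content": content} if content_tracing_enabled() else {}
    return {"name": f"gen_ai.{role}.message", "body": body}
```

With content tracing disabled, the event name stays the same and only the body is emptied, so downstream tooling could still count messages without seeing their text.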
🎤 Why is this feature needed ?
- Semantic Convention Compliance: The OpenTelemetry GenAI semantic conventions recommend using events for capturing prompts, completions, and tool interactions. This provides a more structured and standardized way to observe LLM interactions.
- Consistency Across Instrumentations: Other instrumentations in the OpenLLMetry project (OpenAI, LangChain, LlamaIndex, etc.) already support event emission. Adding this to OpenAI Agents instrumentation ensures feature parity and consistent observability patterns across the ecosystem.
- Better Observability: Events provide a more granular view of agent interactions, allowing for better analysis of:
  - Message flow between user and assistant
  - Tool usage patterns
  - Choice generation and reasoning steps
  - Agent decision-making processes
- Future-Proofing: As the OpenTelemetry specification evolves, event-based tracing is becoming the recommended approach. Supporting both modes allows users to migrate gradually while maintaining backward compatibility.
- Enhanced Tooling Support: Event-based data structures are better suited for modern observability tools and dashboards that can parse and visualize structured event data more effectively than span attributes.
✌️ How do you aim to achieve this?
- Create Event Models: Define dataclass models for different event types:
  - MessageEvent: For user/assistant/system messages
  - ChoiceEvent: For model completion choices
  - ToolStartEvent and ToolEndEvent: For tool execution lifecycle
- Implement Event Emitter: Create an event_emitter.py module that:
  - Emits events following the semantic convention naming (gen_ai.{role}.message, gen_ai.choice, etc.)
  - Handles content redaction based on the TRACELOOP_TRACE_CONTENT setting
  - Properly formats tool calls and function arguments
  - Removes redundant attributes per semantic convention rules
- Add Configuration Support:
  - Add use_legacy_attributes flag to the instrumentor (default: True for backward compatibility)
  - Add event_logger to the Config class
  - Initialize event logger when use_legacy_attributes=False
- Integrate with Hooks: Modify the _hooks.py file to:
  - Check the should_emit_events() utility function
  - Emit appropriate events at key interaction points (message creation, choice generation, tool execution)
  - Maintain existing span attribute behavior when in legacy mode
- Add Utility Functions: Extend utils.py with:
  - should_emit_events(): Checks if events should be emitted based on config
  - Ensure proper event logger validation
- Comprehensive Testing: Add test suite covering:
  - Legacy mode (span attributes only, no events)
  - Event mode with content enabled
  - Event mode with content disabled
  - Tool call event emission
  - Proper event structure validation
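The plan above could look roughly like the sketch below. It is only an outline under stated assumptions: the dataclass fields, the `ListLogger` stand-in (used here instead of a real OpenTelemetry event logger), the `emit_event` helper, and the dict-shaped records are all hypothetical, and dropping `role` from the body is just one possible reading of "redundant attributes". Content redaction is left out to keep the sketch short.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class MessageEvent:
    role: str
    content: Any


@dataclass
class ChoiceEvent:
    index: int
    message: dict
    finish_reason: str = "unknown"


@dataclass
class ToolStartEvent:
    tool_name: str
    arguments: Optional[dict] = None


@dataclass
class ToolEndEvent:
    tool_name: str
    result: Any = None


class Config:
    # Legacy mode stays the default for backward compatibility.
    use_legacy_attributes: bool = True
    event_logger: Any = None


class ListLogger:
    """Stand-in logger that collects records in memory (hypothetical)."""

    def __init__(self) -> None:
        self.records: list = []

    def emit(self, record: dict) -> None:
        self.records.append(record)


def should_emit_events() -> bool:
    # Events are emitted only in event mode and only with a logger configured.
    return not Config.use_legacy_attributes and Config.event_logger is not None


def event_name(event: Any) -> str:
    if isinstance(event, MessageEvent):
        return f"gen_ai.{event.role}.message"
    if isinstance(event, ChoiceEvent):
        return "gen_ai.choice"
    if isinstance(event, ToolStartEvent):
        return "gen_ai.tool.start"
    if isinstance(event, ToolEndEvent):
        return "gen_ai.tool.end"
    raise TypeError(f"Unsupported event type: {type(event).__name__}")


def emit_event(event: Any) -> None:
    if not should_emit_events():
        return  # legacy mode: span attributes only, no events
    body = asdict(event)
    # Assumption: the role is already encoded in the message event name.
    body.pop("role", None)
    Config.event_logger.emit({"name": event_name(event), "body": body})
```

A test for legacy mode would then assert that nothing reaches the logger while `Config.use_legacy_attributes` is `True`, and a test for event mode would assert on the names and bodies of the collected records.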
🔄️ Additional Information
No response
👀 Have you spent some time to check if this feature request has been raised before?
- I checked and didn't find similar issue
Are you willing to submit PR?
Yes I am willing to submit a PR!