Description
When using openinference-instrumentation-openai-agents (v1.4.0) with Langfuse, streaming responses from agents do not capture output data, model name, or token usage. Non-streaming/short responses work correctly.
Environment
- openinference-instrumentation-openai-agents: 1.4.0
- opentelemetry-sdk: 1.39.1
- Langfuse (via OpenTelemetry exporter)
- Python 3.13
- OpenAI Agents SDK with Runner.run_streamed()
Steps to Reproduce
- Create an agent that produces a long streaming text response:

```python
import asyncio

from agents import Agent, Runner

agent = Agent(
    name="MyAgent",
    instructions="Write a detailed story...",
    model="gpt-4o",
)

async def main():
    # input_items and context come from the surrounding application
    result = Runner.run_streamed(agent, input_items, context=context)
    async for event in result.stream_events():
        # Process streaming events
        pass

asyncio.run(main())
```

- Check the resulting GENERATION observation in Langfuse
Expected Behavior
The GENERATION observation should contain:
- model: The model name (e.g., "gpt-4o")
- output: The complete response content
- usage: Token counts (input, output, total)
- calculatedTotalCost: Computed cost
Actual Behavior
For streaming responses (long text output, ~7 seconds):
```
Name: response
Type: GENERATION
Model: None
Tokens: input=0, output=0, total=0
Cost: 0
Has Output: False
modelParameters: {}
usageDetails: {}
```
For non-streaming/short responses (structured output, ~1 second):
```
Name: response
Type: GENERATION
Model: gpt-5.1-2025-11-13
Tokens: input=623, output=27, total=650
Cost: 0.00104875
Has Output: True
modelParameters: {full data}
usageDetails: {detailed usage}
```
Analysis
The key difference is response duration/streaming behavior:
- Working: an agent with output_type set (structured output) that completes quickly (~941 ms)
- Broken: an agent streaming a long text narrative (~6919 ms)
Both agents use identical code patterns (Runner.run_streamed() plus async iteration over stream_events()), but the instrumentation only captures data for the fast/structured completions. A sketch of the working configuration follows.
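For reference, here is a minimal sketch of the working structured-output setup, assuming a Pydantic model as the output_type (the schema class and its fields are illustrative, not taken from the original report):

```python
from pydantic import BaseModel

from agents import Agent

# Illustrative schema; the actual structured output used in the
# working comparison is not shown in this report.
class StorySummary(BaseModel):
    title: str
    summary: str

structured_agent = Agent(
    name="StructuredAgent",
    instructions="Summarize the story.",
    model="gpt-4o",
    output_type=StorySummary,  # structured output; this path is captured correctly
)
```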
Related Issues
- #2481: "[bug] Agno OpenInference instrumentation: output.value missing in streaming mode ⇒ 'undefined' in Langfuse"
- #2467: "[BUG] Anthropic streaming wrapper returns wrong object, breaking get_final_message()"
This appears to be part of a broader pattern where streaming mode causes telemetry data loss across multiple integrations.
Workaround
Currently investigating manual capture of usage data from result.raw_responses after streaming completes; a sketch of this approach is shown below.
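A minimal sketch of that idea, assuming result.raw_responses on the RunResultStreaming object is a list of ModelResponse objects whose usage carries input_tokens/output_tokens counts (attribute names should be verified against the installed Agents SDK version), with the totals attached to the current span under OpenInference token-count attribute names:

```python
from opentelemetry import trace

from agents import Runner

async def run_streamed_with_usage(agent, input_items, context):
    result = Runner.run_streamed(agent, input_items, context=context)
    async for event in result.stream_events():
        pass  # normal event handling

    # Assumption: each raw response exposes a .usage object with
    # input_tokens / output_tokens; verify against your SDK version.
    input_tokens = sum(r.usage.input_tokens for r in result.raw_responses if r.usage)
    output_tokens = sum(r.usage.output_tokens for r in result.raw_responses if r.usage)

    # Attach totals to the current span so the exporter can forward them
    # to Langfuse; attribute names follow OpenInference semantic conventions.
    span = trace.get_current_span()
    span.set_attribute("llm.token_count.prompt", input_tokens)
    span.set_attribute("llm.token_count.completion", output_tokens)
    span.set_attribute("llm.token_count.total", input_tokens + output_tokens)
    return result
```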