Summary
BraintrustStream and wrap_stream_with_span only handle OpenAI Chat Completions streaming chunks (choices[].delta). AWS Bedrock's ConverseStream API emits a completely different event structure (contentBlockDelta, messageStop, metadata) with no choices field. All Bedrock streaming content and usage metrics are silently discarded: Bedrock events parse as empty StreamChunk objects (all fields default to None/[]) and no content, tool use, or token counts are captured.
This is distinct from #52 (non-streaming Bedrock Converse usage extraction), which covers the extract_*_usage gap for non-streaming responses. This issue is specifically about the BraintrustStream aggregation path for streaming responses.
What is missing
AWS Bedrock ConverseStream emits a sequence of typed events, each wrapped in a discriminant key:
// messageStart
{"messageStart": {"role": "assistant"}}
// contentBlockStart (text)
{"contentBlockStart": {"contentBlockIndex": 0, "start": {}}}
// contentBlockDelta (text)
{"contentBlockDelta": {"contentBlockIndex": 0, "delta": {"text": "Hello, I can help"}}}
// contentBlockStop
{"contentBlockStop": {"contentBlockIndex": 0}}
// contentBlockStart (tool use)
{"contentBlockStart": {"contentBlockIndex": 1, "start": {"toolUse": {"toolUseId": "tool123", "name": "get_weather"}}}}
// contentBlockDelta (tool use arguments)
{"contentBlockDelta": {"contentBlockIndex": 1, "delta": {"toolUse": {"input": "{\"location\": \"NYC\"}"}}}}
// messageStop
{"messageStop": {"stopReason": "end_turn"}}
// metadata (usage + latency)
{"metadata": {
  "usage": {"inputTokens": 30, "outputTokens": 50, "totalTokens": 80},
  "metrics": {"latencyMs": 1275}
}}
Key structural differences from OpenAI Chat Completions streaming format:
- No choices field: Text content lives in contentBlockDelta.delta.text, not in choices[].delta.content
- Token usage in nested metadata: Usage is at metadata.usage.inputTokens/outputTokens/totalTokens (camelCase), not at the root usage.prompt_tokens/completion_tokens
- Tool use in contentBlockStart/contentBlockDelta: Tool invocations use contentBlockStart.start.toolUse and contentBlockDelta.delta.toolUse, not choices[].delta.tool_calls
- Stop reason in messageStop: Finish reason is at messageStop.stopReason, not choices[].finish_reason
- Latency in metadata.metrics: Server-side latency (latencyMs) is available, which is not present in any other provider's streaming format
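To make the differences above concrete, here is a minimal sketch of what a Bedrock aggregation pass could look like. It is not the SDK's implementation: `BedrockEvent`, `Usage`, `ToolCall`, `Aggregated`, and `aggregate_bedrock` are hypothetical names, and the enum assumes the JSON events have already been decoded into typed values.

```rust
use std::collections::BTreeMap;

/// Hypothetical, already-decoded view of the ConverseStream events listed above.
/// None of these names exist in the SDK; they are for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum BedrockEvent {
    MessageStart { role: String },
    ToolUseStart { index: usize, tool_use_id: String, name: String },
    TextDelta { index: usize, text: String },
    ToolUseDelta { index: usize, input: String },
    ContentBlockStop { index: usize },
    MessageStop { stop_reason: String },
    Metadata { usage: Usage, latency_ms: u64 },
}

/// Token counts as reported under metadata.usage.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Usage {
    pub input_tokens: u64,
    pub output_tokens: u64,
    pub total_tokens: u64,
}

#[derive(Debug, Default, PartialEq)]
pub struct ToolCall {
    pub id: String,
    pub name: String,
    pub arguments: String,
}

#[derive(Debug, Default, PartialEq)]
pub struct Aggregated {
    pub text: String,
    pub tool_calls: Vec<ToolCall>,
    pub stop_reason: Option<String>,
    pub usage: Option<Usage>,
    pub latency_ms: Option<u64>,
}

/// Folds a sequence of events into one aggregated result.
pub fn aggregate_bedrock(events: &[BedrockEvent]) -> Aggregated {
    let mut out = Aggregated::default();
    // Tool calls are keyed by contentBlockIndex so argument fragments
    // are appended to the block they belong to.
    let mut tools: BTreeMap<usize, ToolCall> = BTreeMap::new();
    for event in events {
        match event {
            BedrockEvent::TextDelta { text, .. } => out.text.push_str(text),
            BedrockEvent::ToolUseStart { index, tool_use_id, name } => {
                tools.insert(*index, ToolCall {
                    id: tool_use_id.clone(),
                    name: name.clone(),
                    arguments: String::new(),
                });
            }
            BedrockEvent::ToolUseDelta { index, input } => {
                if let Some(call) = tools.get_mut(index) {
                    call.arguments.push_str(input);
                }
            }
            BedrockEvent::MessageStop { stop_reason } => {
                out.stop_reason = Some(stop_reason.clone());
            }
            BedrockEvent::Metadata { usage, latency_ms } => {
                out.usage = Some(*usage);
                out.latency_ms = Some(*latency_ms);
            }
            // These events carry nothing this sketch aggregates.
            BedrockEvent::MessageStart { .. } | BedrockEvent::ContentBlockStop { .. } => {}
        }
    }
    out.tool_calls = tools.into_values().collect();
    out
}
```

Keying tool calls by block index mirrors how the events reference contentBlockIndex. Whether the real aggregation should work on typed events or on raw serde_json values is a design choice this sketch does not settle.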
Failure mode in current SDK
StreamChunk (src/stream.rs:687-694) is defined with #[serde(default)] on every field:
struct StreamChunk {
    #[serde(default)]
    model: Option<String>,
    #[serde(default)]
    choices: Vec<StreamChoice>,
    #[serde(default)]
    usage: Option<StreamUsage>,
}
Because serde ignores unknown fields by default (no #[serde(deny_unknown_fields)]), serde_json::from_value on any Bedrock ConverseStream event succeeds — but produces a StreamChunk with model: None, choices: [], and usage: None. The Err(_) => continue fallback at line 856 is never hit, but every chunk is effectively empty. This means:
- Text output from contentBlockDelta.delta.text is lost
- Tool use name and arguments from contentBlockStart/contentBlockDelta.delta.toolUse are lost
- Usage metrics (inputTokens, outputTokens, totalTokens) from metadata.usage are never extracted (they're nested under metadata, not at the root usage key)
- Stop reason from messageStop.stopReason is lost
- TTFT metric is not recorded (value_has_content() at src/stream.rs:1117-1119 checks for non-empty choices array, which is always empty for Bedrock events)
- Server-side latency from metadata.metrics.latencyMs is never captured
Braintrust docs status
supported — Braintrust explicitly documents AWS Bedrock tracing: "Converse, ConverseStream, and InvokeModel calls are traced." Other Braintrust SDKs (Go) already provide Bedrock Runtime middleware that captures token usage including cache tokens. The Rust SDK has no equivalent streaming support.
Local files inspected
- src/stream.rs:687-694: StreamChunk struct has only model, choices, usage with #[serde(default)]; Bedrock events have none of these keys at root level, so all parse as empty structs
- src/stream.rs:840-857: aggregate() calls serde_json::from_value; Bedrock events silently deserialize to empty StreamChunk objects without hitting the Err(_) => continue fallback
- src/stream.rs:1117-1119: value_has_content() checks choices array; always empty for Bedrock events, so TTFT is never recorded
- src/extractors.rs: extract_openai_usage() looks for root usage.prompt_tokens; extract_anthropic_usage() looks for root usage.input_tokens; neither matches Bedrock's metadata.usage.inputTokens
- src/lib.rs: public API exports; no Bedrock references
- Full codebase grep for bedrock, ConverseStream, contentBlockDelta, messageStart, inputTokens: zero results
Upstream sources
- ConverseStream events (messageStart, contentBlockStart, contentBlockDelta, contentBlockStop, messageStop, metadata): https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ConverseStream.html
- ConverseStreamMetadataEvent (usage/metrics fields): https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ConverseStreamMetadataEvent.html
- ContentBlockDeltaEvent (text delta and toolUse delta): https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ContentBlockDeltaEvent.html
Relationship to existing issues
- #52 covers extract_openai_usage/extract_anthropic_usage parity for non-streaming Bedrock responses, where usage.inputTokens is at the root level. This issue covers the streaming path (BraintrustStream, wrap_stream_with_span) where usage lives in a metadata event and content comes from contentBlockDelta events, a completely different streaming schema.
- […] BraintrustStream) for different providers.