[bot] BraintrustStream cannot parse Anthropic SSE streaming format #62

@braintrust-bot

Summary

The Braintrust Rust SDK instruments Anthropic's non-streaming responses via extract_anthropic_usage() but has no support for Anthropic's streaming Messages API. Anthropic uses a different SSE event format (message_start / content_block_delta / message_delta) that BraintrustStream cannot interpret: the stream aggregator silently discards the content of every Anthropic streaming chunk and returns empty output with no usage metrics.

What is missing

Anthropic's streaming API emits the following SSE event types:

event: message_start
data: {"type": "message_start", "message": {"id": "msg_...", "type": "message", "role": "assistant", "content": [], "model": "claude-opus-4-8", "stop_reason": null, "usage": {"input_tokens": 25, "output_tokens": 1}}}

event: content_block_start
data: {"type": "content_block_start", "index": 0, "content_block": {"type": "text", "text": ""}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello"}}

event: message_delta
data: {"type": "message_delta", "delta": {"stop_reason": "end_turn", "stop_sequence": null}, "usage": {"output_tokens": 15}}

event: message_stop
data: {"type": "message_stop"}
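To make the wire format concrete, here is a minimal std-only sketch that splits a raw SSE body like the one above into (event, data) frames. The names SseFrame and parse_sse are hypothetical and are not part of the Braintrust SDK; the JSON payload is left as a raw string for a later decoding step.

```rust
/// One parsed SSE frame: the `event:` name and the raw `data:` payload.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
struct SseFrame {
    event: String,
    data: String,
}

/// Splits a raw SSE body into frames. A blank line ends a frame; several
/// `data:` lines inside one frame are joined with '\n'.
fn parse_sse(body: &str) -> Vec<SseFrame> {
    let mut frames = Vec::new();
    let mut event = String::new();
    let mut data: Vec<&str> = Vec::new();

    // Chain one empty line so the last frame is flushed even when the body
    // does not end with a blank line.
    for line in body.lines().chain(std::iter::once("")) {
        if line.is_empty() {
            if !data.is_empty() {
                frames.push(SseFrame {
                    event: std::mem::take(&mut event),
                    data: data.join("\n"),
                });
                data.clear();
            } else {
                // A frame without data is not dispatched; reset the event name.
                event.clear();
            }
            continue;
        }
        if let Some(rest) = line.strip_prefix("event:") {
            event = rest.trim_start().to_string();
        } else if let Some(rest) = line.strip_prefix("data:") {
            // A single leading space after the colon is not part of the value.
            data.push(rest.strip_prefix(' ').unwrap_or(rest));
        }
        // Comment lines (starting with ':') and unknown fields are ignored.
    }
    frames
}
```

Feeding the five events listed above through parse_sse would yield five frames whose event names match the type field inside each data payload.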

BraintrustStream (in src/stream.rs) only handles the OpenAI Chat Completions streaming format, which uses a choices array:

{"model": "gpt-4", "choices": [{"delta": {"content": "Hello"}, "finish_reason": null}], "usage": null}

The mismatch causes two concrete failures:

  1. Content extraction fails silently: BraintrustStream::aggregate() at src/stream.rs:852–854 calls serde_json::from_value(raw.clone()) and, on Err(_), executes continue, skipping the chunk. Anthropic's content_block_delta chunks do not even reach that branch: they deserialize successfully as a StreamChunk with an empty choices array (because choices is #[serde(default)]), so no content is extracted. Either way, all generated text is lost.

  2. Usage metrics are not captured: input_tokens comes from message_start.message.usage.input_tokens — nested two levels deep. output_tokens comes from message_delta.usage.output_tokens. BraintrustStream only reads chunk.usage at the top level of a StreamChunk, which is None for all Anthropic events. The result is usage: None in the finalized stream regardless of how many tokens were consumed.
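The two failures above suggest what an Anthropic-aware aggregation step would need to do. The following is a rough sketch, not the SDK's implementation: it assumes the JSON has already been decoded into a hypothetical AnthropicEvent enum, and the Aggregated struct and aggregate_anthropic function are likewise invented for illustration. It relies on Anthropic's documented behaviour that the usage counts in message_delta are cumulative.

```rust
/// Hypothetical typed view of the Anthropic events shown earlier.
/// Only the fields the aggregator needs are modelled.
#[derive(Debug, Clone)]
enum AnthropicEvent {
    /// `message_start`: usage is nested under `message.usage`.
    MessageStart { input_tokens: u64, output_tokens: u64 },
    /// `content_block_delta` with `delta.type == "text_delta"`.
    TextDelta { text: String },
    /// `message_delta`: usage sits at the top level of the event.
    MessageDelta { stop_reason: Option<String>, output_tokens: u64 },
    /// `content_block_start`, `content_block_stop`, `message_stop`, `ping`, ...
    Other,
}

/// Hypothetical result type; the real finalized stream may look different.
#[derive(Debug, Default, PartialEq)]
struct Aggregated {
    text: String,
    input_tokens: Option<u64>,
    output_tokens: Option<u64>,
    stop_reason: Option<String>,
}

fn aggregate_anthropic(events: &[AnthropicEvent]) -> Aggregated {
    let mut out = Aggregated::default();
    for ev in events {
        match ev {
            AnthropicEvent::MessageStart { input_tokens, output_tokens } => {
                out.input_tokens = Some(*input_tokens);
                out.output_tokens = Some(*output_tokens);
            }
            // Text fragments are appended in arrival order.
            AnthropicEvent::TextDelta { text } => out.text.push_str(text),
            AnthropicEvent::MessageDelta { stop_reason, output_tokens } => {
                // The count is cumulative, so overwrite rather than add.
                out.output_tokens = Some(*output_tokens);
                if stop_reason.is_some() {
                    out.stop_reason = stop_reason.clone();
                }
            }
            AnthropicEvent::Other => {}
        }
    }
    out
}
```

For the example stream shown earlier, this sketch would report the text "Hello", 25 input tokens, and 15 output tokens.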

Anthropic also streams tool-use arguments differently: content_block_delta events carry a delta of type input_json_delta with a partial_json field, instead of OpenAI's tool_calls[].function.arguments fragments. These are also silently dropped.
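As an illustration of how tool-use fragments could be collected, the sketch below keeps one string buffer per content block index and concatenates partial_json fragments as they arrive. ToolInputBuffers and its methods are hypothetical names, not existing SDK API; parsing the concatenated text as JSON is left to the caller once the block has ended.

```rust
use std::collections::BTreeMap;

/// Hypothetical accumulator: one argument buffer per content block index.
#[derive(Debug, Default)]
struct ToolInputBuffers {
    by_index: BTreeMap<usize, String>,
}

impl ToolInputBuffers {
    /// Appends one `partial_json` fragment from an `input_json_delta` delta.
    fn push(&mut self, index: usize, partial_json: &str) {
        self.by_index.entry(index).or_default().push_str(partial_json);
    }

    /// Returns the concatenated JSON text for a block, if any fragment was seen.
    /// The fragments are only guaranteed to form valid JSON once the block ends.
    fn finish(&self, index: usize) -> Option<&str> {
        self.by_index.get(&index).map(String::as_str)
    }
}
```

Keying by index keeps fragments from separate tool_use blocks apart when a single response contains more than one tool call.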

The SDK already exposes extract_anthropic_usage() (non-streaming) and extract_openai_usage(), and wrap_stream_with_span() wraps any Stream<Item = Result<Value, E>>. But there is no way to use wrap_stream_with_span() with an Anthropic streaming response and get correct output or usage data.

Braintrust docs status

Supported in other SDKs: Braintrust documents Anthropic as a fully supported provider, including streaming.

The Python SDK's wrap_anthropic() and TypeScript SDK's wrapAnthropic() both handle Anthropic streaming (SSE events) correctly in those languages. The Rust SDK has no equivalent.

Upstream sources

Local files inspected

  • src/stream.rs:686–694: StreamChunk struct has choices: Vec<StreamChoice> and top-level usage: Option<StreamUsage>; no fields for type, message, delta.type, delta.text
  • src/stream.rs:738–762: StreamUsage aliases input_tokens/output_tokens but only at chunk.usage level, not at chunk.message.usage (message_start) or chunk.usage on message_delta
  • src/stream.rs:852–854: Err(_) => continue silently skips unparseable chunks; Anthropic chunks that do partially parse as StreamChunk produce empty choices arrays so content is still lost
  • src/extractors.rs:92–188: extract_anthropic_usage() correctly handles non-streaming Anthropic response format but is not wired to streaming at all
  • src/lib.rs: public API exports; no Anthropic-streaming-specific types or helpers
  • Full codebase grep for message_start, content_block_delta, text_delta, message_delta: zero results
