[bot] BraintrustStream cannot aggregate Cohere v2 chat streaming events

## Summary

`BraintrustStream` and `wrap_stream_with_span` only handle OpenAI Chat Completions streaming chunks (`choices[].delta`). Cohere's v2 Chat API streaming uses a completely different SSE event format (`content-delta`, `message-end`, etc.) with no `choices` field. All Cohere streaming text and usage metrics are silently discarded: each Cohere event parses as an empty `StreamChunk` (all fields default to `None`/`[]`) and no content, tool calls, or token counts are captured.

This is distinct from #49 (non-streaming Cohere usage extraction), which covers the `extract_*_usage` gap for non-streaming responses. This issue is specifically about the `BraintrustStream` aggregation path for streaming responses.

## What is missing

Cohere v2 Chat API streaming (endpoint: `POST /v2/chat`, `stream: true`) emits SSE events identified by a `type` field:

```json
// message-start
{"type": "message-start", "id": "abc123", "delta": {"message": {"role": "assistant", "content": []}}}

// content-start
{"type": "content-start", "index": 0, "delta": {"message": {"content": {"type": "text", "text": ""}}}}

// content-delta (one per token)
{"type": "content-delta", "index": 0, "delta": {"message": {"content": {"type": "text", "text": "Hello"}}}}

// content-end
{"type": "content-end", "index": 0}

// message-end (contains usage)
{"type": "message-end", "delta": {
  "finish_reason": "COMPLETE",
  "usage": {
    "billed_units": {"input_tokens": 5, "output_tokens": 26},
    "tokens": {"input_tokens": 71, "output_tokens": 26}
  }
}}
```

For tool use, Cohere emits additional event types:

```json
// tool-call-start
{"type": "tool-call-start", "index": 1, "delta": {"message": {"tool_calls": {"id": "tool123", "type": "function", "function": {"name": "get_weather", "arguments": ""}}}}}

// tool-call-delta
{"type": "tool-call-delta", "index": 1, "delta": {"message": {"tool_calls": {"function": {"arguments": "{\"location\": \"NYC\"}"}}}}}

// tool-call-end
{"type": "tool-call-end", "index": 1}
```

Key structural differences from OpenAI Chat Completions streaming:

1. **No `choices` field**: Text content is at `delta.message.content.text`, not `choices[].delta.content`
2. **Usage in `message-end.delta.usage`**: Token counts are nested inside `message-end.delta.usage.billed_units` and `message-end.delta.usage.tokens`, not at a root `usage` field
3. **Dual usage objects**: `billed_units` (billable tokens) differs from `tokens` (actual processed tokens, including internal overhead) — a distinction not present in OpenAI or Anthropic formats
4. **Tool calls in `tool-call-start/delta/end`**: Tool arguments are streamed via `delta.message.tool_calls.function.arguments`, which is structurally different from `choices[].delta.tool_calls`
5. **`type` discriminant key**: Events are typed via a top-level `type` field, not wrapped in a discriminant object like Bedrock ConverseStream
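
These differences suggest a separate aggregation path keyed on the event `type`. The following is a minimal sketch of what such a fold could look like, not the SDK's implementation. `CohereEvent`, `CohereUsage`, `ToolCall`, `Aggregated`, and `aggregate_cohere` are hypothetical names, and the events are assumed to be already decoded from JSON (in practice this would probably be a serde enum tagged on `type`). It uses only the standard library.

```rust
use std::collections::BTreeMap;

/// Hypothetical, already-decoded view of the Cohere v2 stream events shown above.
#[derive(Debug, Clone, PartialEq)]
enum CohereEvent {
    ContentDelta { index: usize, text: String },
    ToolCallStart { index: usize, id: String, name: String },
    ToolCallDelta { index: usize, arguments: String },
    MessageEnd { finish_reason: String, usage: CohereUsage },
    /// message-start, content-start, content-end, tool-call-end, unknown types.
    Other,
}

/// Both usage objects from `message-end.delta.usage`, kept separately.
#[derive(Debug, Clone, Copy, PartialEq, Default)]
struct CohereUsage {
    billed_input: u64,
    billed_output: u64,
    input_tokens: u64,
    output_tokens: u64,
}

#[derive(Debug, Clone, PartialEq, Default)]
struct ToolCall {
    id: String,
    name: String,
    arguments: String,
}

#[derive(Debug, Default)]
struct Aggregated {
    /// Concatenated text, keyed by the event's `index`.
    text: BTreeMap<usize, String>,
    /// Tool calls with concatenated argument fragments, keyed by `index`.
    tool_calls: BTreeMap<usize, ToolCall>,
    finish_reason: Option<String>,
    usage: Option<CohereUsage>,
}

/// Folds a sequence of events into final text, tool calls, and usage.
fn aggregate_cohere(events: &[CohereEvent]) -> Aggregated {
    let mut out = Aggregated::default();
    for event in events {
        match event {
            CohereEvent::ContentDelta { index, text } => {
                out.text.entry(*index).or_default().push_str(text);
            }
            CohereEvent::ToolCallStart { index, id, name } => {
                let call = ToolCall {
                    id: id.clone(),
                    name: name.clone(),
                    arguments: String::new(),
                };
                out.tool_calls.insert(*index, call);
            }
            CohereEvent::ToolCallDelta { index, arguments } => {
                out.tool_calls
                    .entry(*index)
                    .or_default()
                    .arguments
                    .push_str(arguments);
            }
            CohereEvent::MessageEnd { finish_reason, usage } => {
                out.finish_reason = Some(finish_reason.clone());
                out.usage = Some(*usage);
            }
            CohereEvent::Other => {}
        }
    }
    out
}
```

Folding the sample events above through `aggregate_cohere` would give the text `Hello` for content index 0, one `get_weather` tool call at index 1 with arguments `{"location": "NYC"}`, finish reason `COMPLETE`, and both usage pairs (5/26 billed, 71/26 processed). Keeping `billed_units` and `tokens` as separate fields is a design choice of this sketch so that neither is lost when mapping to span metrics.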

### Failure mode in current SDK

`StreamChunk` (`src/stream.rs:687-694`) is defined with all `#[serde(default)]` fields:

```rust
struct StreamChunk {
    #[serde(default)]
    model: Option<String>,
    #[serde(default)]
    choices: Vec<StreamChoice>,
    #[serde(default)]
    usage: Option<StreamUsage>,
}
```

Because serde ignores unknown fields by default, `serde_json::from_value` on any Cohere v2 streaming event **succeeds**, producing a `StreamChunk` with `model: None`, `choices: []`, and `usage: None`. The `Err(_) => continue` fallback at `src/stream.rs:856` is never hit. Every chunk is processed without error, but all content and metrics are silently dropped:

- **Text output** from `content-delta.delta.message.content.text` is lost (no `choices` field)
- **Tool call arguments** from `tool-call-delta.delta.message.tool_calls.function.arguments` are lost
- **Usage metrics** from `message-end.delta.usage` are lost (nested under `delta`, not at root `usage`)
- **Billed vs. actual token distinction** (`billed_units` vs. `tokens`) is never captured
- **Finish reason** from `message-end.delta.finish_reason` is lost
- **TTFT (time to first token) metric** is never recorded: `value_has_content()` at `src/stream.rs:1117-1119` checks for a non-empty `choices` array, which is always empty for Cohere events

## Braintrust docs status

**supported**: Braintrust's documentation covers Cohere instrumentation, including streaming behavior ("instruments the native Cohere Python SDK so you can inspect prompts, responses, streaming behavior, embeddings, and rerank calls in Braintrust."). The Python (`wrap_cohere()`) and TypeScript (`wrapCohere()`) SDKs already handle Cohere streaming correctly. The Rust SDK has no equivalent.

- https://www.braintrust.dev/docs/integrations/ai-providers/cohere — "streaming behavior" explicitly listed as instrumented capability
- https://www.braintrust.dev/docs/instrument/trace-llm-calls — lists Cohere as a supported provider

## Upstream sources

- Cohere v2 Chat Streaming API reference (`content-delta`, `message-end`, tool-call events): https://docs.cohere.com/v2/reference/chat-stream
- Cohere v2 Streaming guide (event types and JSON structure): https://docs.cohere.com/v2/docs/streaming
- Cohere tool-use streaming (tool-call-start/delta/end events): https://docs.cohere.com/docs/tool-use-streaming

## Relationship to existing issues

- **Distinct from #49** (Cohere usage extraction): #49 covers adding `extract_cohere_usage()` for non-streaming Cohere responses where usage is in `response.usage.billed_units`. This issue covers the streaming path (`BraintrustStream`, `wrap_stream_with_span`) where usage is in `message-end.delta.usage.billed_units` and text content comes from `content-delta` events — a completely different streaming schema.
- **Analogous to #64** (Bedrock ConverseStream streaming), **#60** (Gemini streaming), and **#38/#62** (Anthropic streaming): Those issues cover the same class of failure (provider streaming format incompatible with `BraintrustStream`) for different providers.

## Local files inspected

- `src/stream.rs:687-694` — `StreamChunk` struct has only `model`, `choices`, `usage` with `#[serde(default)]`; Cohere events have a `type` field and nested `delta` structure, none of which match
- `src/stream.rs:840-857` — `aggregate()` calls `serde_json::from_value`; Cohere events silently deserialize to empty `StreamChunk` objects without hitting the `Err(_) => continue` fallback
- `src/stream.rs:1117-1119` — `value_has_content()` checks `choices` array; always empty for Cohere events, so TTFT is never recorded
- `src/extractors.rs` — `extract_openai_usage()` looks for root `usage.prompt_tokens`; `extract_anthropic_usage()` looks for root `usage.input_tokens`; neither handles Cohere's `message-end.delta.usage.billed_units` / `tokens`
- `src/lib.rs` — public API exports; no Cohere references
- Full codebase grep for `cohere`, `billed_units`, `content-delta`, `message-end`, `tool-call-start` — zero results
