[bot] BraintrustStream cannot aggregate Google Gemini streamGenerateContent events #60

@braintrust-bot

Summary

BraintrustStream and wrap_stream_with_span only handle OpenAI Chat Completions streaming chunks (choices[].delta). Google Gemini's streamGenerateContent endpoint emits GenerateContentResponse objects with a completely different structure (candidates[].content.parts[]). All Gemini streaming chunks are silently discarded by the Err(_) => continue fallback in aggregate(), producing an empty aggregated result with no output, no usage metrics, and no TTFT.

This is distinct from #34, which covers usageMetadata extraction for non-streaming Gemini responses. This issue concerns only the streaming aggregation path.

What is missing

Gemini's streamGenerateContent yields a sequence of GenerateContentResponse JSON objects. Each chunk has this structure:

{
  "candidates": [{
    "content": {
      "parts": [{ "text": "Hello" }],
      "role": "model"
    },
    "finishReason": "STOP"
  }]
}

The final chunk includes usageMetadata:

{
  "candidates": [...],
  "usageMetadata": {
    "promptTokenCount": 10,
    "candidatesTokenCount": 25,
    "totalTokenCount": 35,
    "thoughtsTokenCount": 12,
    "cachedContentTokenCount": 5
  }
}
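
A minimal sketch of how the usageMetadata fields above might be mapped to span metrics. Everything named here is an assumption for illustration: GeminiUsageMetadata and gemini_usage_to_metrics are hypothetical and do not exist in the SDK, the struct stands in for an already-deserialized usageMetadata object, and the metric keys (prompt_tokens, completion_tokens, tokens, completion_reasoning_tokens, prompt_cached_tokens) are assumed names that should be checked against what the other extractors emit.

```rust
/// Hypothetical, already-deserialized view of Gemini's `usageMetadata`.
/// Every field is optional because a chunk may carry only some of them.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct GeminiUsageMetadata {
    pub prompt_token_count: Option<u64>,
    pub candidates_token_count: Option<u64>,
    pub total_token_count: Option<u64>,
    pub thoughts_token_count: Option<u64>,
    pub cached_content_token_count: Option<u64>,
}

/// Maps Gemini usage fields to (metric name, value) pairs.
/// The metric names are assumptions, not confirmed SDK constants.
/// Fields that are absent are omitted rather than reported as zero.
pub fn gemini_usage_to_metrics(usage: &GeminiUsageMetadata) -> Vec<(&'static str, u64)> {
    let fields = vec![
        ("prompt_tokens", usage.prompt_token_count),
        ("completion_tokens", usage.candidates_token_count),
        ("tokens", usage.total_token_count),
        ("completion_reasoning_tokens", usage.thoughts_token_count),
        ("prompt_cached_tokens", usage.cached_content_token_count),
    ];
    fields
        .into_iter()
        .filter_map(|(name, value)| value.map(|v| (name, v)))
        .collect()
}
```

Omitting absent fields keeps a partial usageMetadata object from overwriting real counts with zeros if usage from several chunks is merged later.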

For tool/function calling, parts contain a functionCall object instead of text:

{
  "candidates": [{
    "content": {
      "parts": [{ "functionCall": { "name": "get_weather", "args": { "location": "NYC" } } }],
      "role": "model"
    }
  }]
}

For Gemini 2.5 thinking models, parts include a thought field:

{ "parts": [{ "text": "Let me think...", "thought": true }] }

Currently in this SDK, BraintrustStream::aggregate() (src/stream.rs) attempts to deserialize each raw chunk as StreamChunk { model, choices, usage }. Gemini chunks have no choices field, so serde_json::from_value fails and the chunk is skipped. This means:

  • Text output from candidates[].content.parts[].text is lost
  • Function call output from candidates[].content.parts[].functionCall is lost
  • Thinking content from Gemini 2.5 thought parts is lost
  • Usage metrics (promptTokenCount, candidatesTokenCount, thoughtsTokenCount, cachedContentTokenCount) are never extracted
  • Finish reason from candidates[].finishReason is lost
  • TTFT metric is not recorded (the content heuristic value_has_content() checks for choices which Gemini chunks don't have)
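
The losses listed above could be addressed by a Gemini-aware aggregation step along the following lines. This is a sketch under stated assumptions, not the SDK's implementation: all types and the aggregate_gemini_chunks function are hypothetical, the chunks are assumed to be already deserialized (function call args are kept as a raw JSON string), only the first candidate is aggregated, and a thought part is assumed to count as content for TTFT purposes.

```rust
/// Hypothetical, already-deserialized view of one entry in `parts[]`.
#[derive(Debug, Clone, PartialEq)]
pub enum GeminiPart {
    /// A text part; `thought` is true for thinking-model reasoning text.
    Text { text: String, thought: bool },
    /// A function call; `args_json` holds the raw JSON of `args`.
    FunctionCall { name: String, args_json: String },
}

#[derive(Debug, Clone, Default, PartialEq)]
pub struct GeminiCandidate {
    pub parts: Vec<GeminiPart>,
    pub finish_reason: Option<String>,
}

#[derive(Debug, Clone, Default, PartialEq)]
pub struct GeminiChunk {
    pub candidates: Vec<GeminiCandidate>,
}

#[derive(Debug, Default, PartialEq)]
pub struct GeminiAggregate {
    pub text: String,
    pub thoughts: String,
    pub function_calls: Vec<(String, String)>,
    pub finish_reason: Option<String>,
    /// Index of the first chunk that carried content (basis for TTFT).
    pub first_content_chunk: Option<usize>,
}

pub fn aggregate_gemini_chunks(chunks: &[GeminiChunk]) -> GeminiAggregate {
    let mut out = GeminiAggregate::default();
    for (index, chunk) in chunks.iter().enumerate() {
        // Sketch limitation: only the first candidate is aggregated.
        let candidate = match chunk.candidates.first() {
            Some(candidate) => candidate,
            None => continue,
        };
        let mut has_content = false;
        for part in &candidate.parts {
            match part {
                GeminiPart::Text { text, thought } => {
                    // Assumption: non-empty thought text also counts as content.
                    has_content |= !text.is_empty();
                    if *thought {
                        out.thoughts.push_str(text);
                    } else {
                        out.text.push_str(text);
                    }
                }
                GeminiPart::FunctionCall { name, args_json } => {
                    has_content = true;
                    out.function_calls.push((name.clone(), args_json.clone()));
                }
            }
        }
        if has_content && out.first_content_chunk.is_none() {
            out.first_content_chunk = Some(index);
        }
        // The last finish reason seen wins.
        if let Some(reason) = &candidate.finish_reason {
            out.finish_reason = Some(reason.clone());
        }
    }
    out
}
```

Keeping thought text in a separate buffer avoids mixing reasoning content into the user-visible output, while first_content_chunk gives the stream wrapper a format-specific signal for when to record TTFT.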

Braintrust docs status

Supported in other language SDKs. Braintrust documents full Gemini streaming support, including token metrics, thinking model support, and function call tracing.

Status for the Rust SDK: not instrumented

Upstream sources

Relationship to existing issues

Local files inspected

  • src/stream.rs: StreamChunk struct only has model, choices, usage; Gemini chunks have candidates, not choices; aggregate() skips all non-parseable chunks via Err(_) => continue; value_has_content() checks for a choices array, which Gemini chunks lack
  • src/extractors.rs: extract_openai_usage() and extract_anthropic_usage() both call value.get("usage"); Gemini uses usageMetadata, so neither can extract Gemini stream usage
  • src/lib.rs: wrap_stream_with_span is the primary streaming instrumentation surface; no Gemini-specific path
  • Cargo.toml: no Google AI / Vertex AI dependencies
  • Full codebase grep for gemini, google, genai, usageMetadata, candidatesTokenCount, candidates: zero results
