[BidiGenerateContent] Premature turnComplete — model sends turnComplete mid-sentence, cutting off audio #1227

@Hprg

Description

Environment

  • Model: gemini-3.1-flash-live-preview
  • Voice: Leda
  • Transport: Raw WebSocket v1beta wss://generativelanguage.googleapis.com/ws/.../BidiGenerateContent
  • Audio pipeline: Twilio MediaStreams (mulaw 8kHz) → PCM16 16kHz → Gemini → PCM16 24kHz → mulaw 8kHz
  • Platform: Node.js (ws library, raw WebSocket, no SDK)
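For readers reproducing the inbound half of this pipeline, here is a minimal sketch of the mulaw 8kHz → PCM16 16kHz step. It assumes Twilio's base64-encoded mu-law payloads and uses the standard G.711 decode formula. The function names are hypothetical, and the linear-interpolation upsampler is a simplification rather than what our production code necessarily does.

```javascript
// G.711 mu-law byte -> 16-bit linear PCM sample (standard decode formula).
function muLawToPcm16(byte) {
  const u = ~byte & 0xff;                 // mu-law bytes are stored bit-inverted
  const exponent = (u >> 4) & 0x07;
  const mantissa = u & 0x0f;
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return (u & 0x80) ? -magnitude : magnitude;
}

// Decode a base64 mu-law payload (8 kHz) and upsample to 16 kHz by linear
// interpolation. Hypothetical helper: a production resampler would normally
// use a proper low-pass filter instead.
function twilioPayloadToPcm16k(base64Payload) {
  const mulaw = Buffer.from(base64Payload, "base64");
  const out = new Int16Array(mulaw.length * 2);
  for (let i = 0; i < mulaw.length; i++) {
    const cur = muLawToPcm16(mulaw[i]);
    const next = i + 1 < mulaw.length ? muLawToPcm16(mulaw[i + 1]) : cur;
    out[2 * i] = cur;                     // original sample
    out[2 * i + 1] = (cur + next) >> 1;   // midpoint between neighbours
  }
  return out;
}
```

The resulting Int16Array can be base64-encoded and sent to Gemini as 16 kHz PCM input.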

Bug Description

The model sends serverContent.turnComplete: true while it is still generating a response, so audio output stops mid-sentence. The model has clearly not finished speaking (the sentence is grammatically incomplete and the thought is unfinished), yet turnComplete fires and audio generation stops.

This is different from interruption (serverContent.interrupted) — no caller speech triggers this. The model spontaneously decides the turn is complete when it clearly isn't.
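To make the distinction concrete, here is a rough sketch of how a raw WebSocket handler might separate the two signals and record what had been generated when the turn ended. The helper names are hypothetical. It also assumes output transcription is enabled in the setup message, so that serverContent.outputTranscription.text is populated.

```javascript
// Classify one parsed server message. "interrupted" is checked first so an
// interruption is never mistaken for a normal end of turn.
function classifyServerMessage(msg) {
  const sc = msg && msg.serverContent;
  if (!sc) return "other";                // e.g. setupComplete, toolCall
  if (sc.interrupted) return "interrupted";
  if (sc.turnComplete) return "turnComplete";
  if (sc.modelTurn) return "modelTurn";
  return "other";
}

// Hypothetical per-call tracker: accumulates transcript text and audio chunk
// counts, then emits a summary whenever a turn ends.
function createTurnTracker(log = console.log) {
  let transcript = "";
  let audioChunks = 0;
  return function onMessage(raw) {
    const msg = JSON.parse(raw.toString());
    const sc = msg.serverContent || {};
    if (sc.outputTranscription && sc.outputTranscription.text) {
      transcript += sc.outputTranscription.text;
    }
    const parts = (sc.modelTurn && sc.modelTurn.parts) || [];
    audioChunks += parts.filter((p) => p.inlineData).length;

    const kind = classifyServerMessage(msg);
    if (kind !== "turnComplete" && kind !== "interrupted") return null;
    const summary = { kind, transcript, audioChunks };
    log(summary);
    transcript = "";                      // reset for the next turn
    audioChunks = 0;
    return summary;
  };
}
```

With the ws library this could be attached as `socket.on("message", createTurnTracker())`. In the cases described here, the summary has kind "turnComplete" and a transcript that stops mid-sentence.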

Examples from Production

  1. Inbound call (3 seconds, unknown outcome):

    • Transcript: AI Agent: Hi, thanks for calling [Company]! This
    • Model sent turnComplete mid-sentence. "This" is clearly not a complete thought.
  2. Inbound call (1 second, unknown outcome):

    • Transcript: AI Agent: Hi, thanks
    • Model cut off after 2 words of a greeting.
  3. Inbound call (6 seconds, unknown outcome):

    • Transcript: AI Agent: Hi, thanks for calling [Company]! This is how can I help you?
    • Full greeting delivered, but model sent turnComplete before processing the caller's response.
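As a stopgap, cut-offs like examples 1 and 2 can be flagged client-side with a punctuation heuristic on the output transcript at the moment turnComplete arrives. This is only a sketch of that idea, not an API feature. The function name is hypothetical, and the check will misfire on transcripts that legitimately lack terminal punctuation.

```javascript
// Heuristic: treat a turn as possibly truncated when its transcript does not
// end in sentence-final punctuation (optionally followed by a closing quote
// or bracket). An empty transcript is also flagged.
function looksTruncated(transcript) {
  const text = transcript.trim();
  if (text.length === 0) return true;
  return !/[.!?]["')\]]*$/.test(text);
}

console.log(looksTruncated("Hi, thanks for calling [Company]! This")); // true
console.log(looksTruncated("Hi, thanks"));                             // true
console.log(looksTruncated("Hi, thanks for calling [Company]! This is how can I help you?")); // false
```

Example 3 is not flagged, which matches the transcript: the greeting itself was complete and the failure happened afterward.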

Frequency

In a 14-day window covering 603 calls, we observed 11 inbound calls (about 1.8% of all calls) with outcome: unknown in which the transcript shows that Gemini's greeting was cut short or that Gemini failed to respond after the caller spoke. Several of these show clear mid-sentence termination in the transcript.

Correlation with contextWindowCompression

Other developers have reported that contextWindowCompression: { slidingWindow: {} } correlates with increased premature turnComplete frequency. See googleapis/js-genai#707 (44 comments, Priority P2, open since June 2025). We currently have contextWindowCompression enabled and are considering removing it.
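To check the correlation on our side, one option is to toggle the setting per call and compare cut-off rates between the two groups. The sketch below assumes the v1beta setup message shape as we understand it. The helper names and the parity-based cohort split are hypothetical.

```javascript
// Build the initial setup message, with or without sliding-window compression.
// Field names follow the v1beta raw WebSocket JSON format as we understand it.
function buildSetupMessage({ compression }) {
  const setup = {
    model: "models/gemini-3.1-flash-live-preview",
    generationConfig: {
      responseModalities: ["AUDIO"],
      speechConfig: {
        voiceConfig: { prebuiltVoiceConfig: { voiceName: "Leda" } },
      },
    },
  };
  if (compression) {
    setup.contextWindowCompression = { slidingWindow: {} };
  }
  return { setup };
}

// Hypothetical deterministic cohort split: the same call ID always lands in
// the same group, so the choice can be reproduced later from logs.
function compressionEnabledFor(callId) {
  let sum = 0;
  for (const ch of callId) sum += ch.codePointAt(0);
  return sum % 2 === 0;
}
```

For example, `socket.send(JSON.stringify(buildSetupMessage({ compression: compressionEnabledFor(callId) })))` sends the setup for one call, and logging the cohort next to each call outcome allows the premature turnComplete rate to be compared between groups.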

Expected Behavior

turnComplete should only fire when the model has finished generating its complete response — a grammatically complete sentence or thought. Mid-word or mid-sentence turnComplete should never occur.

Questions for the Team

  1. Is premature turnComplete a known issue on gemini-3.1-flash-live-preview?
  2. Does contextWindowCompression: { slidingWindow: {} } affect turnComplete timing?
  3. Is there a way to distinguish between legitimate turnComplete and premature termination?
  4. Is this related to the audio freeze issue ([BidiGenerateContent] Model audio output freezes mid-conversation — stops producing audio with no error #1225) or a separate bug?
