[BidiGenerateContent] Premature turnComplete — model sends turnComplete mid-sentence, cutting off audio

## Environment
- **Model:** `gemini-3.1-flash-live-preview`
- **Voice:** `Leda`
- **Transport:** Raw WebSocket v1beta `wss://generativelanguage.googleapis.com/ws/.../BidiGenerateContent`
- **Audio pipeline:** Twilio MediaStreams (mulaw 8kHz) → PCM16 16kHz → Gemini → PCM16 24kHz → mulaw 8kHz
- **Platform:** Node.js (`ws` library, raw WebSocket, no SDK)

## Bug Description

The model sends `serverContent.turnComplete: true` while it is still generating a response, and audio output stops mid-sentence. At that point the transcript ends in a grammatically incomplete sentence, so the model had clearly not finished speaking when `turnComplete` fired.

This is different from interruption (`serverContent.interrupted`) — no caller speech triggers this. The model spontaneously decides the turn is complete when it clearly isn't.
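
For reference, a minimal sketch of how a raw-WebSocket client could log the two cases separately. It assumes the server message fields `serverContent.outputTranscription.text`, `serverContent.interrupted`, and `serverContent.turnComplete`, and that an interrupted turn is still closed by a later `turnComplete`. `createTurnTracker` is a hypothetical helper, not part of any SDK.

```javascript
// Sketch only: field names are assumed from the v1beta Live API message shape.
function createTurnTracker() {
  let transcript = "";
  let interrupted = false;

  return {
    // `raw` is the payload of a WebSocket "message" event (string or Buffer).
    handle(raw) {
      const msg = JSON.parse(raw.toString());
      const sc = msg.serverContent;
      if (!sc) return null; // setupComplete, toolCall, and so on

      // Accumulate the output transcription for the current model turn.
      if (sc.outputTranscription?.text) transcript += sc.outputTranscription.text;
      if (sc.interrupted) interrupted = true;
      if (!sc.turnComplete) return null;

      // The turn has ended: report how it ended and what was said.
      const event = {
        kind: interrupted ? "interrupted" : "turnComplete",
        transcript: transcript.trim(),
      };
      transcript = "";
      interrupted = false;
      return event;
    },
  };
}
```

With `ws`, this would be wired up as `socket.on("message", (raw) => { const ev = tracker.handle(raw); if (ev) console.log(ev); })`. Events with `kind: "turnComplete"` and an unfinished transcript are the ones this report is about.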

## Examples from Production

1. **Inbound call (3 seconds, unknown outcome):**
   - Transcript: `AI Agent: Hi, thanks for calling [Company]! This`
   - Model sent `turnComplete` mid-sentence. "This" is clearly not a complete thought.

2. **Inbound call (1 second, unknown outcome):**
   - Transcript: `AI Agent: Hi, thanks`
   - Model cut off after 2 words of a greeting.

3. **Inbound call (6 seconds, unknown outcome):**
   - Transcript: `AI Agent: Hi, thanks for calling [Company]! This is how can I help you?`
   - The full greeting was delivered, but the model sent `turnComplete` before processing the caller's response, so the caller's reply went unanswered.

## Frequency

In a 14-day window with 603 calls, we observed 11 inbound calls (about 1.8% of all calls) with `outcome: unknown` where the transcript shows that Gemini's greeting was cut short or that Gemini failed to respond after the caller spoke. Several of these transcripts show clear mid-sentence termination.

## Correlation with contextWindowCompression

Other developers have reported that `contextWindowCompression: { slidingWindow: {} }` correlates with increased premature `turnComplete` frequency. See googleapis/js-genai#707 (44 comments, Priority P2, open since June 2025). We currently have `contextWindowCompression` enabled and are considering removing it.
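
To make the A/B comparison concrete, here is a sketch of a setup message with the compression block toggled. Only `contextWindowCompression: { slidingWindow: {} }`, the model name, and the voice come from our configuration as described above. The other field names (`generationConfig`, `speechConfig`, `outputAudioTranscription`) and the `models/` prefix are assumptions about the v1beta setup schema, and `buildSetupMessage` is a hypothetical helper.

```javascript
// Sketch only: builds the first ("setup") message sent on the WebSocket.
// Field names other than contextWindowCompression are assumptions.
function buildSetupMessage({ compression }) {
  const setup = {
    model: "models/gemini-3.1-flash-live-preview",
    generationConfig: {
      responseModalities: ["AUDIO"],
      speechConfig: {
        voiceConfig: { prebuiltVoiceConfig: { voiceName: "Leda" } },
      },
    },
    // Assumed flag that enables transcripts of the model's audio output.
    outputAudioTranscription: {},
  };

  // The setting under suspicion: include it only for the "on" arm of the test.
  if (compression) {
    setup.contextWindowCompression = { slidingWindow: {} };
  }
  return JSON.stringify({ setup });
}
```

Sending `buildSetupMessage({ compression: false })` for a share of calls and comparing the rate of truncated turns between the two arms would show whether the correlation holds in our traffic.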

## Expected Behavior

`turnComplete` should only fire when the model has finished generating its complete response — a grammatically complete sentence or thought. Mid-word or mid-sentence `turnComplete` should never occur.

## Questions for the Team

1. Is premature `turnComplete` a known issue on `gemini-3.1-flash-live-preview`?
2. Does `contextWindowCompression: { slidingWindow: {} }` affect `turnComplete` timing?
3. Is there a way to distinguish between legitimate `turnComplete` and premature termination?
4. Is this related to the audio freeze issue (google-gemini/cookbook#1225) or a separate bug?
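
Regarding question 3, the only signal we have found so far is a client-side heuristic on the output transcript, sketched below. It is not an API feature and it will misfire on responses that legitimately end without punctuation. `looksTruncated` is a hypothetical name.

```javascript
// Heuristic sketch: treat a turn as possibly truncated when its transcript
// does not end in sentence-final punctuation (optionally followed by a
// closing quote or bracket).
function looksTruncated(transcript) {
  const text = transcript.trim();
  // No speech at all before turnComplete is also treated as suspicious.
  if (text === "") return true;
  return !/[.!?…]["')\]]*$/.test(text);
}

// Transcripts taken from the production examples in this report.
const samples = [
  "Hi, thanks for calling [Company]! This",
  "Hi, thanks",
  "Hi, thanks for calling [Company]! This is how can I help you?",
];
const flagged = samples.map(looksTruncated);
```

On the three example transcripts, the first two are flagged and the third is not. A server-side signal would be far more reliable than this.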

## Related Issues
- googleapis/js-genai#707 (premature turnComplete — 44 comments, P2, unresolved)
- googleapis/python-genai#2117 (same issue in Python SDK)
- google-gemini/cookbook#1225 (audio output freeze)
- google-gemini/cookbook#1197 (our previous report — 13 issues)

