
refactor: update Live API configuration, transcription support, and i…#1224

Open

thorwebdev wants to merge 1 commit into main from thor/update-live-quickstart-python

Conversation

@thorwebdev
Collaborator

…nput handling for model version gemini-3.1-flash-live-preview

fixes #1108

@thorwebdev thorwebdev requested a review from Giom-V May 8, 2026 20:25
@github-actions github-actions bot added the status:awaiting review (PR awaiting review from a maintainer) and component:quickstarts (Issues/PR referencing quickstarts folder) labels on May 8, 2026
@thorwebdev thorwebdev requested a review from kkorpal May 8, 2026 20:25
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly updates the Get_started_LiveAPI.py quickstart example to leverage the latest gemini-3.1-flash-live-preview model. The changes primarily focus on adapting the client configuration and real-time input/output handling to incorporate new features like audio transcription and a more streamlined API for sending media blobs. This ensures the example remains compatible with the most recent API version and demonstrates enhanced interactive capabilities.

Highlights

  • Live API Configuration Update: The quickstart example has been updated to use the new gemini-3.1-flash-live-preview model, replacing the older gemini-2.5-flash-native-audio-preview-12-2025 model.
  • Transcription Support Enabled: Input and output audio transcription capabilities have been enabled within the LiveSession configuration, allowing for real-time text representation of spoken input and model responses.
  • Streamlined Input Handling: The methods for sending text, audio, and visual inputs have been refactored to align with the updated API, including the removal of base64 encoding for image data and the use of types.Blob for media inputs.
  • API Key Environment Variable Change: The environment variable for the API key has been updated from GEMINI_API_KEY to GOOGLE_API_KEY.
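The transcription and model changes listed above can be sketched as a connect-time configuration. This is an illustrative sketch, not the PR's exact diff: the field names follow the google-genai SDK (empty dicts request transcription with default settings), and the usage comment reflects the PR's switch to the `GOOGLE_API_KEY` environment variable.

```python
# Sketch of a Live API connect config with transcription enabled.
MODEL = "gemini-3.1-flash-live-preview"

config = {
    "response_modalities": ["AUDIO"],
    "input_audio_transcription": {},   # text of what the user said
    "output_audio_transcription": {},  # text of what the model said
}

# Usage (assumes GOOGLE_API_KEY is set, per the PR's env-var change):
#   client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])
#   async with client.aio.live.connect(model=MODEL, config=config) as session:
#       ...
```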
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request updates the Live API quickstart to use the google-genai SDK (v1.0.0+), involving model and API key updates, enabling audio transcriptions, and refactoring the audio reception logic. The review feedback highlights several necessary corrections for SDK compatibility, such as the proper usage of send_realtime_input and send_client_content. Additionally, improvements were suggested regarding the repository's style guide for model selection, the use of constants for sample rates, and the handling of incremental transcription updates.

```diff
-                    turns=types.Content(parts=[types.Part(text=text or "")]),
-                    turn_complete=True,
-                )
+                await self.session.send_realtime_input(text=text)
```
Contributor


high

The send_realtime_input method is designed for streaming media chunks (blobs). To send text input in a multimodal live session, you should use send_client_content within a turn.

Suggested change

```diff
-await self.session.send_realtime_input(text=text)
+await self.session.send_client_content(
+    turns=[types.Content(role="user", parts=[types.Part(text=text)])],
+    turn_complete=True
+)
```

Collaborator Author


You are wrong! https://ai.google.dev/gemini-api/docs/live-api/capabilities#send-text

@markmcd can we give gemini-code-assist access to the skills?

Contributor


You are correct, send_realtime_input is indeed the appropriate method for sending text in the Live API when using the gemini-3.1-flash-live-preview model. My previous comment was based on an older interface pattern. Thank you for the correction.
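For reference, the text-sending pattern the thread settles on can be sketched as a small helper. This is an illustrative sketch, not the PR's literal code: `session` stands for an open session from `client.aio.live.connect(...)`.

```python
# Sketch: per the linked Live API docs, plain text is streamed with
# send_realtime_input(text=...) rather than send_client_content.
async def send_text(session, text: str) -> None:
    await session.send_realtime_input(text=text)
```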

Member


ack :) great idea, it doesn't look like it's possible without copying it across but i'll check with the GCA folks.

Comment on lines 298 to +304

```diff
                 msg = await self.out_queue.get()
+                blob = types.Blob(data=msg["data"], mime_type=msg["mime_type"])
                 if msg["mime_type"].startswith("audio/"):
-                    await self.session.send_realtime_input(audio=msg)
+                    await self.session.send_realtime_input(audio=blob)
                 else:
-                    await self.session.send_realtime_input(media=msg)
+                    # Use video= (not the deprecated media=) for image/video frames.
+                    await self.session.send_realtime_input(video=blob)
```
Contributor


high

In the google-genai SDK (v1.0.0+), the send_realtime_input method expects a media_chunks argument containing a list of Blob objects. It does not support audio or video as direct keyword arguments.

```python
msg = await self.out_queue.get()
blob = types.Blob(data=msg["data"], mime_type=msg["mime_type"])
await self.session.send_realtime_input(media_chunks=[blob])
```

Collaborator Author


wrong again, media_chunks has been deprecated.
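The routing the PR ends up with — `audio=` for audio chunks, `video=` for frames, no deprecated `media=` or `media_chunks=` — can be sketched as a helper. This is illustrative, not the PR's literal code: `session` stands for an open live session, and `blob_cls` stands in for `types.Blob` from google.genai.

```python
# Sketch: route one captured chunk to the right keyword argument of
# send_realtime_input, based on its MIME type.
async def send_chunk(session, data: bytes, mime_type: str, blob_cls) -> None:
    blob = blob_cls(data=data, mime_type=mime_type)
    if mime_type.startswith("audio/"):
        await session.send_realtime_input(audio=blob)
    else:
        # image/video frames go via video= (media= is deprecated)
        await session.send_realtime_input(video=blob)
```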


```diff
 # --- Model Configuration ---
-MODEL = "models/gemini-2.5-flash-native-audio-preview-12-2025"
+MODEL = "gemini-3.1-flash-live-preview"
```
Contributor


medium

According to the repository style guide (line 72), model selection should use a Colab selector for better maintainability and to allow users to easily switch models in supported environments.

Suggested change

```diff
-MODEL = "gemini-3.1-flash-live-preview"
+MODEL = "gemini-3.1-flash-live-preview" # @param ["gemini-3.1-flash-live-preview", "gemini-2.0-flash-live-001"] {"allow-input":true, "isTemplate": true}
```
References
  1. When selecting a model, use a colab selector for easier maintainability. (link)

```diff
 payload = {
     "data": data,
-    "mime_type": "audio/pcm"
+    "mime_type": "audio/pcm;rate=16000"
```
Contributor


medium

It is recommended to use the SEND_SAMPLE_RATE constant instead of a hardcoded value in the MIME type string. This ensures that if the sample rate is updated in the configuration, it is automatically reflected here.

Suggested change

```diff
-    "mime_type": "audio/pcm;rate=16000"
+    "mime_type": f"audio/pcm;rate={SEND_SAMPLE_RATE}"
```

Comment on lines +201 to +205

```diff
+if server_content.input_transcription:
+    print(f"\nYou: {server_content.input_transcription.text}", end="")
+
+if server_content.output_transcription:
+    print(f"\nGemini: {server_content.output_transcription.text}", end="")
```
Contributor


medium

The Live API sends incremental transcription updates. Printing every update results in duplicate and fragmented text in the console. It is better to check the is_final attribute to only print the completed transcription.

Suggested change

```diff
-if server_content.input_transcription:
-    print(f"\nYou: {server_content.input_transcription.text}", end="")
-if server_content.output_transcription:
-    print(f"\nGemini: {server_content.output_transcription.text}", end="")
+if server_content.input_transcription and server_content.input_transcription.is_final:
+    print(f"\nYou: {server_content.input_transcription.text}")
+if server_content.output_transcription and server_content.output_transcription.is_final:
+    print(f"\nGemini: {server_content.output_transcription.text}")
```
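Whether the transcription objects expose an `is_final` flag depends on the SDK version, so the suggestion above may not apply as-is. A version-independent way to avoid fragmented console output is to buffer the incremental fragments and flush once per turn. A hedged sketch (the class and names here are illustrative, not from the PR):

```python
# Sketch: accumulate incremental transcription fragments and emit
# one line per completed turn instead of printing every update.
class TranscriptBuffer:
    def __init__(self, label: str) -> None:
        self.label = label
        self._parts: list[str] = []

    def add(self, fragment: str) -> None:
        self._parts.append(fragment)

    def flush(self) -> str:
        """Join buffered fragments, clear the buffer, return the line."""
        line = f"{self.label}: {''.join(self._parts)}"
        self._parts.clear()
        return line

# Feed .add() from input_transcription/output_transcription updates and
# call .flush() when the server signals that the turn is complete.
```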


Labels

component:quickstarts Issues/PR referencing quickstarts folder status:awaiting review PR awaiting review from a maintainer

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Live API does not perceive visual information and hallucinates

2 participants