A React-based client for Google's Gemini Live API, featuring real-time audio/video streaming and a WebSocket proxy for secure authentication.
🚀 Full Guide & Architectural Breakdown: https://www.garvik.dev/ai/gemini-live-api-streaming
Install Python dependencies and start the proxy server:
# Install dependencies
pip install -r requirements.txt
# Authenticate with Google Cloud
gcloud auth application-default login
# Start the proxy server
python server.py
In a new terminal, start the React application:
Ensure you have Node.js and npm installed. If not, download and install them from nodejs.org.
# Install Node modules
npm install
# Start development server
npm run dev
Open http://localhost:5173 to view the app.
- Real-time Streaming: Audio and video streaming to Gemini.
- React Components: Modular UI with LiveAPIDemo.jsx.
- Secure Proxy: Python backend handles Google Cloud authentication.
- Custom Tools: Support for defining client-side tools.
- Media Handling: Dedicated audio capture and playback processors.
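To make the "Custom Tools" item more concrete, here is a minimal sketch of how client-side tools could be registered and dispatched. The `ToolRegistry` class, its method names, and the `{ id, name, args }` call shape are hypothetical and chosen for illustration; they are not taken from this repository, so check LiveAPIDemo.jsx and gemini-api.js for the actual mechanism.

```javascript
// Hypothetical registry for client-side tools (not this project's real API).
class ToolRegistry {
  constructor() {
    this.handlers = new Map();
  }

  // Associate a tool name with a synchronous handler function.
  register(name, handler) {
    this.handlers.set(name, handler);
  }

  // Run the handler for one call and wrap the outcome in a response object.
  // The { id, name, args } input shape is an assumption.
  dispatch(call) {
    const handler = this.handlers.get(call.name);
    if (!handler) {
      return { id: call.id, name: call.name, response: { error: `Unknown tool: ${call.name}` } };
    }
    try {
      const result = handler(call.args || {});
      return { id: call.id, name: call.name, response: { result } };
    } catch (err) {
      return { id: call.id, name: call.name, response: { error: String(err.message || err) } };
    }
  }
}

// Example: a trivial tool that adds two numbers.
const registry = new ToolRegistry();
registry.register("add", ({ a, b }) => a + b);
```

Handlers are synchronous here to keep the sketch short; a real tool that calls a network service would likely return a promise and need an async `dispatch`.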
/
|
|--server-api # WebSocket proxy & auth handler
|
|--web # React application
The GeminiLiveAPI class, located in src/utils/gemini-api.js, manages the WebSocket connection.
import { GeminiLiveAPI } from "./utils/gemini-api";
const client = new GeminiLiveAPI(
"ws://localhost:8080",
"your-project-id",
"gemini-2.0-flash-exp"
);
client.connect();
client.sendText("Hello Gemini");
The app uses AudioWorklets for low-latency audio processing:
- capture.worklet.js: Handles microphone input.
- playback.worklet.js: Handles PCM audio output.
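The Web Audio API hands worklets Float32 samples in the range [-1, 1], while PCM audio is commonly exchanged as 16-bit signed integers. The sketch below shows one way the conversion in each direction could look. The function names are hypothetical, and it assumes 16-bit PCM is the wire format; the actual worklets in this repository may do this differently.

```javascript
// Convert Float32 samples (nominally in [-1, 1]) to 16-bit signed PCM.
// Out-of-range values are clamped to avoid integer wrap-around.
function floatTo16BitPCM(float32Samples) {
  const pcm = new Int16Array(float32Samples.length);
  for (let i = 0; i < float32Samples.length; i++) {
    const s = Math.max(-1, Math.min(1, float32Samples[i]));
    // Negative values scale to -32768, positive values to 32767.
    pcm[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  }
  return pcm;
}

// Convert 16-bit signed PCM back to Float32 samples for playback.
function pcm16ToFloat(pcm) {
  const out = new Float32Array(pcm.length);
  for (let i = 0; i < pcm.length; i++) {
    out[i] = pcm[i] < 0 ? pcm[i] / 0x8000 : pcm[i] / 0x7fff;
  }
  return out;
}
```

In a capture worklet, a function like `floatTo16BitPCM` would typically run on each input block before the data is posted to the main thread; `pcm16ToFloat` would do the reverse on the playback side.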
- Model: Defaults to gemini-live-2.5-flash-native-audio
- Voice: Configurable in LiveAPIDemo.jsx (Puck, Charon, etc.)
- Proxy Port: Default 8080 (set in server-api/server.py)
Environment variables
Create a .env file in the root directory and add the following variables:
VITE_PROXY_URL=your WebSocket proxy URL
VITE_PROJECT_ID=your project id
GCS_BUCKET_NAME=your GCS bucket name
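As a rough illustration of how these settings could be combined on the client, the sketch below resolves a configuration object from an environment map, falling back to the documented defaults. The `resolveConfig` helper and the `DEFAULTS` object are hypothetical. Vite only exposes variables prefixed with `VITE_` to client code, so `GCS_BUCKET_NAME` is presumably read elsewhere and is left out here.

```javascript
// Fallbacks taken from the defaults described in this README.
const DEFAULTS = {
  proxyUrl: "ws://localhost:8080",
  model: "gemini-live-2.5-flash-native-audio",
};

// Hypothetical helper: build client settings from an env-like object.
// In a Vite app the argument would be import.meta.env.
function resolveConfig(env) {
  const projectId = env.VITE_PROJECT_ID;
  if (!projectId) {
    throw new Error("VITE_PROJECT_ID is not set");
  }
  return {
    proxyUrl: env.VITE_PROXY_URL || DEFAULTS.proxyUrl,
    projectId,
    model: DEFAULTS.model,
  };
}
```

Failing early on a missing project id tends to produce a clearer error than a rejected WebSocket handshake later on.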
Create a new Virtual Room and join the room using the room id.