Jarvis ADK Voice Agent

A real-time voice and text AI agent built with Google's ADK (Agent Development Kit), featuring bidirectional audio streaming, Google Search integration, and PDF reading capabilities.

📖 See ARCHITECTURE.md for technical details on how the voice interaction system works.

What's Inside

This repository contains a FastAPI-based web application that enables voice and text conversations with a Gemini-powered AI agent. The agent can search Google for information and read PDF files from your filesystem.
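As an illustration of the PDF-reading side, here is a minimal sketch of how a tool could restrict file access to one configured folder. The function names `resolve_pdf_path` and `list_pdf_files` are hypothetical and are not taken from this repository; the sketch only resolves and lists paths and does not parse PDF content.

```python
from pathlib import Path
from typing import List


def resolve_pdf_path(content_folder: str, requested: str) -> Path:
    """Return the absolute path of a PDF inside content_folder.

    Hypothetical helper: raises ValueError if the requested path
    escapes the folder or does not end in .pdf.
    """
    root = Path(content_folder).resolve()
    candidate = (root / requested).resolve()
    # Reject requests such as "../secrets.pdf" that resolve outside the root.
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"{requested!r} is outside the content folder")
    if candidate.suffix.lower() != ".pdf":
        raise ValueError(f"{requested!r} is not a PDF file")
    return candidate


def list_pdf_files(content_folder: str) -> List[str]:
    """List PDF paths relative to content_folder, sorted alphabetically."""
    root = Path(content_folder).resolve()
    return sorted(str(p.relative_to(root)) for p in root.rglob("*.pdf"))
```

Resolving both paths before comparing them means symlinks and `..` segments are normalized first, so the containment check is done on the real locations.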

Key Components:

FastAPI WebSocket server for real-time communication
Web Audio API-based voice streaming (16 kHz PCM)
Google ADK integration with Gemini 2.0 Flash
Custom tools for Google Search and PDF reading
Modern web interface with voice/text mode switching
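To make the "16 kHz PCM" audio format concrete, the sketch below converts floating-point samples (the Web Audio API produces floats in the range -1.0 to 1.0) into raw PCM bytes and computes how much audio a chunk holds. It assumes 16-bit signed, little-endian, mono samples; the README states only the sample rate, so the bit depth and byte order are assumptions, and the function names are illustrative.

```python
import struct
from typing import Iterable

SAMPLE_RATE_HZ = 16000  # from the README: 16 kHz PCM
BYTES_PER_SAMPLE = 2    # assumption: 16-bit signed mono samples


def float_to_pcm16(samples: Iterable[float]) -> bytes:
    """Convert floats in [-1.0, 1.0] to little-endian 16-bit PCM bytes."""
    out = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clamp out-of-range input
        out += struct.pack("<h", int(round(s * 32767)))
    return bytes(out)


def chunk_duration_ms(pcm: bytes) -> float:
    """Duration in milliseconds of a raw PCM chunk at SAMPLE_RATE_HZ."""
    return len(pcm) * 1000 / (BYTES_PER_SAMPLE * SAMPLE_RATE_HZ)
```

Under these assumptions, one second of audio is 32,000 bytes, so a 3,200-byte chunk carries 100 ms of speech.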

Prerequisites

Python 3.8 or higher
Google AI Studio API key (create one in Google AI Studio)
Modern web browser (Chrome/Edge recommended)

Setup

Install dependencies:

```
python3 -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt
```

Configure environment:

```
cp env.template .env
```

Edit .env and set:

```
GOOGLE_API_KEY=your_api_key_here
CONTENT_FOLDER=/path/to/your/files
```
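Before starting the server, it can help to confirm that both settings are filled in. The following is a minimal sketch of such a check using only the standard library; `parse_env_file` and `check_settings` are hypothetical helpers, not part of this repository, and the parser handles only simple `KEY=VALUE` lines.

```python
from pathlib import Path
from typing import Dict, List, Mapping


def parse_env_file(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_settings(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems found; an empty list means OK."""
    problems = []
    key = env.get("GOOGLE_API_KEY", "")
    if not key or key == "your_api_key_here":
        problems.append("GOOGLE_API_KEY is missing or still the placeholder")
    folder = env.get("CONTENT_FOLDER", "")
    if not folder or not Path(folder).is_dir():
        problems.append("CONTENT_FOLDER is missing or not a directory")
    return problems
```

A possible use is `check_settings(parse_env_file(Path(".env").read_text()))`, printing each returned message before launching the app.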

Run the application:
```
./run.sh
```
Open in browser:
```
http://localhost:8000
```

Usage

Text Mode: Type questions and get streaming responses
Voice Mode: Click 🎤 to enable voice interaction (requires microphone access)
