This project implements a Voice RAG Agent powered by Cartesia.
Ensure you have Python 3.11 or later installed, then run:

    pip install -r requirements.txt

This implementation uses OpenAI's services for speech-to-text and Cartesia for speech synthesis. It is the simpler setup if you already have OpenAI API keys.
- Cartesia API key
- OpenAI API key
- LiveKit credentials
- Copy .env.example to .env
- Configure the following environment variables:

      OPENAI_API_KEY=your_openai_api_key
      CARTESIA_API_KEY=your_cartesia_api_key
      LIVEKIT_URL=your_livekit_url
      LIVEKIT_API_KEY=your_livekit_api_key
      LIVEKIT_API_SECRET=your_livekit_api_secret

Start the agent:

    python voice_agent_openai.py start

This implementation uses AssemblyAI for speech processing and Ollama (with Gemma) for language tasks.
- Install Ollama:

      # For macOS
      brew install ollama
      # For Linux
      curl -fsSL https://ollama.com/install.sh | sh

- Pull the Gemma model:

      ollama pull gemma3

- Configure the environment. Copy .env.example to .env and set:

      CARTESIA_API_KEY=your_cartesia_api_key
      ASSEMBLYAI_API_KEY=your_assemblyai_api_key
      LIVEKIT_URL=your_livekit_url
      LIVEKIT_API_KEY=your_livekit_api_key
      LIVEKIT_API_SECRET=your_livekit_api_secret

- Start the Ollama server:

      ollama serve

- In a new terminal, run the voice agent:

      python voice_agent.py start
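
Both setups depend on several environment variables, and the Ollama setup also needs a running server with the Gemma model pulled. A small pre-flight check can catch a missing value before the agent starts. The following is a minimal sketch and not part of this repository: the function names `missing_vars` and `ollama_has_model` and the variant labels `"openai"` and `"ollama"` are hypothetical, the variables are assumed to be already present in the process environment (the sketch does not parse `.env` itself), and the Ollama server is assumed to listen on its default address `http://localhost:11434`.

```python
import json
import os
import urllib.request

# Variable names are taken from the two .env listings in this README.
# The "openai" / "ollama" labels are this sketch's own naming.
REQUIRED_VARS = {
    "openai": [
        "OPENAI_API_KEY",
        "CARTESIA_API_KEY",
        "LIVEKIT_URL",
        "LIVEKIT_API_KEY",
        "LIVEKIT_API_SECRET",
    ],
    "ollama": [
        "CARTESIA_API_KEY",
        "ASSEMBLYAI_API_KEY",
        "LIVEKIT_URL",
        "LIVEKIT_API_KEY",
        "LIVEKIT_API_SECRET",
    ],
}


def missing_vars(variant, env=None):
    """Return the required variables that are unset or empty for a variant."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS[variant] if not env.get(name)]


def ollama_has_model(model="gemma3", base_url="http://localhost:11434"):
    """Return True if the Ollama server lists a model whose name starts with `model`.

    Uses Ollama's model-listing endpoint (GET /api/tags). Any connection or
    parsing problem is treated as "not available" and returns False.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=3) as resp:
            payload = json.load(resp)
    except (OSError, ValueError):
        return False
    models = payload.get("models", [])
    return any(m.get("name", "").startswith(model) for m in models)
```

Calling `missing_vars("ollama")` returns a list of the variable names that still need values (an empty list means everything is set), and `ollama_has_model()` can be called after `ollama serve` and `ollama pull gemma3` to confirm the model is visible to the server.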
Contributions are welcome! Please fork the repository and submit a pull request with your improvements.
