This is a LiveKit Agents starter project - a full-stack voice AI application. The project consists of:
- Frontend: A Next.js React application providing the user interface
- Backend: A Node.js TypeScript agent using the LiveKit Agents framework
- Communication: Real-time voice/video/chat via LiveKit's WebRTC infrastructure
Location: backend/ directory
Main File: backend/src/agent.ts
- Voice AI Pipeline:
  - STT: AssemblyAI universal streaming for speech-to-text
  - LLM: OpenAI GPT-4.1-mini for conversation logic
  - TTS: Cartesia Sonic-3 for text-to-speech synthesis
  - Voice: Pre-configured voice ID for consistent agent voice
- Agent Personality:
  - Helpful voice AI assistant with concise, friendly responses
  - Configurable instructions in the `Assistant` class
  - Support for custom tools (weather tool example commented in code)
- Real-time Features:
  - Turn Detection: LiveKit's multilingual model for detecting when the user has finished their turn
  - VAD: Silero Voice Activity Detection for speech segmentation
  - Noise Cancellation: LiveKit Cloud background voice cancellation
  - Preemptive Generation: LLM can start generating a response before the end of the user's turn is confirmed
- Metrics & Monitoring:
  - Usage collection and logging
  - Performance metrics tracking
- Environment variables: `LIVEKIT_URL`, `LIVEKIT_API_KEY`, `LIVEKIT_API_SECRET`
- Models and voices configurable in the `AgentSession` setup
- Optional realtime model support (OpenAI Realtime API alternative)
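As an illustration of the environment requirement above, here is a minimal sketch of a startup check for the three variables. The helper name `readLiveKitEnv` and its error message are hypothetical and not part of the starter; the function takes the environment as a parameter so it can be tried without real credentials.

```typescript
// The three variable names come from the list above.
const REQUIRED_VARS = ['LIVEKIT_URL', 'LIVEKIT_API_KEY', 'LIVEKIT_API_SECRET'] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];
type LiveKitEnv = Record<RequiredVar, string>;

// Hypothetical helper: returns the three values, or throws naming every missing one.
function readLiveKitEnv(env: Record<string, string | undefined>): LiveKitEnv {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    LIVEKIT_URL: env['LIVEKIT_URL'] as string,
    LIVEKIT_API_KEY: env['LIVEKIT_API_KEY'] as string,
    LIVEKIT_API_SECRET: env['LIVEKIT_API_SECRET'] as string,
  };
}
```

In a Node.js process this could be called once at startup as `readLiveKitEnv(process.env)`, so a missing credential fails fast instead of surfacing later as a connection error.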
Location: frontend/ directory
Main Entry: frontend/app/(app)/page.tsx
- UI Framework:
  - Next.js 15 with App Router
  - React 19 with TypeScript
  - Tailwind CSS for styling
  - Motion library for animations
- LiveKit Integration:
  - `@livekit/components-react` for UI components
  - `livekit-client` for WebRTC communication
  - `livekit-server-sdk` for token generation
- Core Views:
  - Welcome View: Initial screen with start button
  - Session View: Active call interface with video tiles and controls
  - Chat Transcript: Message history with fade effects
- Features:
  - Voice interaction with agent
  - Video streaming (camera/screen share)
  - Text chat input
  - Audio visualization and controls
  - Theme switching (light/dark)
  - Configurable branding and UI text
- Token Generation: Frontend calls `/api/connection-details` to get LiveKit tokens
- Room Creation: Unique room names generated for each session
- Agent Dispatch: Optional agent name configuration for specific agent routing
- WebRTC Connection: Frontend and backend join the same LiveKit room
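The connection steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the starter's actual route handler: the `ConnectionDetails` field names, the `room_`/`user_` prefixes, and the `signToken` callback are all hypothetical. In a real handler the callback would be replaced by token signing with `livekit-server-sdk`, which is typically asynchronous.

```typescript
// Assumed response shape for /api/connection-details; field names are illustrative.
interface ConnectionDetails {
  serverUrl: string;
  roomName: string;
  participantIdentity: string;
  participantToken: string;
  agentName?: string; // only present when a specific agent should be dispatched
}

// Hypothetical helper: a random suffix makes name collisions unlikely, not impossible.
function randomSuffix(): string {
  return Math.floor(Math.random() * 1e9).toString(36);
}

// Builds a fresh room and identity per session and asks the signer for a token.
function buildConnectionDetails(
  serverUrl: string,
  signToken: (roomName: string, identity: string) => string,
  agentName?: string,
): ConnectionDetails {
  const roomName = `room_${randomSuffix()}`;
  const participantIdentity = `user_${randomSuffix()}`;
  return {
    serverUrl,
    roomName,
    participantIdentity,
    participantToken: signToken(roomName, participantIdentity),
    ...(agentName ? { agentName } : {}),
  };
}
```

Passing the signer in as a parameter keeps the naming logic testable without API credentials; the frontend would then use `serverUrl` and `participantToken` to join the same room the agent is dispatched to.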
User Interaction (Voice/Text) → Frontend UI (Token API) → LiveKit Room (WebRTC) → Backend Agent (STT→LLM→TTS) → AI Pipeline (Synthesis) → Response (Audio/Text) → User
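To make the STT→LLM→TTS step in the flow above concrete, here is a simplified sketch that models each stage as a plain function. All type and function names are hypothetical; the real agent streams audio and text through the LiveKit Agents framework rather than calling stages one after another, and the stub stages below stand in for the actual providers.

```typescript
// Audio is modeled as a labeled placeholder; real stages work on streamed frames.
interface Audio {
  kind: 'audio';
  description: string;
}

interface VoicePipeline {
  stt: (speech: Audio) => string; // speech-to-text
  llm: (transcript: string) => string; // conversation logic
  tts: (reply: string) => Audio; // text-to-speech
}

// One conversational turn: user audio in, agent audio out.
function runTurn(pipeline: VoicePipeline, userSpeech: Audio): Audio {
  const transcript = pipeline.stt(userSpeech);
  const reply = pipeline.llm(transcript);
  return pipeline.tts(reply);
}

// Stub stages so the sketch runs without any provider credentials.
const stubPipeline: VoicePipeline = {
  stt: (speech) => speech.description,
  llm: (transcript) => `You said: ${transcript}`,
  tts: (reply) => ({ kind: 'audio', description: reply }),
};
```

Because each stage is only an interface member here, swapping a provider amounts to replacing one function, which mirrors how the models are described as configurable.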
- Frontend: `app-config.ts` controls branding, features, and UI text
- Backend: Agent instructions and model configurations in `agent.ts`
- Environment: Separate `.env.local` files for frontend and backend credentials
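As a rough picture of what a frontend configuration file of this kind might contain, the sketch below defines a small typed config with defaults and per-deployment overrides. The field names and default values are assumptions chosen to match the categories named above (branding, features, UI text); they are not copied from the starter's `app-config.ts`.

```typescript
// Hypothetical config shape covering branding, feature flags, and UI text.
interface AppConfig {
  companyName: string; // branding
  startButtonText: string; // UI text
  supportsChatInput: boolean; // feature flag
  supportsVideoInput: boolean; // feature flag
  agentName?: string; // optional routing to a specific agent
}

// Illustrative defaults; a real project would define its own.
const DEFAULT_CONFIG: AppConfig = {
  companyName: 'Example Co',
  startButtonText: 'Start call',
  supportsChatInput: true,
  supportsVideoInput: true,
};

// Later keys win, so overrides replace only the fields they mention.
function withOverrides(overrides: Partial<AppConfig>): AppConfig {
  return { ...DEFAULT_CONFIG, ...overrides };
}
```

For example, `withOverrides({ supportsVideoInput: false })` would yield a voice-and-chat-only configuration while keeping the remaining defaults.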
- Frontend: `pnpm dev` (port 3000)
- Backend: `pnpm run dev` (requires model downloads first)
- Dependencies: pnpm for package management in both parts
This is a production-ready starter that can be deployed to LiveKit Cloud or self-hosted environments, with comprehensive documentation and examples for extending functionality.