A voice-enabled AI learning agent that searches multiple sources (YouTube, Reddit, Stack Overflow, Wikipedia, the web) to provide accurate, sourced answers. Built with a Next.js frontend, a FastAPI backend, and Claude as the LLM brain.
- Zero Hallucinations: Every answer is grounded in retrieved sources
- Source Attribution: Always cites where information came from
- Conversational Memory: Remembers context within and across sessions
- Voice-First: Full hands-free interaction with browser-native TTS/STT
- Multi-Source Search: YouTube, Reddit, Stack Overflow, Wikipedia, Google
┌─────────────────────────────────────────────────────────────────┐
│ FRONTEND (Next.js) │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────────┐ │
│ │ Voice Input │ │ Chat UI │ │ Source Cards │ │
│ │ (Web Speech │ │ (Messages, │ │ (YouTube, Reddit, │ │
│ │ API STT) │ │ Streaming) │ │ StackOverflow links) │ │
│ └─────────────┘ └─────────────┘ └─────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ BACKEND (FastAPI) │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ Agent Orchestrator ││
│ │ - Query Analysis → Parallel Source Fetching → RAG Response ││
│ └─────────────────────────────────────────────────────────────┘│
│ │ │
│ ┌───────────┬───────────┬───────────┬───────────┬───────────┐ │
│ │ YouTube │ Reddit │ Google │ Wikipedia │StackOver- │ │
│ │ (Apify) │ (Apify) │ (Apify) │ (Free) │flow (Free)│ │
│ └───────────┴───────────┴───────────┴───────────┴───────────┘ │
│ │ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ Memory: SQLite (history) + ChromaDB (vectors) ││
│ └─────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘
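The orchestrator's "Parallel Source Fetching" step fans out to all sources at once rather than querying them one by one. A minimal `asyncio` sketch of the idea — the fetcher functions here are illustrative stand-ins, not the actual scrapers in `backend/scrapers/`:

```python
import asyncio

# Hypothetical per-source fetchers; the real ones call the Apify and
# free APIs listed in the diagram above.
async def fetch_wikipedia(query: str) -> list[dict]:
    await asyncio.sleep(0)  # placeholder for an HTTP call
    return [{"type": "wikipedia", "title": query}]

async def fetch_stackoverflow(query: str) -> list[dict]:
    await asyncio.sleep(0)
    return [{"type": "stackoverflow", "title": query}]

async def fetch_all(query: str) -> list[dict]:
    # Run every fetcher concurrently; return_exceptions=True means one
    # failing source does not sink the whole request.
    results = await asyncio.gather(
        fetch_wikipedia(query),
        fetch_stackoverflow(query),
        return_exceptions=True,
    )
    sources: list[dict] = []
    for result in results:
        if not isinstance(result, Exception):
            sources.extend(result)
    return sources

sources = asyncio.run(fetch_all("react hooks"))
```

Total latency is roughly that of the slowest source instead of the sum of all of them.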
You'll need:

- Anthropic API Key - get one at console.anthropic.com
- Apify API Token - get one at console.apify.com

1. Clone the repository and navigate to it:

   ```bash
   cd learning-agent
   ```

2. Copy the environment file and add your API keys:

   ```bash
   cp .env.example .env
   # Edit .env and add your ANTHROPIC_API_KEY and APIFY_API_TOKEN
   ```

3. Start with Docker Compose:

   ```bash
   docker-compose up --build
   ```

4. Open http://localhost:3000 in your browser.
1. Create a virtual environment:

   ```bash
   cd backend
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Create a `.env` file in the backend directory:

   ```bash
   cp ../.env.example .env
   # Edit .env and add your API keys
   ```

4. Start the backend:

   ```bash
   uvicorn main:app --reload --port 8000
   ```
1. Install dependencies:

   ```bash
   cd frontend
   npm install
   ```

2. Create a `.env.local` file:

   ```bash
   echo "NEXT_PUBLIC_API_URL=http://localhost:8000" > .env.local
   ```

3. Start the frontend:

   ```bash
   npm run dev
   ```

4. Open http://localhost:3000 in your browser.
Type your question in the chat input and press Enter or click the send button.
Click the microphone button to start voice input. Speak your question and it will be transcribed automatically.
Enable "Auto-speak" to have responses read aloud automatically, or click the speaker icon on an individual message to hear it read.
- "How do React hooks work?"
- "What is machine learning?"
- "Best practices for Python"
- "Explain async/await in JavaScript"
| Endpoint | Method | Description |
|---|---|---|
| `/chat` | POST | Send message, get response with sources |
| `/chat/stream` | POST | Streaming response (SSE) |
| `/conversations` | GET | List all conversations |
| `/conversations/{id}` | GET | Get conversation history |
| `/conversations/{id}` | DELETE | Delete conversation |
| `/health` | GET | Health check |
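These endpoints can be called from any HTTP client. A minimal Python sketch for `/chat` using only the standard library — the `conversation_id` request field is an assumption for continuing an existing conversation, and the response fields follow the example below:

```python
import json
from urllib import request

API_URL = "http://localhost:8000"  # default local backend address

def build_chat_request(message, conversation_id=None):
    """Build a POST request for the /chat endpoint."""
    payload = {"message": message}
    if conversation_id:
        # Assumed field for multi-turn conversations; check the
        # backend's Pydantic schema for the actual name.
        payload["conversation_id"] = conversation_id
    return request.Request(
        f"{API_URL}/chat",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(message, conversation_id=None):
    """Send a message and return the parsed JSON response."""
    with request.urlopen(build_chat_request(message, conversation_id)) as resp:
        return json.loads(resp.read())

# With the backend running:
#   answer = ask("How do I use React hooks?")
#   print(answer["response"], answer["sources"])
```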
```bash
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "How do I use React hooks?"}'
```

Example response:

```json
{
  "response": "Based on my research, React hooks allow you to use state and other React features in functional components...",
  "sources": [
    {
      "type": "youtube",
      "title": "React Hooks Tutorial",
      "url": "https://youtube.com/...",
      "snippet": "The useState hook lets you..."
    },
    {
      "type": "stackoverflow",
      "title": "Best practices for useEffect",
      "url": "https://stackoverflow.com/...",
      "snippet": "Always include dependencies..."
    }
  ],
  "conversation_id": "uuid-here",
  "message_id": "uuid-here"
}
```

See .env.example for all available configuration options:
| Variable | Description | Default |
|---|---|---|
| `ANTHROPIC_API_KEY` | Anthropic API key | Required |
| `APIFY_API_TOKEN` | Apify API token | Required |
| `CLAUDE_MODEL` | Claude model to use | `claude-sonnet-4-20250514` |
| `MAX_SOURCES_PER_TYPE` | Max sources per type | `5` |
| `MAX_TOTAL_SOURCES` | Max total sources | `15` |
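One way the backend might read these variables — a stdlib-only sketch with the defaults from the table above; the project's actual config module may be structured differently:

```python
import os

def env(name, default=None):
    """Read an environment variable, failing fast when a required one is missing."""
    value = os.getenv(name, default)
    if value is None:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# Required keys (no default) -- uncomment in a configured environment:
# ANTHROPIC_API_KEY = env("ANTHROPIC_API_KEY")
# APIFY_API_TOKEN = env("APIFY_API_TOKEN")

# Optional keys, with the defaults from the table above:
CLAUDE_MODEL = env("CLAUDE_MODEL", "claude-sonnet-4-20250514")
MAX_SOURCES_PER_TYPE = int(env("MAX_SOURCES_PER_TYPE", "5"))
MAX_TOTAL_SOURCES = int(env("MAX_TOTAL_SOURCES", "15"))
```

Failing fast at startup on missing required keys is friendlier than an authentication error halfway through a request.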
learning-agent/
├── frontend/ # Next.js application
│ ├── app/ # App router pages
│ ├── components/ # React components
│ ├── hooks/ # Custom React hooks
│ └── lib/ # API client
│
├── backend/ # FastAPI application
│ ├── agent/ # Agent orchestrator
│ ├── scrapers/ # Source scrapers
│ ├── memory/ # SQLite + ChromaDB
│ └── models/ # Pydantic schemas
│
├── docker-compose.yml # Docker setup
├── .env.example # Environment template
└── README.md # This file
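The `memory/` module pairs SQLite (full conversation history) with ChromaDB (vector recall). A stdlib-only sketch of the SQLite half — the table and column names here are illustrative, not the project's actual schema:

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")  # the real app persists to a file
conn.execute("""
    CREATE TABLE IF NOT EXISTS messages (
        id TEXT PRIMARY KEY,
        conversation_id TEXT NOT NULL,
        role TEXT NOT NULL,      -- 'user' or 'assistant'
        content TEXT NOT NULL
    )
""")

def save_message(conversation_id, role, content):
    """Append one message to a conversation and return its id."""
    message_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO messages (id, conversation_id, role, content) VALUES (?, ?, ?, ?)",
        (message_id, conversation_id, role, content),
    )
    conn.commit()
    # In the real app the content would also be embedded into ChromaDB
    # so it can be recalled semantically across sessions.
    return message_id

def history(conversation_id):
    """Return (role, content) pairs in insertion order."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE conversation_id = ? ORDER BY rowid",
        (conversation_id,),
    )
    return rows.fetchall()
```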
**Voice input not working**

- Ensure you're using a supported browser (Chrome, Edge, or Safari)
- Grant microphone permissions when prompted
- Check whether another application is using the microphone

**Can't connect to the backend**

- Verify the backend is running on port 8000
- Check that CORS is properly configured
- Ensure your API keys are correct

**Few or no sources returned**

- Some queries may not return results from all sources
- Try rephrasing your question
- Check your Apify API token quota
MIT License - see LICENSE file for details.