Learning Agent

A voice-enabled AI learning agent that searches multiple sources (YouTube, Reddit, Stack Overflow, Wikipedia, the web) to provide accurate, sourced answers. Built with a Next.js frontend, a FastAPI backend, and Claude as the LLM brain.

Features

  • Zero Hallucinations: Every answer is grounded in retrieved sources
  • Source Attribution: Always cites where information came from
  • Conversational Memory: Remembers context within and across sessions
  • Voice-First: Full hands-free interaction with browser-native TTS/STT
  • Multi-Source Search: YouTube, Reddit, Stack Overflow, Wikipedia, Google

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        FRONTEND (Next.js)                       │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐ │
│  │ Voice Input │  │ Chat UI     │  │ Source Cards            │ │
│  │ (Web Speech │  │ (Messages,  │  │ (YouTube, Reddit,       │ │
│  │  API STT)   │  │  Streaming) │  │  StackOverflow links)   │ │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                      BACKEND (FastAPI)                          │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │                    Agent Orchestrator                       ││
│  │  - Query Analysis → Parallel Source Fetching → RAG Response ││
│  └─────────────────────────────────────────────────────────────┘│
│                              │                                   │
│  ┌───────────┬───────────┬───────────┬───────────┬───────────┐ │
│  │ YouTube   │ Reddit    │ Google    │ Wikipedia │StackOver- │ │
│  │ (Apify)   │ (Apify)   │ (Apify)   │ (Free)    │flow (Free)│ │
│  └───────────┴───────────┴───────────┴───────────┴───────────┘ │
│                              │                                   │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │      Memory: SQLite (history) + ChromaDB (vectors)          ││
│  └─────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘
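The parallel source-fetching step in the diagram above can be sketched with `asyncio.gather`. This is an illustrative sketch only: the function names and return shapes below are assumptions, not the actual code in `backend/scrapers/`.

```python
import asyncio

# Hypothetical stand-ins for the real scrapers in backend/scrapers/,
# which would call Apify or the free APIs.
async def fetch_wikipedia(query: str) -> list[dict]:
    return [{"type": "wikipedia", "title": f"Article about {query}"}]

async def fetch_stackoverflow(query: str) -> list[dict]:
    return [{"type": "stackoverflow", "title": f"Q&A about {query}"}]

async def gather_sources(query: str) -> list[dict]:
    # All scrapers run concurrently; with return_exceptions=True a failure
    # in one source does not sink the whole request.
    results = await asyncio.gather(
        fetch_wikipedia(query),
        fetch_stackoverflow(query),
        return_exceptions=True,
    )
    sources = []
    for r in results:
        if isinstance(r, Exception):
            continue  # skip failed sources, keep the rest
        sources.extend(r)
    return sources

sources = asyncio.run(gather_sources("react hooks"))
```

The collected `sources` list would then be passed to Claude as grounding context for the RAG response.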

Prerequisites

You'll need:

  1. Anthropic API Key - Get one at console.anthropic.com
  2. Apify API Token - Get one at console.apify.com

Quick Start

Option 1: Docker (Recommended)

  1. Clone the repository and navigate to it:

    git clone https://github.com/Nagavenkatasai7/learning-agent.git
    cd learning-agent
  2. Copy the environment file and add your API keys:

    cp .env.example .env
    # Edit .env and add your ANTHROPIC_API_KEY and APIFY_API_TOKEN
  3. Start with Docker Compose:

    docker-compose up --build
  4. Open http://localhost:3000 in your browser

Option 2: Manual Setup

Backend Setup

  1. Create a virtual environment:

    cd backend
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  2. Install dependencies:

    pip install -r requirements.txt
  3. Create a .env file in the backend directory:

    cp ../.env.example .env
    # Edit .env and add your API keys
  4. Start the backend:

    uvicorn main:app --reload --port 8000

Frontend Setup

  1. Install dependencies:

    cd frontend
    npm install
  2. Create a .env.local file:

    echo "NEXT_PUBLIC_API_URL=http://localhost:8000" > .env.local
  3. Start the frontend:

    npm run dev
  4. Open http://localhost:3000 in your browser

Usage

Text Input

Type your question in the chat input and press Enter or click the send button.

Voice Input

Click the microphone button to start voice input. Speak your question and it will be transcribed automatically.

Voice Output

Enable "Auto-speak" to have responses read aloud automatically. You can also click the speaker icon on individual messages to have them read.

Example Questions

  • "How do React hooks work?"
  • "What is machine learning?"
  • "Best practices for Python"
  • "Explain async/await in JavaScript"

API Endpoints

Endpoint             Method  Description
/chat                POST    Send a message, get a response with sources
/chat/stream         POST    Streaming response (SSE)
/conversations       GET     List all conversations
/conversations/{id}  GET     Get conversation history
/conversations/{id}  DELETE  Delete a conversation
/health              GET     Health check
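The /chat/stream endpoint delivers the response as Server-Sent Events, which a client consumes by reading `data:` lines from the stream. A minimal parser, assuming plain text chunks in the `data` field (the actual event payload format is defined by the backend and may differ):

```python
def parse_sse(raw: str) -> list[str]:
    # Each SSE event is a "data: ..." line followed by a blank line;
    # collect the data payloads in order.
    chunks = []
    for line in raw.splitlines():
        if line.startswith("data: "):
            chunks.append(line[len("data: "):])
    return chunks

# A tiny example stream, as a client might receive it chunk by chunk.
raw = "data: Based on\n\ndata:  my research\n\n"
chunks = parse_sse(raw)
```

Concatenating the chunks in order reconstructs the streamed response text.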

Request Example

curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "How do I use React hooks?"}'

Response Example

{
  "response": "Based on my research, React hooks allow you to use state and other React features in functional components...",
  "sources": [
    {
      "type": "youtube",
      "title": "React Hooks Tutorial",
      "url": "https://youtube.com/...",
      "snippet": "The useState hook lets you..."
    },
    {
      "type": "stackoverflow",
      "title": "Best practices for useEffect",
      "url": "https://stackoverflow.com/...",
      "snippet": "Always include dependencies..."
    }
  ],
  "conversation_id": "uuid-here",
  "message_id": "uuid-here"
}
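A client can render the source attribution from this payload directly. The snippet below parses the example response above and builds citation lines; the formatting helper is illustrative, not part of the API:

```python
import json

# The /chat response shown above, as a client would receive it.
raw = """{
  "response": "React hooks allow you to use state in functional components...",
  "sources": [
    {"type": "youtube", "title": "React Hooks Tutorial",
     "url": "https://youtube.com/...", "snippet": "The useState hook lets you..."},
    {"type": "stackoverflow", "title": "Best practices for useEffect",
     "url": "https://stackoverflow.com/...", "snippet": "Always include dependencies..."}
  ],
  "conversation_id": "uuid-here",
  "message_id": "uuid-here"
}"""

payload = json.loads(raw)
# One citation line per source, e.g. "[youtube] React Hooks Tutorial - https://..."
citations = [f"[{s['type']}] {s['title']} - {s['url']}" for s in payload["sources"]]
```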

Configuration

See .env.example for all available configuration options:

Variable              Description           Default
ANTHROPIC_API_KEY     Anthropic API key     Required
APIFY_API_TOKEN       Apify API token       Required
CLAUDE_MODEL          Claude model to use   claude-sonnet-4-20250514
MAX_SOURCES_PER_TYPE  Max sources per type  5
MAX_TOTAL_SOURCES     Max total sources     15
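These settings would typically be read from the environment with the defaults above. A sketch of how the backend might load them (the variable names match `.env.example`, but the actual loading code in `backend/` is not shown here):

```python
import os

# Required keys have no default; the app should fail fast if they are unset.
ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY")
APIFY_API_TOKEN = os.environ.get("APIFY_API_TOKEN")

# Optional settings fall back to the documented defaults.
CLAUDE_MODEL = os.environ.get("CLAUDE_MODEL", "claude-sonnet-4-20250514")
MAX_SOURCES_PER_TYPE = int(os.environ.get("MAX_SOURCES_PER_TYPE", "5"))
MAX_TOTAL_SOURCES = int(os.environ.get("MAX_TOTAL_SOURCES", "15"))
```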

Project Structure

learning-agent/
├── frontend/                    # Next.js application
│   ├── app/                    # App router pages
│   ├── components/             # React components
│   ├── hooks/                  # Custom React hooks
│   └── lib/                    # API client
│
├── backend/                     # FastAPI application
│   ├── agent/                  # Agent orchestrator
│   ├── scrapers/               # Source scrapers
│   ├── memory/                 # SQLite + ChromaDB
│   └── models/                 # Pydantic schemas
│
├── docker-compose.yml          # Docker setup
├── .env.example                # Environment template
└── README.md                   # This file

Troubleshooting

"Voice input not working"

  • Ensure you're using a supported browser (Chrome, Edge, Safari)
  • Grant microphone permissions when prompted
  • Check if another application is using the microphone

"API connection failed"

  • Verify the backend is running on port 8000
  • Check that CORS is properly configured
  • Ensure your API keys are correct

"No sources found"

  • Some queries may not return results from all sources
  • Try rephrasing your question
  • Check your Apify API token quota

License

MIT License - see LICENSE file for details.
