RAG Document Question Answering API

Overview

This project implements a Retrieval-Augmented Generation (RAG) system using FastAPI.

Users can:

  • Upload PDF documents
  • Ask questions grounded strictly in document content
  • Receive context-aware answers produced by semantic search plus an LLM

The system performs:

  1. PDF text extraction
  2. Text chunking
  3. Embedding generation (Sentence Transformers)
  4. Vector storage using ChromaDB
  5. Similarity retrieval
  6. Context-grounded response generation using Groq LLM
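
As a rough illustration of step 2 above, the sketch below splits extracted text into overlapping character windows. The function name `chunk_text` and the `chunk_size` and `overlap` defaults are assumptions made for this example; the project's actual chunking strategy and parameters may differ.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character windows of at most chunk_size characters.

    Consecutive windows share `overlap` characters so that a sentence cut
    at a boundary still appears whole in at least one chunk.
    (Hypothetical helper; sizes are illustrative, not the project's values.)
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(piece)
```

Overlap trades a little extra storage for better recall: a larger overlap makes it less likely that the passage answering a question is split across two chunks.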

The entire system is containerized using Docker for reproducible deployment.

Architecture

User → FastAPI → Chunking → Embeddings → ChromaDB → Retrieve → LLM → Response
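
The "Retrieve → LLM" part of the flow above can be sketched without any external service. The example below is a toy stand-in, not the project's code: it ranks stored chunks by cosine similarity against a query vector and assembles a context-grounded prompt. In the real system the vectors would come from Sentence Transformers and the search would be delegated to ChromaDB; the names `retrieve` and `build_prompt`, the tiny hand-written vectors, and the prompt wording are all assumptions for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec: list[float],
             store: list[tuple[str, list[float]]],
             top_k: int = 2) -> list[str]:
    """Return the top_k chunk texts most similar to query_vec.

    `store` is a list of (chunk_text, embedding) pairs, standing in for
    what a vector database would hold.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble a prompt that asks the LLM to answer only from the context."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    # Hand-written 2-D vectors stand in for real embeddings.
    store = [("chunk A", [1.0, 0.0]), ("chunk B", [0.0, 1.0]), ("chunk C", [1.0, 1.0])]
    top = retrieve([1.0, 0.0], store, top_k=2)
    print(build_prompt("What does the document say?", top))
```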

Tech Stack

  • FastAPI
  • Uvicorn
  • Sentence Transformers
  • ChromaDB (vector database)
  • Groq LLM API
  • Docker

Quick Start

Option 1 — Run with Docker

Install Docker

Download and install Docker Desktop from the official Docker website, then:

  • Verify the installation from the terminal/cmd:

docker --version

  • Build the image:

docker build -t rag-api .

  • Run the container:

export GROQ_API_KEY="your_api_key_here"
docker run -p 8000:8000 -e GROQ_API_KEY=$GROQ_API_KEY rag-api

  • Open in browser:

http://localhost:8000/docs

Option 2 — Run Locally (Without Docker)

  • Create virtual environment:

python -m venv .venv
source .venv/bin/activate

  • Install dependencies:

pip install -r requirements.txt

  • Run server:

export GROQ_API_KEY="your_api_key_here"
uvicorn app:app --reload

  • Open:

http://127.0.0.1:8000/docs


API Endpoints

POST /documents
Upload and index a PDF.

POST /chat
Ask questions grounded in the uploaded document.
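
A client call to `POST /chat` might look like the sketch below, which uses only the Python standard library. The request schema is not documented here, so the JSON field name `question` and the helper `build_chat_request` are assumptions; check the interactive docs at `/docs` for the actual request body before relying on it.

```python
import json
import urllib.request

# Assumed base URL, matching the Quick Start commands.
BASE_URL = "http://localhost:8000"


def build_chat_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the /chat endpoint.

    The {"question": ...} body is a hypothetical schema used for illustration.
    """
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_chat_request("What is the main topic of the document?")
    # Sending requires the server to be running and a document to be indexed.
    with urllib.request.urlopen(req) as resp:
        print(resp.read().decode("utf-8"))
```

Uploading a PDF to `POST /documents` would typically be a multipart file upload, which is easier to try from the `/docs` page than to hand-build with the standard library.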

Environment Variable

This project requires:

GROQ_API_KEY

Do NOT hardcode it in the source code; supply it through the environment, as in the Quick Start commands.

Deployment

The application is fully Dockerized and ready for deployment to:

  • Google Cloud Run
  • AWS ECS
  • Azure Container Apps
  • Any container-based platform

Key Concepts Demonstrated

  • Retrieval-Augmented Generation (RAG)
  • Semantic Search
  • Vector Databases
  • Embedding-based Similarity
  • Backend API Design
  • Containerization with Docker