rag-based-qa-system

RAG-Based Question Answering System

Objective

Build an API that allows users to upload documents and ask questions using a Retrieval-Augmented Generation (RAG) approach.

Architecture Overview

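The system ingests documents asynchronously, splits the text into chunks, generates an embedding for each chunk, and stores the embeddings in a FAISS index. At query time it retrieves the chunks most relevant to the question and generates an answer from them.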
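The retrieval step can be illustrated with a minimal sketch. It uses a brute-force, pure-Python index as a stand-in for FAISS (it is not the FAISS API), and the class name `FlatIndex`, the 2-dimensional vectors, and the chunk labels are all hypothetical. Real embeddings would come from an embedding model.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


class FlatIndex:
    """Exact inner-product search over normalized vectors (stand-in for FAISS)."""

    def __init__(self):
        self.vectors = []
        self.chunks = []

    def add(self, vector, chunk):
        self.vectors.append(normalize(vector))
        self.chunks.append(chunk)

    def search(self, query, k=3):
        """Return the k (score, chunk) pairs with the highest similarity."""
        q = normalize(query)
        scored = [
            (sum(a * b for a, b in zip(q, v)), chunk)
            for v, chunk in zip(self.vectors, self.chunks)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


# Toy 2-dimensional "embeddings" (assumed values, for illustration only).
index = FlatIndex()
index.add([1.0, 0.0], "chunk about chunking")
index.add([0.0, 1.0], "chunk about latency")
index.add([0.7, 0.7], "chunk about both")
top = index.search([0.9, 0.1], k=2)
```

The retrieved chunks in `top` would then be passed to the LLM as context for answer generation.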

Chunking Strategy

Chunk size chosen: 500 tokens
Overlap: 100 tokens

This chunk size was selected to balance semantic coherence and retrieval accuracy. Smaller chunks improve recall but often lose surrounding context, while very large chunks mix several topics into one embedding and reduce retrieval precision. A size of 500 tokens preserves paragraph-level meaning and fits comfortably within LLM context limits. The 100-token overlap means that text near a chunk boundary appears in both neighboring chunks, so information at the boundary is not lost.
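A minimal sketch of overlapping chunking with these parameters follows. The function name `chunk_tokens` is hypothetical, and the token list is synthetic; the tokenizer actually used by the project is not stated here.

```python
def chunk_tokens(tokens, chunk_size=500, overlap=100):
    """Split a token list into windows of chunk_size that share `overlap` tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # 400 new tokens per window with the defaults
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


# Synthetic tokens stand in for real tokenizer output (an assumption).
tokens = [f"tok{i}" for i in range(1200)]
chunks = chunk_tokens(tokens)
```

For a 1,200-token document this produces three chunks of 500, 500, and 400 tokens, and the last 100 tokens of each chunk are repeated at the start of the next.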

Retrieval Failure Case

A retrieval failure was observed when asking high-level conceptual questions such as “What is the motivation behind the algorithm?” In such cases, the retriever returned implementation-focused chunks instead of conceptual explanations.

This likely happened because embedding similarity favors technical terms and keyword-rich sections, so conceptual passages written in plainer language score lower. The issue can be mitigated by increasing the top-k retrieval size or by combining keyword-based (BM25) and embedding-based retrieval.
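One common way to combine a keyword ranking with an embedding ranking is reciprocal rank fusion (RRF). The sketch below is an illustration of that technique, not the project's implementation; the chunk ids and both input rankings are hypothetical, and `k=60` is a commonly used default.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk ids into one ranking."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by id for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical rankings: the embedding retriever favors implementation chunks,
# while the keyword (BM25) retriever surfaces the conceptual chunk first.
embedding_ranking = ["impl_1", "impl_2", "motivation"]
bm25_ranking = ["motivation", "impl_1", "intro"]
fused = reciprocal_rank_fusion([embedding_ranking, bm25_ranking])
```

In this toy case the conceptual chunk moves from third place in the embedding-only ranking to second place in the fused ranking.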

Metric Tracked

Metric: End-to-end query latency

Latency was tracked to measure user-perceived performance. On average:

Query embedding took ~20 ms
FAISS similarity search took ~5 ms
LLM response generation took ~900 ms

Of the three stages measured (~925 ms combined), LLM response generation accounts for roughly 97%, which identified the LLM as the primary bottleneck in the system.
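Per-stage timings like these can be collected with a small context manager. This is a sketch under assumptions: the helper names `timed` and `slowest_stage` are hypothetical, and the `time.sleep` calls stand in for the real embedding, search, and generation steps.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(stage, timings):
    """Record the wall-clock duration of a block, in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = (time.perf_counter() - start) * 1000.0


def slowest_stage(timings):
    """Return the slowest stage and its share of the summed stage time."""
    stage = max(timings, key=timings.get)
    return stage, timings[stage] / sum(timings.values())


# Placeholder work; replace each sleep with the real pipeline call.
measured = {}
with timed("embedding", measured):
    time.sleep(0.002)
with timed("search", measured):
    time.sleep(0.001)
with timed("generation", measured):
    time.sleep(0.005)

# Applying the helper to the averages reported above.
reported = {"embedding": 20, "search": 5, "generation": 900}
stage, share = slowest_stage(reported)
```

With the reported averages, `stage` is `"generation"` and `share` is 900 / 925, or about 0.97.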

API Endpoints

POST /upload – Upload a PDF or TXT document
POST /query – Ask a question about the uploaded documents
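A client call to the query endpoint might look like the sketch below. The request schema is not documented here, so the JSON field name `question` is an assumption, as is the helper name `build_query_request`. The base URL is uvicorn's default host and port.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # uvicorn's default host and port


def build_query_request(question, base_url=BASE_URL):
    """Build (but do not send) a POST /query request with a JSON body."""
    # The "question" field name is assumed; check the API schema before use.
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_query_request("What is the motivation behind the algorithm?")

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```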

Setup

pip install -r requirements.txt
uvicorn app.main:app --reload

