RAG-based Question Answering System

A question-answering system that uses Retrieval-Augmented Generation (RAG) to answer questions based on custom documents. Built with LangChain, a FAISS vector index, and the Llama 3.1 model served through Groq's API.

New Features ✨

Web UI: Streamlit interface for uploading documents and asking questions
PDF Support: Now supports both TXT and PDF files
Better Error Handling: Clear error messages and validation
Evaluation Metrics: Relevance scoring for answers
Docker Ready: Easy deployment with Docker/Docker Compose

Overview

This project implements a RAG pipeline that ingests TXT and PDF documents, creates embeddings, and answers questions based on their content. Instead of relying only on the LLM's general knowledge, it first retrieves the most relevant passages from your documents and then generates an answer grounded in them.

Features

Document ingestion with automatic text chunking
Support for TXT and PDF files
Vector similarity search using FAISS
LLM inference on the Groq API free tier (Llama 3.1 8B)
Local embeddings with HuggingFace transformers
Query logging with relevance scoring
Web interface using Streamlit
Docker containerization for easy deployment

Demo

Web Interface

Modern web interface built with Streamlit for easy interaction

Example Query

The system retrieves relevant context from the documents and generates an answer grounded in that context.

Document Ingestion Process

Documents are loaded, split into chunks, and converted to vector embeddings.

How It Works

The system operates in three steps:

Ingestion: Documents are split into chunks and converted to vector embeddings
Retrieval: When you ask a question, the system finds the most relevant chunks
Generation: The LLM uses these chunks as context to generate an answer
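
The three steps above can be sketched in a few lines of plain Python. This is a toy illustration, not the project's code: word-overlap scoring stands in for FAISS similarity search over embeddings, the chunking is a plain fixed-size split, and the function names (ingest, retrieve, build_prompt) are hypothetical. The prompt it prints is roughly what would be sent to the LLM in the generation step.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def ingest(documents: list[str], chunk_size: int = 80) -> list[str]:
    """Step 1: split each document into fixed-size character chunks."""
    chunks = []
    for doc in documents:
        for start in range(0, len(doc), chunk_size):
            chunks.append(doc[start:start + chunk_size])
    return chunks


def retrieve(question: str, chunks: list[str], k: int = 3) -> list[str]:
    """Step 2: rank chunks by words shared with the question (toy scoring)."""
    q_tokens = tokenize(question)
    ranked = sorted(chunks, key=lambda c: len(q_tokens & tokenize(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Step 3: assemble the prompt that the LLM would receive."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Made-up documents, short enough to fit in one chunk each.
docs = [
    "Quantum computing uses qubits to represent information.",
    "Blockchain is a distributed ledger shared across many nodes.",
]
chunks = ingest(docs)
top = retrieve("What is quantum computing?", chunks, k=1)
prompt = build_prompt("What is quantum computing?", top)
print(prompt)
```

With k=1 only the quantum computing chunk ends up in the prompt, which is the point of retrieval: the LLM sees the relevant passage rather than the whole document set.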

Installation

Clone the repository:

git clone https://github.com/satzgits/RAG-based-Question-Answering-System-.git
cd RAG-based-Question-Answering-System-

Install dependencies:

pip install -r requirements.txt

Set up your API key:

Get a free API key from console.groq.com
Create a .env file in the project root
Add your key: GROQ_API_KEY=your_key_here
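
To fail early with a clear message when the key is missing, a startup check along these lines can help. This is a minimal sketch using only the standard library: the helper names (read_env_file, load_groq_api_key) are hypothetical, and the project itself may load .env differently, for example with python-dotenv.

```python
import os
from pathlib import Path


def read_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    if not path.exists():
        return values
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_groq_api_key(env_path: str = ".env") -> str:
    """Return GROQ_API_KEY from the environment or the .env file, or raise."""
    key = os.environ.get("GROQ_API_KEY") or read_env_file(Path(env_path)).get("GROQ_API_KEY")
    # Treat the placeholder from the instructions above as "not set".
    if not key or key == "your_key_here":
        raise RuntimeError(
            "GROQ_API_KEY is not set. Add it to .env as GROQ_API_KEY=<your key>."
        )
    return key
```

A value already present in the process environment takes precedence over the file, so the same check works inside a container where the key is passed as an environment variable.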

Usage

Option 1: Web Interface (Recommended)

Start the Streamlit app:

streamlit run app.py

Then open your browser to http://localhost:8501

Features:

Upload documents through the web interface
Process documents with a click
Ask questions and see results instantly
View source documents

Option 2: Command Line

Step 1: Ingest your documents

Place your .txt or .pdf files in the data/ folder, then run:

python src/main.py --ingest

Step 2: Ask questions

python src/main.py --query "What is quantum computing?"

Sample Queries

The repository includes demo documents on various topics. Here are some questions you can try:

python src/main.py --query "What is the difference between narrow AI and general AI?"
python src/main.py --query "How does blockchain work?"
python src/main.py --query "What are the main causes of climate change?"

Docker Deployment

Using Docker Compose (Easiest)

docker-compose up

Then visit http://localhost:8501

Using Docker

Build the image:

docker build -t rag-qa-system .

Run the container:

docker run -p 8501:8501 -v $(pwd)/data:/app/data -v $(pwd)/.env:/app/.env rag-qa-system

Project Structure

├── app.py                    # Streamlit web interface
├── data/                     # Place your documents here
├── src/
│   ├── ingest.py            # Document processing and vectorization
│   ├── rag_pipeline.py      # RAG chain implementation
│   └── main.py              # Command-line interface
├── faiss_index/             # Generated vector database
├── screenshots/             # Demo screenshots
├── rag_evaluation.log       # Query logs with metrics
├── requirements.txt         # Python dependencies
├── Dockerfile               # Docker configuration
├── docker-compose.yml       # Docker Compose setup
└── README.md

Technical Details

Chunking Strategy

Documents are split using RecursiveCharacterTextSplitter with:

Chunk size: 500 characters
Overlap: 50 characters

I experimented with chunk sizes of 200, 500, and 1000 characters and found 500 to be a good balance: large enough to keep the surrounding context, small enough that each retrieved chunk stays focused on the question.
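
To make the two settings concrete, here is a simplified fixed-window splitter with the same chunk size and overlap. It is not RecursiveCharacterTextSplitter, which additionally tries to break on separators such as paragraph and line breaks before falling back to character counts; chunk_text is a hypothetical name used only for this illustration.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # with the defaults, a new chunk starts every 450 characters
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# 1200 characters of filler text, just to show the resulting chunk lengths.
text = "".join(str(i % 10) for i in range(1200))
chunks = chunk_text(text)
print([len(c) for c in chunks])  # [500, 500, 300]
```

The last 50 characters of each chunk are repeated at the start of the next one, so a sentence that straddles a boundary still appears whole in at least one chunk when it is shorter than the overlap.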

Retrieval Settings

The retriever fetches the top 3 most similar chunks for each query. This can be adjusted in src/rag_pipeline.py:

retriever = db.as_retriever(search_kwargs={'k': 3})
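
To illustrate what k controls, the sketch below ranks a handful of chunks against a query vector and keeps the top k. It is a pure-Python stand-in for what the FAISS index does far more efficiently, with assumptions stated plainly: the vectors are made up, cosine similarity is used as one common choice (the metric the project's index actually uses is not stated here), and top_k is a hypothetical helper.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], chunk_vecs: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the ids of the k chunks most similar to the query vector."""
    ranked = sorted(
        chunk_vecs,
        key=lambda cid: cosine(query_vec, chunk_vecs[cid]),
        reverse=True,
    )
    return ranked[:k]


# Made-up 2-D vectors purely for illustration; real embeddings have hundreds of dimensions.
chunk_vecs = {
    "chunk-a": [1.0, 0.0],
    "chunk-b": [0.8, 0.6],
    "chunk-c": [0.0, 1.0],
    "chunk-d": [-1.0, 0.0],
}
query = [0.9, 0.1]
print(top_k(query, chunk_vecs, k=3))  # ['chunk-a', 'chunk-b', 'chunk-c']
```

A larger k gives the LLM more context at the cost of a longer prompt and a higher chance of including loosely related chunks; a smaller k does the opposite.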

Evaluation

All queries are logged to rag_evaluation.log with:

Timestamps
Questions and answers
Source documents
Relevance scores

This helps track performance and identify cases where the system doesn't have enough context.
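
As an illustration of what one log entry could contain, the sketch below appends a JSON line per query. Everything specific in it is an assumption: the JSON-lines layout, the field names, the log_query helper, and the placeholder relevance score (the fraction of question words that appear in the retrieved sources) are hypothetical and not taken from the project's actual logging code.

```python
import json
import re
from datetime import datetime, timezone


def relevance_score(question: str, sources: list[str]) -> float:
    """Placeholder metric: share of question words found in the retrieved sources."""
    q_words = set(re.findall(r"[a-z0-9]+", question.lower()))
    if not q_words:
        return 0.0
    source_words = set(re.findall(r"[a-z0-9]+", " ".join(sources).lower()))
    return len(q_words & source_words) / len(q_words)


def log_query(question: str, answer: str, sources: list[str],
              path: str = "rag_evaluation.log") -> dict:
    """Append one JSON record per query and return the record that was written."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "question": question,
        "answer": answer,
        "sources": sources,
        "relevance": round(relevance_score(question, sources), 2),
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record
```

A low score on an entry is a hint that the retrieved chunks share little vocabulary with the question, which is one way to spot queries the documents do not cover.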

Technologies Used

LangChain: Framework for building the RAG pipeline
FAISS: Facebook's library for efficient similarity search
Groq: Fast LLM inference API (free tier available)
HuggingFace: Sentence transformers for local embeddings
Streamlit: Modern web interface
Docker: Containerization for easy deployment
Python 3.x

Improvements Made

Compared with the original project, I added:

PDF Support: Can now process PDF files in addition to text files
Web Interface: User-friendly Streamlit UI with file upload
Better Error Handling: Clear error messages for common issues
Evaluation Metrics: Relevance scoring for answer quality
Docker Support: Easy deployment and containerization
Enhanced Logging: More detailed logging for debugging

Limitations and Future Improvements

Current limitations:

The embedding model runs locally (can be slow on the first run)
No re-ranking of retrieved documents
Simple relevance scoring

Possible future enhancements:

Add hybrid search (keyword + semantic)
Implement document re-ranking
Fine-tune embeddings on domain-specific data
Add conversation history support
Multi-language support

License

MIT License
