A question-answering system that uses Retrieval-Augmented Generation (RAG) to answer questions based on custom documents. Built with LangChain, FAISS vector database, and Groq's Llama 3.1 model.
- Web UI: Beautiful Streamlit interface for easy interaction
- PDF Support: Now supports both TXT and PDF files
- Better Error Handling: Clear error messages and validation
- Evaluation Metrics: Relevance scoring for answers
- Docker Ready: Easy deployment with Docker/Docker Compose
This project implements a RAG pipeline that ingests TXT and PDF documents, creates embeddings, and answers questions based on their content. Instead of relying only on the LLM's general knowledge, it first retrieves relevant passages from your documents, then generates an answer grounded in them.
- Document ingestion with automatic text chunking
- Support for TXT and PDF files
- Vector similarity search using FAISS
- Free LLM inference using Groq API (Llama 3.1-8B)
- Local embeddings with HuggingFace transformers
- Query logging with relevance scoring
- Web interface using Streamlit
- Docker containerization for easy deployment
Modern web interface built with Streamlit for easy interaction
The system retrieves relevant context from documents and generates accurate answers.
Documents are loaded, split into chunks, and converted to vector embeddings.
The system operates in three steps:
- Ingestion: Documents are split into chunks and converted to vector embeddings
- Retrieval: When you ask a question, the system finds the most relevant chunks
- Generation: The LLM uses these chunks as context to generate an answer
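The three steps above can be sketched in plain Python. This is a toy illustration, not the project's code: the bag-of-words `embed` function stands in for the HuggingFace embedding model, the in-memory list stands in for the FAISS index, and the function names and sample sentences are made up for the example. The generation step is shown only as prompt construction, since the actual LLM call depends on the Groq client.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ingest(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Step 1 (ingestion): pair every chunk with its vector."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index: list[tuple[str, Counter]], question: str, k: int = 3) -> list[str]:
    """Step 2 (retrieval): return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, context: list[str]) -> str:
    """Step 3 (generation, input side): the text an LLM would receive."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


# Made-up sample chunks, for illustration only.
index = ingest([
    "Blockchain stores transactions in linked blocks.",
    "Greenhouse gases are a main cause of climate change.",
    "Narrow AI handles one task; general AI would handle any task.",
])
top = retrieve(index, "What causes climate change?", k=1)
prompt = build_prompt("What causes climate change?", top)
```

A real embedding model matches on meaning rather than shared words, so it would also rank paraphrased questions correctly; the toy version only works when the question and the chunk share vocabulary.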
Clone the repository:
git clone https://github.com/satzgits/RAG-based-Question-Answering-System-.git
cd RAG-based-Question-Answering-System-
Install dependencies:
pip install -r requirements.txt
Set up your API key:
- Get a free API key from console.groq.com
- Create a .env file in the project root
- Add your key:
GROQ_API_KEY=your_key_here
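A missing or placeholder key is a common setup mistake, so it can help to check for it before making any API call. The sketch below is an assumption about how such a check might look, not the project's actual code; the function name `get_groq_api_key` is hypothetical. It reads from the process environment, so the .env file must be loaded first (for example with python-dotenv's `load_dotenv()`, if the project uses that package).

```python
import os
from typing import Mapping


def get_groq_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return GROQ_API_KEY, or raise a clear error if it is missing or a placeholder."""
    key = env.get("GROQ_API_KEY", "").strip()
    if not key or key == "your_key_here":
        raise RuntimeError(
            "GROQ_API_KEY is not set. Add it to the .env file in the project root."
        )
    return key
```

Passing the environment as a parameter keeps the check easy to test without touching the real environment.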
Start the Streamlit app:
streamlit run app.py
Then open your browser to http://localhost:8501
Features:
- Upload documents through the web interface
- Process documents with a click
- Ask questions and see results instantly
- View source documents
Step 1: Ingest your documents
Place your .txt or .pdf files in the data/ folder, then run:
python src/main.py --ingest
Step 2: Ask questions
python src/main.py --query "What is quantum computing?"
The repository includes demo documents on various topics. Here are some questions you can try:
python src/main.py --query "What is the difference between narrow AI and general AI?"
python src/main.py --query "How does blockchain work?"
python src/main.py --query "What are the main causes of climate change?"
Run with Docker Compose:
docker-compose up
Then visit http://localhost:8501
Build the image:
docker build -t rag-qa-system .
Run the container:
docker run -p 8501:8501 -v $(pwd)/data:/app/data -v $(pwd)/.env:/app/.env rag-qa-system
├── app.py # Streamlit web interface
├── data/ # Place your documents here
├── src/
│ ├── ingest.py # Document processing and vectorization
│ ├── rag_pipeline.py # RAG chain implementation
│ └── main.py # Command-line interface
├── faiss_index/ # Generated vector database
├── screenshots/ # Demo screenshots
├── rag_evaluation.log # Query logs with metrics
├── requirements.txt # Python dependencies
├── Dockerfile # Docker configuration
├── docker-compose.yml # Docker Compose setup
└── README.md
Documents are split using RecursiveCharacterTextSplitter with:
- Chunk size: 500 characters
- Overlap: 50 characters
I experimented with different chunk sizes (200, 500, 1000) and found 500 to be a good balance between context and relevance.
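To make the size and overlap settings concrete, here is a simplified fixed-window chunker. It is only an approximation: RecursiveCharacterTextSplitter prefers to break on separators such as paragraphs and sentences, whereas this sketch cuts at exact character offsets. The function name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # each new chunk starts 450 characters after the last
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks
```

With the defaults, a 1,000-character document yields three chunks starting at offsets 0, 450, and 900, and the last 50 characters of each chunk reappear at the start of the next one. The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk.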
The retriever fetches the top 3 most similar chunks for each query. This can be adjusted in src/rag_pipeline.py:
retriever = db.as_retriever(search_kwargs={'k': 3})
All queries are logged to rag_evaluation.log with:
- Timestamps
- Questions and answers
- Source documents
- Relevance scores
This helps track performance and identify cases where the system doesn't have enough context.
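A log entry with these fields could be built as sketched below. The README does not specify how the relevance score is computed, so the word-overlap measure here is purely an assumption, and the names `overlap_score` and `format_log_entry` and the JSON layout are hypothetical rather than the project's actual format.

```python
import json
import re
from datetime import datetime, timezone


def _words(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(answer: str, sources: list[str]) -> float:
    """Assumed relevance measure: fraction of answer words that appear in the sources."""
    answer_words = _words(answer)
    if not answer_words:
        return 0.0
    source_words = set().union(*(_words(s) for s in sources))
    return len(answer_words & source_words) / len(answer_words)


def format_log_entry(question: str, answer: str, sources: list[str]) -> str:
    """Serialize one query as a single JSON line with a UTC timestamp."""
    return json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "question": question,
        "answer": answer,
        "sources": sources,
        "relevance": round(overlap_score(answer, sources), 2),
    })
```

Each entry can then be written with the standard `logging` module, for example after `logging.basicConfig(filename="rag_evaluation.log", level=logging.INFO)`. A low score suggests the answer contains words the retrieved chunks do not, which is one way to spot queries the documents cannot answer.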
- LangChain: Framework for building the RAG pipeline
- FAISS: Facebook's library for efficient similarity search
- Groq: Fast LLM inference API (free tier available)
- HuggingFace: Sentence transformers for local embeddings
- Streamlit: Modern web interface
- Docker: Containerization for easy deployment
- Python 3.x
From the original project, I added:
- PDF Support: Can now process PDF files in addition to text files
- Web Interface: User-friendly Streamlit UI with file upload
- Better Error Handling: Clear error messages for common issues
- Evaluation Metrics: Relevance scoring for answer quality
- Docker Support: Easy deployment and containerization
- Enhanced Logging: More detailed logging for debugging
Current limitations:
- The embedding model runs locally (the first run can be slow)
- No re-ranking of retrieved documents
- Simple relevance scoring
Possible future enhancements:
- Add hybrid search (keyword + semantic)
- Implement document re-ranking
- Fine-tune embeddings on domain-specific data
- Add conversation history support
- Multi-language support
MIT License