A document question-answering system built on Docling and LangGraph, with a multi-agent workflow that researches and verifies each answer.
- Multi-Agent Workflow: Research, verification, and relevance checking agents
- Document Processing: Support for PDF, DOCX, TXT, and MD files
- Hybrid Retrieval: Combines dense and sparse retrieval for better results
- Answer Verification: Built-in fact-checking of generated answers
- Modern Chat Interface: Powered by Chainlit
- Session Management: Efficient document caching and state management
- Python 3.10+ (recommended)
- uv (for fast dependency management)
- Clone the repository:
      git clone <repository-url>
      cd Multi-Agent-RAG
- Install dependencies using uv:
      uv pip install -e .
  If you don't have uv installed, get it with:
      pip install uv
Start the Chainlit chat interface:
    chainlit run chainlit.py
The interface will be available at http://localhost:8000.
- Upload your documents (PDF, DOCX, TXT, MD) in the chat interface.
- The system will process and index them automatically.
- Documents are cached for efficient reuse.
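One way to picture the caching step is a lookup keyed by a hash of the file contents, so that re-uploading the same file skips processing. This is a minimal sketch under that assumption; `DocumentCache` and `get_or_process` are hypothetical names, and the project's actual cache may use a different key or storage.

```python
import hashlib
from typing import Callable, Dict, List


class DocumentCache:
    """In-memory cache keyed by the SHA-256 digest of a file's bytes (hypothetical helper)."""

    def __init__(self) -> None:
        self._store: Dict[str, List[str]] = {}

    def get_or_process(
        self, content: bytes, process: Callable[[bytes], List[str]]
    ) -> List[str]:
        # Identical bytes always produce the same key, regardless of file name.
        key = hashlib.sha256(content).hexdigest()
        if key not in self._store:
            # Only run the (potentially expensive) processing on a cache miss.
            self._store[key] = process(content)
        return self._store[key]
```

Keying on content rather than file name means a renamed copy of the same document still hits the cache.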
- Type your question in the chat.
- The system will:
- Check document relevance
- Research the answer using retrieved documents
- Verify the answer for accuracy
- Provide both the answer and verification report
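The steps above can be sketched as a plain Python pipeline. This is an illustration only: the real workflow is orchestrated with LangGraph in agent/workflow.py, and every name below (`run_workflow`, `WorkflowResult`, the toy agent functions) is an assumption made for the example, with simple keyword matching standing in for the LLM-backed agents.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class WorkflowResult:
    answer: str
    verification_report: str


def run_workflow(
    question: str,
    documents: List[str],
    is_relevant: Callable[[str, List[str]], bool],
    research: Callable[[str, List[str]], str],
    verify: Callable[[str, List[str]], str],
) -> WorkflowResult:
    """Run relevance check, research, and verification in order."""
    # Step 1: stop early if the documents cannot answer the question.
    if not is_relevant(question, documents):
        return WorkflowResult(
            answer="The uploaded documents do not appear to cover this question.",
            verification_report="Skipped: no relevant documents.",
        )
    # Step 2: draft an answer from the retrieved documents.
    draft = research(question, documents)
    # Step 3: check the draft against the same documents.
    report = verify(draft, documents)
    return WorkflowResult(answer=draft, verification_report=report)


# Toy stand-ins for the agents (hypothetical, keyword matching only).
def keyword_relevance(question: str, documents: List[str]) -> bool:
    terms = [w.strip("?.,").lower() for w in question.split() if len(w) > 3]
    return any(term in doc.lower() for doc in documents for term in terms)


def echo_research(question: str, documents: List[str]) -> str:
    # Returns the first document verbatim as the "answer".
    return documents[0]


def substring_verify(answer: str, documents: List[str]) -> str:
    supported = any(answer in doc for doc in documents)
    return "Supported" if supported else "Unsupported"
```

Passing the agents in as callables keeps the control flow testable without any model calls.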
- Relevance Checker: Determines if documents can answer the question
- Research Agent: Generates initial answers from retrieved documents
- Verification Agent: Checks answer accuracy and provides verification report
- File Handler: Processes various document formats
- Chunking: Splits documents into manageable chunks
- Embedding: Creates vector representations for retrieval
- Hybrid Retriever: Combines dense (vector) and sparse (BM25) retrieval
- ChromaDB: Vector database for document storage
- Ensemble: Merges results from multiple retrieval methods
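To illustrate how results from a dense and a sparse retriever can be merged, here is a sketch of weighted reciprocal rank fusion, one common ensemble technique. It is not confirmed to be the method this project uses; the function name, the weights, and the constant `k = 60` are assumptions for the example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[List[str]],
    weights: Sequence[float],
    k: int = 60,
) -> List[str]:
    """Fuse several ranked lists of document IDs into a single ranking."""
    if len(rankings) != len(weights):
        raise ValueError("rankings and weights must have the same length")
    scores: Dict[str, float] = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            # Higher-ranked documents contribute more; k damps the top ranks.
            scores[doc_id] += weight / (k + rank)
    # Sort by descending score; break ties by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense_hits = ["chunk-3", "chunk-1", "chunk-7"]   # e.g. from vector search
sparse_hits = ["chunk-1", "chunk-9", "chunk-3"]  # e.g. from BM25
fused = reciprocal_rank_fusion([dense_hits, sparse_hits], weights=[0.5, 0.5])
print(fused)  # ['chunk-1', 'chunk-3', 'chunk-9', 'chunk-7']
```

Documents that appear in both lists rise to the top, which is the main benefit of combining the two retrieval methods.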
Settings are managed through the config/ directory:
- constants.py: System constants and file type definitions
- settings.py: Environment-specific settings
- PDF (.pdf)
- Word documents (.docx)
- Text files (.txt)
- Markdown files (.md)
Multi-Agent-RAG/
├── agent/ # Multi-agent workflow
│ ├── workflow.py # Main workflow orchestration
│ ├── research_agent.py # Research agent implementation
│ ├── verification_agent.py # Verification agent
│ └── relevance_checker.py # Relevance checking
├── retriever/ # Document retrieval system
│ ├── builder.py # Retriever construction
│ └── file_handler.py # Document processing
├── config/ # Configuration files
├── utils/ # Utility functions
├── chainlit.py # Chainlit interface (entrypoint)
├── pyproject.toml # Project dependencies
└── uv.lock # uv dependency lockfile
- Create a new agent class in the agent/ directory
- Implement the required interface methods
- Add the agent to the workflow in agent/workflow.py
- Add the new file type to constants.ALLOWED_TYPES
- Implement processing logic in retriever/file_handler.py
- Update the interface validation
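As a rough sketch of what extension-based validation could look like, the example below checks uploaded file names against a set of allowed suffixes. The `ALLOWED_TYPES` set here is a hypothetical stand-in for `constants.ALLOWED_TYPES`, whose real shape is not shown in this README, and `split_by_support` is an invented helper.

```python
from pathlib import Path
from typing import AbstractSet, Iterable, List, Tuple

# Hypothetical stand-in for constants.ALLOWED_TYPES; the real definition may differ.
ALLOWED_TYPES = frozenset({".pdf", ".docx", ".txt", ".md"})


def split_by_support(
    paths: Iterable[str],
    allowed: AbstractSet[str] = ALLOWED_TYPES,
) -> Tuple[List[str], List[str]]:
    """Separate file names into (accepted, rejected) by extension."""
    accepted: List[str] = []
    rejected: List[str] = []
    for path in paths:
        # Compare case-insensitively so "report.PDF" is treated like "report.pdf".
        suffix = Path(path).suffix.lower()
        (accepted if suffix in allowed else rejected).append(path)
    return accepted, rejected
```

Under this assumption, supporting a new format starts with extending the set, for example `split_by_support(files, ALLOWED_TYPES | {".html"})`, before the matching processing logic is added.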
- Document Processing Errors: Ensure files are not corrupted and in supported formats
- Memory Issues: Large documents may require more memory allocation
- Retrieval Performance: Consider adjusting chunk sizes or retrieval parameters
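For the retrieval-performance item above, the effect of chunk size and overlap is easiest to see with a small character-based splitter. This is an illustrative sketch, not the project's chunker; `chunk_text` and its default values are assumptions, and real splitters usually respect sentence or section boundaries.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character chunks that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")
    step = chunk_size - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text, so no
        # trailing chunk consists solely of already-covered overlap.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks
```

Smaller chunks tend to give more precise matches but less context per hit, while larger overlap reduces the chance of splitting a relevant passage at the cost of more stored chunks.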
The system uses structured logging. Check logs for detailed error information:
tail -f chainlit.log # For Chainlit interface