Combining document retrieval with large language models to deliver accurate, source-grounded responses
The Marbet AI Event Assistant is a Retrieval-Augmented Generation (RAG) system that provides accurate, context-aware responses to user queries by leveraging a curated collection of event documents. The system grounds every response in the provided source material, which greatly reduces the hallucinations common in standalone language models.
Built with modularity and flexibility in mind, the system supports both local and cloud-based language models, making it suitable for various deployment scenarios and security requirements.
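Switching between providers is controlled by the LLM_SOURCE setting described later in this document. As a rough illustration of what such a switch can look like, here is a minimal sketch assuming LangChain's langchain-google-genai and langchain-ollama packages; the function name build_llm and its wiring are hypothetical, not the project's actual code.

```python
# Hypothetical sketch: pick a chat model based on the LLM_SOURCE environment variable.
import os

from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_ollama import ChatOllama

def build_llm():
    """Return a chat model for the provider named in LLM_SOURCE."""
    source = os.getenv("LLM_SOURCE", "gemini").lower()
    temperature = float(os.getenv("LLM_TEMPERATURE", "0.0"))

    if source == "gemini":
        return ChatGoogleGenerativeAI(
            model=os.getenv("GEMINI_LLM_MODEL", "gemini-1.5-flash-latest"),
            google_api_key=os.getenv("GEMINI_API_KEY"),
            temperature=temperature,
        )
    if source == "ollama":
        return ChatOllama(
            model=os.getenv("OLLAMA_LLM_MODEL", "deepseek-r1:32b"),
            base_url=os.getenv("OLLAMA_BASE_URL", "http://localhost:11434"),
            temperature=temperature,
        )
    raise ValueError(f"Unsupported LLM_SOURCE: {source}")
```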
```mermaid
graph TB
    subgraph "Client Layer"
        UI[React Frontend]
        CLI[Command Line Interface]
    end

    subgraph "API Layer"
        API[Flask REST API]
        CORS[CORS Handler]
    end

    subgraph "Processing Layer"
        RAG[RAG Orchestrator]
        RET[Document Retriever]
        LLM[Language Model]
    end

    subgraph "Storage Layer"
        VDB[(ChromaDB Vector Store)]
        DOCS[PDF Documents]
    end

    subgraph "External Services"
        OLLAMA[Ollama Server]
        GEMINI[Google Gemini API]
    end

    UI --> API
    CLI --> RAG
    API --> RAG
    RAG --> RET
    RAG --> LLM
    RET --> VDB
    LLM --> OLLAMA
    LLM --> GEMINI
    DOCS --> VDB

    style UI fill:#e1f5fe
    style API fill:#f3e5f5
    style RAG fill:#e8f5e8
    style VDB fill:#fff3e0
```
1. Document Ingestion: PDF files are processed, chunked, and converted to vector embeddings
2. Query Processing: User queries are received via the web UI or CLI
3. Context Retrieval: Relevant document chunks are retrieved from the vector store
4. Response Generation: The LLM generates a contextual response from the retrieved chunks (see the sketch below)
5. Result Delivery: The answer is returned to the user together with source citations
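To make steps 3 and 4 concrete, here is a minimal sketch of the query path, assuming a LangChain prompt and the ChromaDB vector store. The names answer_question, vector_store, and llm are illustrative, not taken from the project source, and the prompt wording is only an example.

```python
# Illustrative sketch of the query path: retrieve chunks, then generate a grounded answer.
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\nQuestion: {question}"
)

def answer_question(question, vector_store, llm, k=100):
    # Step 3: fetch the k most similar chunks from the vector store (RETRIEVER_K).
    docs = vector_store.similarity_search(question, k=k)
    context = "\n\n".join(doc.page_content for doc in docs)

    # Step 4: ask the LLM to answer strictly from the retrieved context.
    chain = prompt | llm | StrOutputParser()
    answer = chain.invoke({"context": context, "question": question})

    # Step 5: return the answer with source metadata so the caller can cite documents.
    return answer, [doc.metadata for doc in docs]
```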
Get up and running in under 5 minutes
# Verify Python version (3.9+ required)
python --version
# Verify Node.js version (16+ required)
node --version
# Check if Tesseract is installed
tesseract --version
Step 1: Clone Repository
git clone https://github.com/soheil-mp/event-assistant-llm.git
cd event-assistant-llm
Step 2: Backend Setup
# Create and activate virtual environment
python -m venv venv
# Windows
.\venv\Scripts\activate
# macOS/Linux
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
Step 3: Frontend Setup
cd frontend
npm install
cd ..
Step 4: Environment Configuration
Create a .env file in the project root:
# LLM Provider Selection
LLM_SOURCE="gemini" # or "ollama"
# Google Gemini Configuration (if using cloud)
GEMINI_API_KEY="your_api_key_here"
GEMINI_LLM_MODEL="gemini-1.5-flash-latest"
# Ollama Configuration (if using local)
OLLAMA_BASE_URL="http://localhost:11434"
OLLAMA_LLM_MODEL="deepseek-r1:32b"
Step 5: Add Documents & Launch
# Add your PDF documents
cp your-documents/*.pdf data/documents/
# Start backend (processes documents on first run)
python api.py
# Start frontend (in new terminal)
cd frontend && npm run dev
Access at: http://localhost:5173
| Requirement | Specification |
|---|---|
| Platform | Windows 10+, macOS 10.15+, Ubuntu 18.04+ |
| Python | 3.9.0 or higher |
| Node.js | 16.0.0 or higher |
| Memory | 4GB RAM minimum, 8GB recommended |
| Storage | 2GB free space for dependencies and vector store |
Windows Installation
- Download installer from Tesseract at UB Mannheim
- Run installer with default settings
- Add installation path to system PATH:
C:\Program Files\Tesseract-OCR
- Verify installation:
tesseract --version
macOS Installation
# Using Homebrew (recommended)
brew install tesseract
# Verify installation
tesseract --version
Linux Installation
# Ubuntu/Debian
sudo apt-get update
sudo apt-get install tesseract-ocr tesseract-ocr-eng
# CentOS/RHEL/Fedora
sudo yum install tesseract tesseract-langpack-eng
# Verify installation
tesseract --version
The application uses environment variables for configuration. Create a .env file in the project root:
Google Gemini Configuration (Cloud)
# LLM Provider
LLM_SOURCE="gemini"
# Gemini API Settings
GEMINI_API_KEY="your_api_key_here"
GEMINI_LLM_MODEL="gemini-1.5-flash-latest"
GEMINI_EMBEDDING_MODEL="models/embedding-001"
# General Settings
LLM_TEMPERATURE="0.0"
CHUNK_SIZE="128"
CHUNK_OVERLAP="20"
RETRIEVER_K="100"
FORCE_REBUILD_VECTOR_STORE="False"
Setup Instructions:
- Visit Google AI Studio
- Create a new API key
- Add the key to your .env file
Ollama Configuration (Local)
# LLM Provider
LLM_SOURCE="ollama"
# Ollama Settings
OLLAMA_BASE_URL="http://localhost:11434"
OLLAMA_LLM_MODEL="deepseek-r1:32b"
EMBEDDING_MODEL="mxbai-embed-large:latest"
# General Settings
LLM_TEMPERATURE="0.0"
CHUNK_SIZE="128"
CHUNK_OVERLAP="20"
RETRIEVER_K="100"
FORCE_REBUILD_VECTOR_STORE="False"
Setup Instructions:
- Install Ollama
- Pull required models:
ollama pull deepseek-r1:32b
ollama pull mxbai-embed-large:latest
- Start Ollama server:
ollama serve
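Before launching the backend, it can help to confirm that the Ollama server is reachable and that both models have been pulled. The check below uses Ollama's model-listing endpoint (GET /api/tags); it is a convenience sketch, not part of the project.

```python
# Sanity check: is Ollama running, and are the required models pulled?
import requests

OLLAMA_BASE_URL = "http://localhost:11434"
REQUIRED_MODELS = {"deepseek-r1:32b", "mxbai-embed-large:latest"}

resp = requests.get(f"{OLLAMA_BASE_URL}/api/tags", timeout=5)
resp.raise_for_status()

available = {m["name"] for m in resp.json().get("models", [])}
missing = REQUIRED_MODELS - available
if missing:
    print("Run 'ollama pull' for:", ", ".join(sorted(missing)))
else:
    print("Ollama is up and all required models are available.")
```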
| Parameter | Description | Default | Options |
|---|---|---|---|
| `LLM_SOURCE` | Language model provider | gemini | gemini, ollama |
| `GEMINI_API_KEY` | Google Gemini API key | - | Your API key |
| `OLLAMA_BASE_URL` | Ollama server endpoint | http://localhost:11434 | Valid URL |
| `CHUNK_SIZE` | Document chunk size | 128 | 64-512 tokens |
| `CHUNK_OVERLAP` | Overlap between chunks | 20 | 10-50 tokens |
| `RETRIEVER_K` | Documents to retrieve | 100 | 10-200 |
| `FORCE_REBUILD_VECTOR_STORE` | Force vector store rebuild | False | True, False |
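The project manages these values through config.py. As a generic illustration of how such settings are typically consumed, here is a python-dotenv sketch; it is not the actual config.py, and the variable names on the left are only one possible layout.

```python
# Sketch: load .env settings with python-dotenv (pip install python-dotenv).
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the project root into the process environment

LLM_SOURCE = os.getenv("LLM_SOURCE", "gemini")
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
CHUNK_SIZE = int(os.getenv("CHUNK_SIZE", "128"))
CHUNK_OVERLAP = int(os.getenv("CHUNK_OVERLAP", "20"))
RETRIEVER_K = int(os.getenv("RETRIEVER_K", "100"))
FORCE_REBUILD = os.getenv("FORCE_REBUILD_VECTOR_STORE", "False").lower() == "true"
```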
Add your PDF documents to the data/documents/ directory. The system will automatically:
- Process new documents on startup
- Extract text using OCR when needed
- Create vector embeddings
- Store them in the local ChromaDB instance
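The ingestion path can be pictured roughly as follows, assuming LangChain's PDF loader, text splitter, and Chroma integration. The helper build_vector_store and its parameters are illustrative; the project's data_processing.py may differ, for example in how the OCR fallback is implemented.

```python
# Rough sketch of ingestion: load PDFs, split into chunks, embed, persist to ChromaDB.
from pathlib import Path

from langchain_chroma import Chroma
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

def build_vector_store(embeddings, doc_dir="data/documents",
                       persist_dir="data/vector_store",
                       chunk_size=128, chunk_overlap=20):
    # Load every PDF in the documents directory (no OCR fallback in this sketch).
    docs = []
    for pdf in Path(doc_dir).glob("*.pdf"):
        docs.extend(PyPDFLoader(str(pdf)).load())

    # Split pages into overlapping chunks (CHUNK_SIZE / CHUNK_OVERLAP).
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size, chunk_overlap=chunk_overlap
    )
    chunks = splitter.split_documents(docs)

    # Embed the chunks and persist them as a local ChromaDB collection.
    return Chroma.from_documents(
        chunks, embedding=embeddings, persist_directory=persist_dir
    )
```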
Starting the Application
# Terminal 1: Start backend API
python api.py
# Terminal 2: Start frontend
cd frontend
npm run dev
First Run Notes:
- On first launch, the backend processes the PDFs in data/documents/ and builds the ChromaDB vector store, so startup takes longer than on later runs.
Using the Interface
Example Queries:
- "What time does the event start?"
- "Where can attendees park?"
For development and testing purposes, you can use the command-line interface:
python main.py
Interactive Session:
--- Marbet Event Assistant CLI Ready ---
Ask questions about the event (type 'quit' to exit).
Assistant: Parking is available in the adjacent parking structure.
Level B1 is reserved for event attendees with validation.
Retrieved Sources:
- Event_Logistics.pdf, Page 3: "Parking structure - Level B1 reserved"
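The CLI is essentially a small read-eval loop around the same RAG pipeline used by the API. A minimal sketch of such a loop is shown below; run_cli and answer_question (from the earlier sketch) are illustrative names, not the project's actual functions.

```python
# Minimal sketch of an interactive CLI loop over the RAG pipeline.
def run_cli(vector_store, llm):
    print("--- Marbet Event Assistant CLI Ready ---")
    print("Ask questions about the event (type 'quit' to exit).")
    while True:
        question = input("You: ").strip()
        if question.lower() in {"quit", "exit"}:
            break
        answer, sources = answer_question(question, vector_store, llm)
        print(f"Assistant: {answer}")
        print("Retrieved Sources:")
        for meta in sources:
            print(f"- {meta.get('source')}, Page {meta.get('page')}")
```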
POST /api/chat
Send a message to the chatbot and receive an AI-generated response with source attribution.
Request Format
POST /api/chat
Content-Type: application/json
{
"message": "string",
"history": [
{
"sender": "user|ai",
"text": "string"
}
]
}
Parameters:
- `message` (required): The user's question or query
- `history` (optional): Array of previous conversation messages
Response Format
HTTP/1.1 200 OK
Content-Type: application/json
{
"answer": "string",
"retrieved_context": [
{
"metadata": {
"source": "document.pdf",
"page": 1
}
}
],
"has_citations": true
}
Response Fields:
- `answer`: The generated response text
- `retrieved_context`: Metadata for documents used in the response
- `has_citations`: Boolean indicating if sources were found and cited
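A client will usually turn retrieved_context into a readable source list. The helper below is a small sketch using only the fields documented above; the function name format_citations is hypothetical.

```python
# Sketch: format the citations returned in retrieved_context for display.
def format_citations(payload):
    if not payload.get("has_citations"):
        return "No sources cited."
    lines = []
    for item in payload.get("retrieved_context", []):
        meta = item.get("metadata", {})
        lines.append(f"- {meta.get('source', 'unknown')}, page {meta.get('page', '?')}")
    return "\n".join(lines)
```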
Error Responses
HTTP/1.1 400 Bad Request
{
"error": "Missing 'message' in request body"
}
HTTP/1.1 500 Internal Server Error
{
"error": "Chatbot is not initialized. Please check server logs."
}
Python Example
import requests
# Basic chat request
response = requests.post('http://localhost:5000/api/chat', json={
'message': 'What time does the event start?',
'history': []
})
if response.status_code == 200:
data = response.json()
print(f"Answer: {data['answer']}")
print(f"Has citations: {data['has_citations']}")
else:
print(f"Error: {response.status_code}")
JavaScript Example
const response = await fetch('http://localhost:5000/api/chat', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({
message: 'What time does the event start?',
history: []
})
});
const data = await response.json();
console.log('Answer:', data.answer);
POST /api/chat
Handles chat requests.
Request Body:
{
"message": "string",
"history": [
{"sender": "user", "text": "string"},
{"sender": "ai", "text": "string"}
]
}
Success Response (200 OK):
{
"answer": "string",
"retrieved_context": [
{
"metadata": {
"source": "string",
"page": "number"
}
}
],
"has_citations": "boolean"
}
Error Responses:
- 400 Bad Request: If the `message` field is missing.
- 500 Internal Server Error: If the chatbot fails to initialize or an error occurs during processing.
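Clients that want to surface these cases distinctly can branch on the status code and read the error field. A minimal sketch follows; the helper ask is hypothetical.

```python
# Sketch: client-side handling of the documented 400 and 500 error responses.
import requests

def ask(message, history=None, base_url="http://localhost:5000"):
    resp = requests.post(
        f"{base_url}/api/chat",
        json={"message": message, "history": history or []},
    )
    if resp.status_code in (400, 500):
        raise RuntimeError(resp.json().get("error", f"HTTP {resp.status_code}"))
    resp.raise_for_status()
    return resp.json()
```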
```
event-assistant-llm/
├── src/marbet_rag/          # Core RAG implementation
│   ├── __init__.py          # Package initialization
│   ├── data_processing.py   # Document loading and chunking
│   ├── retrieval.py         # Vector store and RAG chain setup
│   ├── prompts.py           # System prompts and templates
│   └── utils.py             # Helper functions and utilities
├── frontend/                # React web interface
│   ├── src/                 # React source code
│   │   ├── components/      # UI components
│   │   ├── App.jsx          # Main application component
│   │   └── main.jsx         # Application entry point
│   ├── public/              # Static assets
│   ├── package.json         # Frontend dependencies
│   └── vite.config.js       # Vite configuration
├── data/                    # Data directory
│   ├── documents/           # Source PDF documents
│   └── vector_store/        # Generated ChromaDB storage
├── assets/                  # Demo images and documentation assets
├── notebooks/               # Jupyter notebooks for experimentation
├── api.py                   # Flask API server
├── main.py                  # CLI interface
├── config.py                # Configuration management
├── requirements.txt         # Python dependencies
└── README.md                # Project documentation
```
Setup Development Environment
# Clone repository
git clone <repository-url>
cd event-assistant-llm
# Setup Python environment
python -m venv venv
source venv/bin/activate # Windows: .\venv\Scripts\activate
pip install -r requirements.txt
# Setup frontend
cd frontend
npm install
cd ..
Development Commands
# Start backend in development mode
python api.py
# Start frontend with hot reload
cd frontend && npm run dev
# Run CLI for testing
python main.py
# Build frontend for production
cd frontend && npm run build
Running Tests
# Install test dependencies
pip install pytest pytest-cov
# Run all tests
python -m pytest
# Run with coverage
python -m pytest --cov=src
# Run specific test file
python -m pytest tests/test_rag.py -v
Linting and Formatting
# Python code formatting
pip install black flake8
black src/ --line-length 88
flake8 src/ --max-line-length 88
# JavaScript/React linting
cd frontend
npm run lint
npm run lint:fix
| Component | Technology | Purpose |
|---|---|---|
| UI Library | React | User interface components |
| Build Tool | Vite | Development and build system |
| HTTP Client | | API communication |
| Service | Provider | Integration |
|---|---|---|
| Local LLM | Ollama | Self-hosted language models |
| Cloud LLM | Google Gemini | Google Generative AI API |