A Chainlit-based chat interface for Ollama models with RAG (Retrieval-Augmented Generation) capabilities.

## Features
- Chat with multiple Ollama models
- Upload and query documents using RAG
- Persistent chat history throughout the conversation
- Streaming responses for a more natural conversation flow
- Model selection via dropdown menu
## Prerequisites

- Python 3.8+
- Ollama installed and running locally (https://ollama.ai)
- At least one model pulled into Ollama, e.g.:

  ```bash
  ollama pull llama3.2:latest
  ```
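Before launching the app, you can confirm that the Ollama server is reachable and the model is available. Here is a minimal sketch against Ollama's REST API (default port 11434); the `check_ollama` helper name is illustrative:

```python
# Sketch: verify the local Ollama server is up and a model has been pulled.
# Assumes the default Ollama REST API at http://localhost:11434.
import requests


def check_ollama(model: str = "llama3.2:latest",
                 host: str = "http://localhost:11434") -> bool:
    try:
        resp = requests.get(f"{host}/api/tags", timeout=5)  # lists pulled models
        resp.raise_for_status()
    except requests.RequestException:
        print("Ollama is not reachable -- is the server running?")
        return False
    names = {m["name"] for m in resp.json().get("models", [])}
    if model not in names:
        print(f"Model {model!r} is missing; run: ollama pull {model}")
        return False
    return True


if __name__ == "__main__":
    print("Setup OK" if check_ollama() else "Setup incomplete")
```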
## Installation

- Clone this repository:

  ```bash
  git clone <repository-url>
  cd <repository-directory>
  ```

- Install the dependencies:

  ```bash
  pip install -r requirements.txt
  ```
## Usage

- Start the Chainlit app:

  ```bash
  chainlit run app.py
  ```

- Open your browser and navigate to http://localhost:8000
- Select an Ollama model from the dropdown menu
- Upload documents using the upload button
- Start chatting! (A minimal sketch of how these pieces fit together is shown below.)
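As a rough illustration, the following sketch wires up a Chainlit app with a model dropdown and streamed Ollama responses, using the `chainlit` and `ollama` Python packages. The model list is a placeholder, and the real `app.py` additionally performs the RAG retrieval described below:

```python
# Minimal sketch: Chainlit chat app with a model dropdown and streamed
# Ollama responses. Model names here are illustrative placeholders.
import chainlit as cl
from chainlit.input_widget import Select
from ollama import AsyncClient

MODELS = ["llama3.2:latest", "mistral:latest"]  # placeholder model list


@cl.on_chat_start
async def start():
    # Render the model-selection dropdown in the settings panel.
    settings = await cl.ChatSettings(
        [Select(id="model", label="Ollama model", values=MODELS, initial_index=0)]
    ).send()
    cl.user_session.set("model", settings["model"])


@cl.on_settings_update
async def on_settings_update(settings):
    cl.user_session.set("model", settings["model"])


@cl.on_message
async def on_message(message: cl.Message):
    model = cl.user_session.get("model") or MODELS[0]
    msg = cl.Message(content="")

    # Stream tokens from Ollama into the UI as they arrive.
    stream = await AsyncClient().chat(
        model=model,
        messages=[{"role": "user", "content": message.content}],
        stream=True,
    )
    async for chunk in stream:
        await msg.stream_token(chunk["message"]["content"])
    await msg.send()
```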
## How It Works

This application:

- Uses the `OllamaChat` class from the RAG-Agent codebase for chat interactions
- Uses the `Loader_Local` class for document loading and retrieval
- Maintains chat history in the same format as the original code
- Retrieves relevant document chunks when answering questions
- Preserves chat context in the format: `"User: {question}\n{answer}"`
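Since `OllamaChat` and `Loader_Local` live in the RAG-Agent codebase, their internals are not shown here. The sketch below only illustrates the history and prompt format described above; `retrieve_chunks` would be a hypothetical stand-in for the `Loader_Local` retrieval step:

```python
# Illustrative only: combining retrieved chunks and chat history in the
# "User: {question}\n{answer}" format into a single prompt.

def build_prompt(question: str, history: list[str], chunks: list[str]) -> str:
    context = "\n\n".join(chunks)   # relevant document chunks from retrieval
    past = "\n".join(history)       # prior "User: {question}\n{answer}" turns
    return f"Context:\n{context}\n\nChat history:\n{past}\n\nUser: {question}"


def record_turn(history: list[str], question: str, answer: str) -> None:
    # Preserve chat context in the format noted above.
    history.append(f"User: {question}\n{answer}")
```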
## Configuration

You can modify the `.chainlit/config.toml` file (generated the first time you run the app) to customize the UI and behavior.
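For example, the generated config file includes project and UI sections; the values below are illustrative:

```toml
[project]
# Opt out of anonymous usage telemetry if desired.
enable_telemetry = false

[UI]
# Name shown in the app header.
name = "Ollama RAG Chat"
```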
## Troubleshooting

- Ensure Ollama is running and accessible (http://localhost:11434 by default)
- Check that you have pulled the necessary models into Ollama
- If document retrieval is not working, check the format of your documents