This project demonstrates a document ingestion pipeline for Retrieval-Augmented Generation (RAG) using the LangChain framework with an Ollama local LLM backend. The pipeline converts local unstructured files into vector embeddings and stores them in a Chroma vector database, enabling semantic search and question answering.
- LangChain: Modular framework for building LLM-powered applications, used for document loading, transformation, and RAG chain creation.
- Ollama + LLM (e.g., Llama 3 or Mistral): Local language model served by Ollama and used as the reasoning engine.
- Chroma: Embedded vector database for storing and retrieving document embeddings.
- FastEmbed: Lightweight, high-speed embedding model for converting text to vector representations.
- Streamlit: Simple interactive front-end for querying the indexed documents.
- Loads PDF and TXT documents from the local `./data` directory.
- Splits documents into manageable chunks using recursive character splitting.
- Generates vector embeddings using `FastEmbedEmbeddings`.
- Stores processed documents and metadata in a persistent Chroma DB.
- Sets up a basic RAG pipeline to answer questions using retrieved chunks and Ollama-powered LLM responses.
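
A minimal sketch of this ingestion flow, assuming the common langchain-community loaders (`DirectoryLoader` with `PyPDFLoader`/`TextLoader`, which additionally require the `pypdf` package for PDFs) and example chunking values; the actual `streamlit_app.py` may organize this differently:

```python
from langchain_community.document_loaders import DirectoryLoader, PyPDFLoader, TextLoader
from langchain_community.embeddings import FastEmbedEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load PDF and TXT files from the local ./data directory.
pdf_docs = DirectoryLoader("./data", glob="**/*.pdf", loader_cls=PyPDFLoader).load()
txt_docs = DirectoryLoader("./data", glob="**/*.txt", loader_cls=TextLoader).load()
documents = pdf_docs + txt_docs

# Split documents into overlapping chunks for retrieval (sizes are example values).
splitter = RecursiveCharacterTextSplitter(chunk_size=1024, chunk_overlap=100)
chunks = splitter.split_documents(documents)

# Embed the chunks with FastEmbed and persist them to a local Chroma store.
vector_store = Chroma.from_documents(
    documents=chunks,
    embedding=FastEmbedEmbeddings(),
    persist_directory="./chroma_db",
)
```
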
- Install dependencies:

  ```bash
  pip install langchain langchain-community chromadb fastembed streamlit
  ```
- Start Ollama with a supported model. Install Ollama if needed:

  ```bash
  curl -fsSL https://ollama.com/install.sh | sh
  ```

  Make sure Ollama is running and the model (e.g., llama3) is downloaded:

  ```bash
  ollama run llama3
  ```
- Run the Streamlit app:

  ```bash
  streamlit run streamlit_app.py
  ```
```
.
├── streamlit_app.py   # Main retrieval pipeline logic
├── data/              # Directory with PDF and TXT documents
└── chroma_db/         # Auto-created directory for Chroma vector store
```
After ingesting Golden Visa documents, you can query:
"What are the requirements for Portugal's Golden Visa?"
The system will retrieve semantically relevant document chunks and generate a response using the local LLM.
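
A sketch of this question-answering step, assuming the persisted Chroma store from ingestion, the `Ollama` LLM wrapper, and the classic `RetrievalQA` chain from LangChain; the model name and retriever settings are illustrative:

```python
from langchain.chains import RetrievalQA
from langchain_community.embeddings import FastEmbedEmbeddings
from langchain_community.llms import Ollama
from langchain_community.vectorstores import Chroma

# Reopen the persisted Chroma store built during ingestion.
vector_store = Chroma(
    persist_directory="./chroma_db",
    embedding_function=FastEmbedEmbeddings(),
)
retriever = vector_store.as_retriever(search_kwargs={"k": 3})

# Local LLM served by Ollama (the model must already be downloaded, e.g. via `ollama run llama3`).
llm = Ollama(model="llama3")

# "stuff" chain: retrieved chunks are inserted into the prompt and answered by the LLM.
qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever, chain_type="stuff")

result = qa_chain.invoke({"query": "What are the requirements for Portugal's Golden Visa?"})
print(result["result"])
```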

