A simple self-correcting and hallucination-checking RAG (Retrieval-Augmented Generation) workflow built using LangGraph, running LLMs locally via Ollama.
- Retrieval-Augmented Generation (RAG): Leverages external knowledge sources (web search via Tavily and a local vector store) to enhance generation quality.
- Local LLM Execution: Uses Ollama (`qwen3:30b` by default) to run language models locally, avoiding reliance on external APIs for core generation.
- LangGraph: Implements the workflow as a state graph for clear logic and execution flow.
- Vector Store: Uses `SKLearnVectorStore` with `NomicEmbeddings` (local inference) for efficient document retrieval.
- Self-Correction: Includes mechanisms to review and refine generated answers based on retrieved documents.
- Hallucination Checking: Incorporates steps to verify the factual consistency of the generated output against source documents.
- Optional Caching: Supports caching the vector store embeddings locally (`.parquet` format) to speed up subsequent runs (requires `pandas` and `pyarrow`).
This project uses LangGraph to define a cyclical process where:
- A question is received.
- The question is routed to either a web search (using Tavily) or the local vector store.
- Relevant documents are retrieved (from web or vector store).
- Retrieved documents are graded for relevance to the question.
- An initial answer is generated by a local LLM (via Ollama) based on the relevant documents.
- The answer is checked against the source documents for factual consistency (hallucination check) and relevance to the original question.
- If inconsistencies or irrelevance are found, the process attempts to self-correct (e.g., by performing a web search if not already done, or regenerating the answer) up to a maximum number of retries.
- A final, verified answer is provided.
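
The loop above can be sketched in plain Python to make the control flow explicit. This is only an illustration, not the project's LangGraph code: `RagState`, `run_workflow`, every callable parameter, and the `MAX_RETRIES` value are hypothetical names and assumptions chosen for this sketch. In the real project the callables would be backed by Ollama, Tavily, and the vector store.

```python
from dataclasses import dataclass, field
from typing import Callable, List

MAX_RETRIES = 3  # assumed value; the README only says "a maximum number of retries"


@dataclass
class RagState:
    """Hypothetical state carried through the workflow."""
    question: str
    documents: List[str] = field(default_factory=list)
    answer: str = ""
    web_searched: bool = False
    retries: int = 0
    verified: bool = False


def run_workflow(
    question: str,
    route: Callable[[str], str],                    # returns "web" or "vectorstore"
    search_web: Callable[[str], List[str]],
    search_store: Callable[[str], List[str]],
    is_relevant: Callable[[str, str], bool],        # (question, document)
    generate: Callable[[str, List[str]], str],      # (question, documents)
    is_grounded: Callable[[str, List[str]], bool],  # (answer, documents)
    answers_question: Callable[[str, str], bool],   # (question, answer)
) -> RagState:
    state = RagState(question=question)

    # Route the question, then retrieve from the chosen source.
    if route(question) == "web":
        state.documents = search_web(question)
        state.web_searched = True
    else:
        state.documents = search_store(question)

    # Keep only the documents graded as relevant.
    state.documents = [d for d in state.documents if is_relevant(question, d)]

    while True:
        # Generate, then check grounding and relevance to the question.
        state.answer = generate(question, state.documents)
        grounded = is_grounded(state.answer, state.documents)
        useful = answers_question(question, state.answer)
        if grounded and useful:
            state.verified = True
            return state
        if state.retries >= MAX_RETRIES:
            return state  # retries exhausted; return the last, unverified attempt

        # Self-correct: add web results once, otherwise just regenerate.
        state.retries += 1
        if not state.web_searched:
            extra = search_web(question)
            state.documents += [d for d in extra if is_relevant(question, d)]
            state.web_searched = True
```

A caller would pass in functions that wrap the actual LLM and search calls, then inspect `state.verified` to see whether the answer passed both checks before the retry budget ran out.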
- Clone the repository:

      git clone https://github.com/safzanpirani/langgraph-agentic-workflow
      cd langgraph-agentic-workflow

- Install Ollama: Follow the instructions at https://ollama.com/ to install Ollama on your system.
- Pull the LLM Model: Download the model used in the script (or choose another compatible one):

      ollama pull qwen3:30b

- Install Python dependencies:

      # Make sure you have pip or a similar package manager (like uv)
      pip install -r requirements.txt
      # OPTIONAL: For vector store caching
      # pip install pandas pyarrow

- Configure environment variables: Create a `.env` file in the root directory and add your Tavily API key:

      # Required for the web search tool
      TAVILY_API_KEY="your_tavily_api_key_here"

  Get your key by signing up at Tavily.com.
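
Before running the workflow, it can help to confirm that the key is actually visible. The following is a minimal sketch using only the standard library. `read_env_file` and `tavily_key_configured` are hypothetical helpers written for this example, not part of the project, and the parser handles only simple `KEY="value"` lines like the one shown above.

```python
import os
from pathlib import Path


def read_env_file(path: str = ".env") -> dict:
    """Parse simple KEY=value lines; comments and blank lines are skipped."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def tavily_key_configured(path: str = ".env") -> bool:
    """True if a non-placeholder TAVILY_API_KEY is set in the environment or the file."""
    key = os.environ.get("TAVILY_API_KEY") or read_env_file(path).get("TAVILY_API_KEY", "")
    return bool(key) and key != "your_tavily_api_key_here"
```

Calling `tavily_key_configured()` from the repository root returns `False` if the `.env` file is missing or still contains the placeholder value.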
- Ensure Ollama is running: Start the Ollama application/server if it's not already running in the background.
- Run the main workflow script:

      python main.py

Follow the terminal output to see the agent's reasoning and the final answer.
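
If the script fails to reach the model, a quick check that the Ollama server is up and the model has been pulled can narrow things down. The sketch below assumes Ollama is listening on its default address, `http://localhost:11434`, and queries its `/api/tags` endpoint, which lists locally available models. `list_local_models` is a hypothetical helper for this example, not part of the project.

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default address; adjust if yours differs


def list_local_models(base_url: str = OLLAMA_URL, timeout: float = 2.0):
    """Return the names of locally pulled models, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON response.
        return None
    return [m.get("name", "") for m in payload.get("models", [])]
```

A result of `None` suggests the server is not running. A list that does not include `qwen3:30b` suggests the model still needs to be pulled.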
Watch a live demonstration of the agentic RAG workflow in action: Click here to watch the demo.
