A LangGraph implementation of the CRAG (Corrective Retrieval-Augmented Generation) paper: a self-correcting RAG pipeline that evaluates retrieved documents before using them for generation.
Instead of blindly trusting every retrieved document, CRAG adds a quality gate that scores each document and decides whether to use it, discard it, search the web, or do both.
- Retrieve docs from FAISS vector store
- Evaluate each doc with a lightweight LLM (Llama 3.2)
- Route based on verdict:
  - Correct → Refine docs → Generate
  - Incorrect → Rewrite query → Web search → Refine → Generate
  - Ambiguous → Refine docs + Web search → Generate
- Refine by decomposing docs into sentence-level strips and filtering out irrelevant ones
- Generate the final answer using GPT-4o
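The evaluate → route → refine steps above can be sketched in plain Python. The thresholds, function names, and the `is_relevant` callable are illustrative, not taken from the notebook — there, the equivalent logic is wired into LangGraph conditional edges, and strip filtering is done by GPT-4o:

```python
import re


def aggregate_verdict(doc_scores: list[float], upper: float = 0.7, lower: float = 0.3) -> str:
    """Collapse per-document relevance scores into one CRAG verdict.

    Thresholds are illustrative: if the best document clears `upper`,
    retrieval is trusted; if even the best one falls below `lower`, the
    docs are discarded in favour of web search; in between is ambiguous.
    """
    best = max(doc_scores)
    if best >= upper:
        return "correct"
    if best <= lower:
        return "incorrect"
    return "ambiguous"


def route(verdict: str) -> list[str]:
    """Map a verdict to the sequence of downstream pipeline nodes."""
    return {
        "correct": ["refine", "generate"],
        "incorrect": ["rewrite_query", "web_search", "refine", "generate"],
        "ambiguous": ["refine", "web_search", "generate"],
    }[verdict]


def refine(doc_text: str, is_relevant) -> str:
    """Decompose a document into sentence-level strips and keep only the
    relevant ones; `is_relevant` stands in for the LLM strip filter."""
    strips = re.split(r"(?<=[.!?])\s+", doc_text.strip())
    return " ".join(s for s in strips if is_relevant(s))
```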
- LangGraph — Pipeline orchestration with conditional routing
- FAISS — Vector store for document retrieval
- OpenAI (GPT-4o) — Generation and strip filtering
- Ollama (Llama 3.2:3b) — Lightweight local model for document evaluation
- Tavily — Web search fallback
- LangChain — Chains, prompts, and document handling
```shell
uv init .
uv add faiss-cpu huggingface ipykernel langchain langchain-community langchain-ollama langchain-openai langchain-tavily langchain-text-splitters pypdf python-dotenv
```

Create a `.env` file in the root directory:

```
OPENAI_API_KEY=your_openai_key
TAVILY_API_KEY=your_tavily_key
```

Pull the local evaluation model:

```shell
ollama pull llama3.2:3b
```

Place your PDF in the `./docs/` folder. The notebook uses `harrypotter.pdf` by default.
Open the Jupyter notebook and run all cells top to bottom. The first run takes a while to load and embed the PDF; subsequent runs reuse the cached index.
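The caching behaviour can be sketched as a small helper that checks for an on-disk index before re-embedding. The helper and its callables are illustrative; in the notebook they would wrap `FAISS.from_documents`, `save_local`, and `load_local` from `langchain_community`:

```python
from pathlib import Path


def load_or_build_index(index_dir, build, save, load):
    """Return a cached index if one exists on disk, otherwise build and cache it.

    `build`, `save`, and `load` are caller-supplied callables (illustrative):
    building embeds the PDF (slow, first run only); loading is a cache hit.
    """
    path = Path(index_dir)
    if path.exists():
        return load(path)  # cache hit: skip re-embedding the PDF
    index = build()        # first run: embed the PDF
    path.mkdir(parents=True, exist_ok=True)
    save(index, path)
    return index
```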