This is the code repository for Building Natural Language Pipelines, published by Packt.
Author: Laura Funderburk
- What You'll Learn to Build
- Setting Up
- Chapter Breakdown
- Chapter 1: Introduction to natural language processing pipelines (no required code exercises)
- Chapter 2: Diving Deep into Large Language Models
- Chapter 3: Introduction to Haystack
- Chapter 4: Bringing components together: Haystack pipelines for different use cases
- Chapter 5: Haystack pipeline development with custom components
- Chapter 6: Setting up a reproducible project: naive vs hybrid RAG with reranking and evaluation
- Chapter 7: Production deployment strategies
- Chapter 8: Hands-on projects
- Chapter 9: Future trends and beyond (no required code exercises)
- Optional: Advanced multi-agent architecture for production
This book guides you through building advanced Retrieval-Augmented Generation (RAG) systems and multi-agent applications with Haystack 2.0, Ragas, and LangGraph. Beginning with state-based agent development in LangGraph, you'll build intelligent agents with tool integration, middleware patterns, and multi-agent coordination. You'll then master Haystack's component architecture: creating intelligent search systems with semantic and hybrid retrieval, building custom components for specialized tasks, and implementing comprehensive evaluation frameworks. The journey advances through production deployment strategies with Docker and REST APIs, and culminates in hands-on projects: named entity recognition systems, zero-shot text classification pipelines, sentiment analysis tools, and multi-agent orchestration systems that coordinate specialized Haystack pipelines through supervisor-worker patterns in LangGraph.
Chapter 2: Single and multi-agent systems with LangChain and LangGraph
This chapter contains optional LangGraph demonstrations that introduce state-based agents at a conceptual level. These examples are previews intended to build intuition. The full, practical use of LangGraph for multi-agent orchestration appears later in Chapter 8 and the epilogue, once the Haystack tool layer has been fully developed.
_Figures: Agent with one tool · Agent calling supervisor_
Chapter 3: Building robust agent tools with Haystack
_Figures: Supercomponents and pipeline · Prompt template pipeline_
Chapter 5: Build custom components: synthetic data generation with Ragas
_Figures: Knowledge graph and synthetic data generation (SDG) pipeline · SDG applied to websites and PDFs_
Chapter 6: Reproducible evaluation of hybrid and naive RAG with Ragas and Weights and Biases
Chapter 7: Deploy pipelines as an API with FastAPI and Hayhooks
Chapter 8 and Optional Advanced Modules: Capstone and Agentic Patterns for Production
_Figures: Microservice architecture · Multi-agent system using microservices_
📝 Sovereign-Friendly & Local Execution: Most exercises in this book let you choose between OpenAI APIs and local models served via Ollama (such as Mistral NeMo, GPT-OSS, DeepSeek-R1, or Qwen3); the exception is the cost-tracking exercises in Chapter 6, which specifically demonstrate OpenAI API usage monitoring. Each notebook provides model recommendations to help you choose the most suitable option for that exercise. The frameworks explored are extensible, and models from other providers can be substituted for OpenAI or local models. No US cloud, external APIs, or proprietary services are required for the majority of the book, making it easy to run in EU-regulated or air-gapped environments. The epilogue-advanced folder includes an optional prototype-to-production multi-agent implementation with LangGraph using LangSmith Studio. These exercises require a free LangSmith Studio API key, but they can also be run entirely locally with the tracer disabled:

```bash
export LANGCHAIN_TRACING_V2="false"
```

Scripts are provided so you can run the agent from your terminal; if you choose not to use LangSmith Studio, you simply won't see the studio traces or the agent visualization.
Clone the repository:

```bash
git clone https://github.com/PacktPublishing/Building-Natural-Language-Pipelines.git
cd Building-Natural-Language-Pipelines/
```
Each chapter folder contains a `pyproject.toml` file with that folder's dependencies. (Recommended) Open each chapter folder in a new VS Code window.
- Install uv:

  ```bash
  pip install uv
  ```

- Change directories into the chapter folder.
- Install dependencies:

  ```bash
  uv sync
  ```

- Activate the virtual environment:

  ```bash
  source .venv/bin/activate
  ```

- Select the virtual environment as the Jupyter kernel:
  - Open any notebook.
  - Click the kernel picker (top right) and select the `.venv` environment.
Agent Foundations & State Management
- LangGraph Fundamentals: Understanding state-based agent frameworks and graph architecture
- Building Simple Agents: Creating agents with state management using MessagesState and reducers
- Tool Integration: Connecting agents with external tools (search APIs, databases, custom functions)
- Multi-Agent Systems: Designing and coordinating multiple specialized agents in workflows
- Middleware Patterns: Implementing logging, authentication, and monitoring layers for agent systems
- Local vs Cloud LLMs: Running agents with OpenAI APIs or locally with Ollama (Qwen2, Llama, Mistral)
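To build intuition for the state management idea above, here is a minimal sketch in plain Python (not the actual LangGraph API): state is a dictionary, and a per-key "reducer" decides how a node's partial update is merged in. LangGraph's `MessagesState` uses an append-style reducer much like the one shown here.

```python
from typing import Callable

def append_messages(current: list, update: list) -> list:
    """Reducer: new messages are appended instead of overwriting history."""
    return current + update

def apply_update(state: dict, update: dict,
                 reducers: dict[str, Callable]) -> dict:
    """Merge a node's partial update into the state via per-key reducers."""
    merged = dict(state)
    for key, value in update.items():
        reducer = reducers.get(key)
        merged[key] = reducer(state.get(key, []), value) if reducer else value
    return merged

state = {"messages": [{"role": "user", "content": "hi"}]}
update = {"messages": [{"role": "assistant", "content": "hello!"}]}
state = apply_update(state, update, {"messages": append_messages})
# The history now holds both messages rather than only the latest one.
```

The key design point is that nodes return *partial* updates and never mutate shared state directly; the reducer makes the merge behavior explicit and predictable.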
Core Concepts & Foundation
- Component Architecture: Understanding Haystack's modular design patterns
- Pipeline Construction: Building linear and branching data flow pipelines
- Document Processing: Text extraction, cleaning, and preprocessing workflows
- Prompting LLMs: Learn to build prompt templates and guide how an LLM responds
- Package pipelines as Supercomponents: Abstract a pipeline as a Haystack component
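The component/pipeline idea can be sketched in plain Python (this mimics the pattern conceptually and is not Haystack's actual API): each component exposes a `run()` method returning a dict, and a linear pipeline feeds one component's outputs into the next component's inputs.

```python
# Conceptual sketch of component-based pipelines (not the Haystack API):
# components are small, single-purpose objects with a run() contract.

class Cleaner:
    def run(self, text: str) -> dict:
        # Normalize whitespace in the incoming text.
        return {"text": " ".join(text.split())}

class PromptBuilder:
    def __init__(self, template: str):
        self.template = template
    def run(self, text: str) -> dict:
        # Fill the prompt template with the cleaned text.
        return {"prompt": self.template.format(text=text)}

class Pipeline:
    def __init__(self):
        self.steps = []
    def add_component(self, component):
        self.steps.append(component)
    def run(self, **inputs) -> dict:
        data = inputs
        for step in self.steps:
            data = step.run(**data)  # outputs become the next step's inputs
        return data

pipe = Pipeline()
pipe.add_component(Cleaner())
pipe.add_component(PromptBuilder("Summarize the following text:\n{text}"))
result = pipe.run(text="  Haystack   pipelines   are   modular.  ")
```

Because each component only depends on its declared inputs and outputs, components can be swapped, tested in isolation, or wrapped as a single supercomponent.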
Scaling & Optimization
- Indexing Pipelines: Automated document ingestion and preprocessing workflows
- Naive RAG: Semantic search using sentence transformers and embedding models
- Hybrid RAG: Combining keyword (BM25) and semantic (vector) search strategies
- Reranking: Advanced retrieval techniques using ranker models
- Pipelines as tools for an Agent: Package advanced RAG as a tool for an autonomous Agent
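The hybrid RAG idea above combines two rankings of the same corpus; a common way to merge them is reciprocal rank fusion (RRF). The sketch below uses invented document ids for illustration; `k=60` is the constant commonly used in the RRF literature.

```python
# Hybrid retrieval sketch: fuse a keyword (BM25) ranking and a semantic
# (vector) ranking with reciprocal rank fusion.

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Score each doc by sum of 1/(k + rank) over all input rankings."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["doc3", "doc1", "doc7"]    # keyword ranking (illustrative)
vector_hits = ["doc1", "doc5", "doc3"]  # semantic ranking (illustrative)
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
# Documents appearing high in both rankings rise to the top.
```

RRF only needs ranks, not raw scores, so it sidesteps the problem of BM25 and cosine-similarity scores living on incompatible scales.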
Extensibility & Testing
- Component SDK: Creating custom Haystack components with proper interfaces
- Knowledge Graph Integration: Building components for structured knowledge representation
- Synthetic Data Generation: Automated test data creation for pipeline validation
- Quality Control Systems: Implementing automated evaluation and monitoring components
- Unit Testing Frameworks: Comprehensive testing strategies for custom components
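As a flavor of the testing strategy above, here is a minimal sketch (a plain-Python stand-in for a custom component, not Haystack's component SDK): cover normal input, edge cases, and the output contract.

```python
class WordCounter:
    """Toy custom component: counts words in each document string."""
    def run(self, documents: list[str]) -> dict:
        return {"counts": [len(doc.split()) for doc in documents]}

def test_word_counter_normal_input():
    result = WordCounter().run(documents=["two words", "three short words"])
    assert result["counts"] == [2, 3]

def test_word_counter_empty_document():
    # Edge case: empty strings should yield zero, not raise.
    assert WordCounter().run(documents=[""])["counts"] == [0]

def test_word_counter_output_contract():
    # A component should always return its declared output key.
    assert "counts" in WordCounter().run(documents=[])

for test in (test_word_counter_normal_input,
             test_word_counter_empty_document,
             test_word_counter_output_contract):
    test()
```

In practice these would live in a `tests/` folder and run under pytest; the point is that each custom component gets exercised independently of any pipeline.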
Reproducible Workflows & Evaluation
- Reproducible Workflow Building Blocks: Setting up consistent environments with Docker, Elasticsearch, and dependency management
- Naive RAG Implementation: Building basic retrieval-augmented generation with semantic search
- Hybrid RAG with Reranking: Advanced retrieval combining keyword (BM25) and semantic search with rank fusion strategies
- Evaluation with RAGAS: Using the RAGAS framework to assess and compare naive vs hybrid RAG system quality across multiple dimensions
- Observability with Weights and Biases: Implementing monitoring and tracking for RAG system performance comparison and experiment management
- Performance Optimization through Feedback Loops: Creating iterative improvement cycles using evaluation results to enhance retrieval and generation performance
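The comparison workflow above can be illustrated with a deliberately simplified stand-in for what Ragas automates: score two retrieval configurations on the same queries. The query/relevance data below is invented, and hit rate and MRR are much cruder than Ragas's multi-dimensional metrics, but the compare-and-iterate loop is the same.

```python
def hit_rate(results: list[list[str]], relevant: list[str]) -> float:
    """Fraction of queries whose gold document appears in the results."""
    hits = sum(1 for docs, gold in zip(results, relevant) if gold in docs)
    return hits / len(relevant)

def mean_reciprocal_rank(results: list[list[str]], relevant: list[str]) -> float:
    """Average of 1/rank of the gold document (0 when it is missing)."""
    total = 0.0
    for docs, gold in zip(results, relevant):
        if gold in docs:
            total += 1.0 / (docs.index(gold) + 1)
    return total / len(relevant)

relevant = ["doc1", "doc4"]                   # gold document per query
naive = [["doc2", "doc1"], ["doc3", "doc9"]]  # naive RAG retrievals
hybrid = [["doc1", "doc2"], ["doc4", "doc3"]] # hybrid RAG retrievals

naive_mrr = mean_reciprocal_rank(naive, relevant)    # 0.25
hybrid_mrr = mean_reciprocal_rank(hybrid, relevant)  # 1.0
```

Logging these per-experiment numbers to a tracker such as Weights and Biases is what turns one-off scores into the feedback loop described above.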
Deployment & Scaling
- FastAPI REST API: Building production-ready APIs with clean documentation and error handling
- Docker Containerization: Full containerization with Docker Compose for scalable deployments
- Elasticsearch Integration: Production-grade document storage and hybrid search capabilities
- Local Development Workflows: Script-based development environment setup and testing
- Hayhooks Framework: Multi-pipeline deployment using Haystack's native REST API framework
- Pipeline Orchestration: Managing multiple RAG pipelines (indexing + querying) as microservices
- Service Discovery: Automated API endpoint generation and pipeline management
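The containerized setup above typically pairs a document store with the pipeline API. The fragment below is purely illustrative (service names, ports, and image tags are examples, not the book's exact files):

```yaml
# Illustrative docker-compose fragment: Elasticsearch plus a pipeline API.
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.11.1
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
    ports:
      - "9200:9200"
  pipeline-api:
    build: .
    ports:
      - "8000:8000"
    environment:
      - ELASTICSEARCH_URL=http://elasticsearch:9200
    depends_on:
      - elasticsearch
```

The API container reaches the store by service name (`elasticsearch`) on the compose network, which keeps the local and deployed configurations identical.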
Real-World Applications & Multi-Agent Systems
Hands-on projects that progress from beginner to advanced complexity, focusing on Named Entity Recognition, Text Classification, and Multi-Agent Systems. Projects include complete notebooks with custom component definitions, pipeline definitions, and pipeline serialization.
- Haystack Pipeline Fundamentals: Building basic pipelines for entity extraction workflows
- Pre-trained NER Models: Using transformer models to identify people, organizations, and locations
- Custom Component Creation: Developing reusable components for text processing
- Web Content Processing: Building pipelines that extract entities from web search results
- SuperComponents and Agents: Wrapping pipelines as tools and building agents for natural language interaction
- Zero-Shot Classification: Categorizing content without training data using LLMs
- External API Integration: Connecting Haystack pipelines with the Yelp API
- Model Performance Evaluation: Assessing classification accuracy on labeled datasets
- Sentiment Analysis Pipelines: Building custom components for analyzing review sentiment
- Haystack Agent Mini Project: Hands-on exercise combining NER and classification pipelines with agent orchestration and Hayhooks deployment
- Pipeline Chaining: Connecting multiple specialized Haystack pipelines into complex workflows
- Hayhooks Deployment: Deploying pipelines as REST API endpoints for agent consumption
- LangGraph Multi-Agent Orchestration: Building intelligent supervisor systems that coordinate specialized agents
- Modular Pipeline Architecture: Creating 4 specialized pipelines (business search, details, sentiment, reporting)
- Ambiguous Input Handling: Using NER and intelligent routing to process natural language queries
- Distributed Data Aggregation: Generating comprehensive reports from multiple data sources
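The supervisor idea behind these projects can be sketched in plain Python (this is a conceptual illustration, not LangGraph code): route a natural-language query to one of the four specialized pipelines. The pipeline names match the ones listed above; the keyword rules are invented and far simpler than the NER-based routing used in the book.

```python
# Toy supervisor: pick a worker pipeline from keywords in the query.
PIPELINES = ["business_search", "business_details", "sentiment", "reporting"]

def route(query: str) -> str:
    q = query.lower()
    if any(word in q for word in ("report", "summary", "summarize")):
        return "reporting"
    if any(word in q for word in ("review", "sentiment", "feel")):
        return "sentiment"
    if any(word in q for word in ("hours", "address", "phone", "details")):
        return "business_details"
    return "business_search"  # default: find candidate businesses first

assert route("Find me sushi places in Vancouver") == "business_search"
assert route("What do reviews say about this cafe?") == "sentiment"
```

In the full system an LLM-backed supervisor replaces the keyword rules, and each route dispatches to a Hayhooks-deployed pipeline endpoint rather than returning a string.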
The epilogue-advanced folder contains an extended, production-grade implementation of the agentic supervisor described in Chapter 8.
- Three Agent Architectures: Progressive implementations from learning (V1 monolithic) to production-ready (V3 with checkpointing)
- State Management Patterns: Understanding how architectural decisions impact token usage and cost (16-50% reduction)
- Monolithic vs Supervisor Patterns: Comparing design approaches with automated token measurement tools
- Production Features: Error handling with retry policies, conversation persistence with checkpointing, and graceful degradation
- Guardrails: Input validation with prompt injection detection and PII sanitization for secure agent interactions
- Checkpointing Systems: Thread-based session management with both in-memory (development) and SQLite (production) persistence options
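The SQLite persistence option above can be illustrated with a minimal stdlib sketch (the pattern only, not LangGraph's checkpointer API): each conversation thread keeps its latest serialized state, keyed by thread id.

```python
import json
import sqlite3

def connect(path: str = ":memory:") -> sqlite3.Connection:
    """Open the checkpoint store; use a file path for real persistence."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints "
        "(thread_id TEXT PRIMARY KEY, state TEXT)"
    )
    return conn

def save_checkpoint(conn, thread_id: str, state: dict) -> None:
    # Upsert: a new checkpoint for a thread replaces the previous one.
    conn.execute(
        "INSERT INTO checkpoints (thread_id, state) VALUES (?, ?) "
        "ON CONFLICT(thread_id) DO UPDATE SET state = excluded.state",
        (thread_id, json.dumps(state)),
    )
    conn.commit()

def load_checkpoint(conn, thread_id: str):
    row = conn.execute(
        "SELECT state FROM checkpoints WHERE thread_id = ?", (thread_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None

conn = connect()  # ":memory:" mirrors the in-memory development option
save_checkpoint(conn, "thread-1", {"messages": ["hi"]})
save_checkpoint(conn, "thread-1", {"messages": ["hi", "hello!"]})
restored = load_checkpoint(conn, "thread-1")
```

Swapping `":memory:"` for a file path is all it takes to move from the development setup to durable, restart-surviving sessions.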