Self-directed Jupyter notebooks for engineers evaluating LangSmith during a POC. The modules cover the full agent engineering loop — build, trace, evaluate, deploy, and surface failure modes — against a single example agent.
| # | Module | Notebook | Duration |
|---|---|---|---|
| 00 | Setup: env, keys, service verification | modules/00_setup.ipynb | ~10 min |
| 01 | Build a Deep Agent: harness, tools, subagents, backends, middleware, HITL, AGENTS.md, skills (optional) | modules/01_build_a_deep_agent_optional.ipynb | ~45 min |
| 02 | Tracing: generate traces and query them with list_runs + filter DSL | modules/02_tracing.ipynb | ~20 min |
| 03 | Finding Failure Modes: Chat, Insights Agent, and Engine | modules/03_finding_failure_modes.ipynb | ~30 min |
| 04 | Datasets and Experiments: offline evaluation (final-response, single-step, trajectory) | modules/04_datasets_and_experiments.ipynb | ~30 min |
| 05 | Online Evaluations: LLM-as-judge run rules that score new traces automatically | modules/05_online_evals.ipynb | ~25 min |
| 06 | Annotation Queues: route low-scoring runs to human review | modules/06_annotation_queues.ipynb | ~20 min |
| 07 | Deploy + Govern: apply workspace-level gateway policies and ship the agent via LangSmith Deployments (optional) | modules/07_deploy_and_govern_optional.ipynb | ~25 min |
Modules are designed to run in order. The full sequence is ~3.5 hours; the required-only path (skipping 01 and 07) is ~2 hours.
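As a quick check on those totals, the per-module durations from the table can be summed directly. This is only arithmetic on the approximate figures listed above, not a measurement; it comes to 205 minutes for the full sequence and 135 minutes for the required-only path.

```python
# Durations in minutes, copied from the module table above (all approximate).
durations = {"00": 10, "01": 45, "02": 20, "03": 30, "04": 30, "05": 25, "06": 20, "07": 25}
optional = {"01", "07"}  # modules tagged _optional in the filename

full_minutes = sum(durations.values())
required_minutes = sum(m for module, m in durations.items() if module not in optional)

print(f"full sequence: {full_minutes} min")
print(f"required-only: {required_minutes} min")
```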
Optional modules are tagged _optional in the filename:
- Module 01 introduces the deepagents framework from scratch. Skip it if you are already familiar with custom tools, subagents, and prompts.
- Module 07 covers deployment via LangSmith. Skip it if you don't have deployment permissions or are using LangSmith strictly for observability and evaluations.
The remaining modules form the core observability + evaluation loop.
- Python 3.11+
- uv (recommended) or pip
- A LangSmith account (sign up)
- An API key from your model provider (Anthropic by default; OpenAI, Azure OpenAI, and AWS Bedrock are also supported — see Switching Models below)
- A Tavily API key for the web search tool (get one)
Module 00 walks through setup end-to-end with verification cells. The short version:
# 1. Install dependencies
uv sync
# 2. Create your .env file
cp .env.example .env
# Edit .env and fill in your keys
# 3. Start Jupyter
uv run jupyter notebook

Then open modules/00_setup.ipynb and run the cells in order to verify Python, dependencies, and credentials.
| Key | Required for | Where to get one |
|---|---|---|
| ANTHROPIC_API_KEY | Modules 01–07 (default model provider) | https://console.anthropic.com |
| LANGSMITH_API_KEY | All modules (tracing + evaluations) | https://smith.langchain.com |
| TAVILY_API_KEY | Modules 01–06 (web search tool used by the agent) | https://tavily.com |
Module 07 (Deploy + Govern) additionally requires a LangSmith service key (lsv2_sk_...), not a personal access token, for deployment permissions.
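Before opening a notebook, it can help to confirm that the keys in the table are actually visible to the Python process. The following is a minimal sketch, not the verification code Module 00 runs. It only inspects the process environment, so it assumes the .env file has already been loaded; `missing_keys` is a hypothetical helper name.

```python
import os

# Key names and module ranges are taken from the table above.
REQUIRED_KEYS = {
    "ANTHROPIC_API_KEY": "Modules 01-07",
    "LANGSMITH_API_KEY": "all modules",
    "TAVILY_API_KEY": "Modules 01-06",
}


def missing_keys(env=None):
    """Return the required key names that are unset or blank in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not (env.get(name) or "").strip()]


if __name__ == "__main__":
    for name in missing_keys():
        print(f"missing {name} (needed for {REQUIRED_KEYS[name]})")
```

If you switch model providers, swap ANTHROPIC_API_KEY for the variable that matches your provider.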
All modules import model from utils/models.py. Change one line there to swap providers — no notebook edits required.
# utils/models.py
from langchain.chat_models import init_chat_model

# Anthropic (default)
model = init_chat_model("anthropic:claude-sonnet-4-6")
# OpenAI
# model = init_chat_model("openai:gpt-4.1-mini")
# Azure OpenAI
# from langchain_openai import AzureChatOpenAI
# model = AzureChatOpenAI(azure_deployment="gpt-4.1-mini", streaming=True)
# AWS Bedrock
# from langchain_aws import ChatBedrockConverse
# model = ChatBedrockConverse(provider="anthropic", model_id="...")

Then set the matching API key environment variable in .env. See .env.example for the full set of supported provider variables.
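If editing the file by hand is inconvenient, for example when the same checkout is shared between people on different providers, the provider could instead be chosen from an environment variable. This is a sketch of that alternative, not how utils/models.py works in this repo. `MODEL_PROVIDER`, `resolve_model_id`, and `load_model` are hypothetical names, and only the two providers that use a plain `init_chat_model` string are covered.

```python
import os

# Model identifiers copied from the snippet above. MODEL_PROVIDER is a
# hypothetical variable introduced for this sketch only.
MODEL_IDS = {
    "anthropic": "anthropic:claude-sonnet-4-6",
    "openai": "openai:gpt-4.1-mini",
}


def resolve_model_id(provider=None):
    """Map a provider name to its 'provider:model' identifier."""
    provider = (provider or os.environ.get("MODEL_PROVIDER", "anthropic")).lower()
    if provider not in MODEL_IDS:
        raise ValueError(f"unsupported provider {provider!r}; expected one of {sorted(MODEL_IDS)}")
    return MODEL_IDS[provider]


def load_model(provider=None):
    """Build the chat model for the chosen provider."""
    # Imported lazily so resolve_model_id can be used without langchain installed.
    from langchain.chat_models import init_chat_model

    return init_chat_model(resolve_model_id(provider))
```

With this in place, `model = load_model()` would replace the single assignment line, and an unknown provider name fails early with a readable error instead of at the first model call.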
Module 07 covers two things: wiring up the LangSmith LLM Gateway with a workspace-level PII/secrets policy, then deploying the governed agent to LangSmith Deployments using the langgraph CLI (installed by uv sync). The deploy config is langgraph.json at the repo root. Two graphs are registered: client_research (the primary deployable) and base_research_agent (a second example for inspection).
Your LANGSMITH_API_KEY must have deployment permissions — use a service key (lsv2_sk_...), not a personal access token. The gateway sections require LANGSMITH_API_KEY_GATEWAY (same value) and WORKSPACE_ID — see .env.example.
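These requirements can be sanity-checked locally before attempting a deploy. The sketch below only inspects the values described above (the lsv2_sk_ prefix, the matching gateway key, and a non-empty WORKSPACE_ID). It cannot confirm that the key actually carries deployment permissions, and `check_deploy_env` is a hypothetical helper, not part of this repo.

```python
SERVICE_KEY_PREFIX = "lsv2_sk_"  # service-key prefix named in the text above


def looks_like_service_key(key):
    """True when the key is non-empty and has the service-key prefix."""
    return bool(key) and key.startswith(SERVICE_KEY_PREFIX)


def check_deploy_env(env):
    """Return a list of human-readable problems found in an environment mapping."""
    problems = []
    api_key = env.get("LANGSMITH_API_KEY", "")
    if not looks_like_service_key(api_key):
        problems.append("LANGSMITH_API_KEY does not start with lsv2_sk_")
    if env.get("LANGSMITH_API_KEY_GATEWAY") != api_key:
        problems.append("LANGSMITH_API_KEY_GATEWAY differs from LANGSMITH_API_KEY")
    if not env.get("WORKSPACE_ID"):
        problems.append("WORKSPACE_ID is not set")
    return problems
```

Calling `check_deploy_env(os.environ)` after loading .env returns an empty list when all three conditions hold.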
langsmith-guided-tour/
├── README.md                          (this file)
├── pyproject.toml                     (shared dependencies)
├── .env.example
├── langgraph.json                     (registers deployable graphs)
├── utils/
│   ├── config.py                      (active agent + project name; single source of truth)
│   ├── models.py                      (model initialization; swap providers here)
│   ├── search.py                      (resilient Tavily wrapper with canned fallbacks)
│   └── langsmith_rules.py             (helpers for run rules + annotation queues)
├── agents/
│   ├── client_research_agent.py       (eval-safe agent imported by Modules 02–05 via utils.config)
│   └── deployable_agents/
│       ├── client_research/           (deployable variant: AGENTS.md, skills, CompositeBackend)
│       │   ├── agent.py
│       │   ├── AGENTS.md
│       │   ├── deepagents.toml
│       │   └── skills/
│       │       ├── client-brief/SKILL.md
│       │       └── portfolio-update/SKILL.md
│       └── base_research_agent/       (second deployable, kept as reference)
│           ├── agent.py
│           ├── AGENTS.md
│           ├── deepagents.toml
│           └── skills/
├── images/                            (diagrams + screenshots referenced by the notebooks)
├── modules/
│   ├── 00_setup.ipynb
│   ├── 01_build_a_deep_agent_optional.ipynb
│   ├── 02_tracing.ipynb
│   ├── 03_finding_failure_modes.ipynb
│   ├── 04_datasets_and_experiments.ipynb
│   ├── 05_online_evals.ipynb
│   ├── 06_annotation_queues.ipynb
│   └── 07_deploy_and_govern_optional.ipynb
└── skills/
    └── customize-poc/                 (Claude Code skill for adapting this repo to a new domain)
        ├── SKILL.md
        └── notebook-customization-guide.md
The repo ships specialized for a client research use case. To adapt it for a different industry or use case, the customize-poc skill at skills/customize-poc/ walks a coding agent (Claude Code, for example) through seven structured discovery questions, then executes the end-to-end customization across the agent code, configuration, and all eight notebook modules.
- Clone the repo:
  git clone https://github.com/langchain-samples/langsmith-guided-tour.git
  cd langsmith-guided-tour
- Create a branch for your variant. Use the examples/<vertical> naming convention (e.g., examples/insurance-claims, examples/legal-contracts):
  git checkout -b examples/<your-vertical>
- Open the repo in a coding agent and invoke the customize-poc skill. The skill auto-loads from .claude/skills/customize-poc/ in any Claude Code session opened on this repo. Start the session, then ask the agent to invoke customize-poc.
- Answer the discovery questions. The skill asks seven structured follow-ups one at a time (persona, tools, demo data, example queries, eval criteria, deployable identity, skills). Three approval checkpoints (after the spec, after the agent code, and before the dataset) catch misunderstandings before they propagate through the notebooks.
- Review the output. The skill runs validation at the end (import probes, notebook syntax checks, residual-content greps). Spot-check a few notebook cells for tone and accuracy before committing.
- Commit and push:
  git add -A
  git commit -m "Add <your-vertical> variant"
  git push origin examples/<your-vertical>
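For the review step, a notebook syntax check can also be run by hand. The sketch below is an independent illustration and not the validation the skill itself performs. It reads an .ipynb file as JSON and tries to parse every code cell with `ast`. Lines that start with `%` or `!` (IPython magics and shell escapes) are blanked out first, so cells built around `%%` cell magics or indented magics may be reported incorrectly.

```python
import ast
import json
from pathlib import Path


def notebook_syntax_errors(path):
    """Return (cell_index, message) for each code cell that fails to parse."""
    nb = json.loads(Path(path).read_text(encoding="utf-8"))
    errors = []
    for index, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # the notebook format allows a list of lines
            source = "".join(source)
        # Blank out IPython magics and shell escapes, which are not valid Python.
        lines = ["" if line.lstrip().startswith(("%", "!")) else line
                 for line in source.splitlines()]
        try:
            ast.parse("\n".join(lines))
        except SyntaxError as exc:
            errors.append((index, f"line {exc.lineno}: {exc.msg}"))
    return errors


if __name__ == "__main__":
    # Assumes the current directory is the repo root.
    for nb_path in sorted(Path("modules").glob("*.ipynb")):
        for cell_index, message in notebook_syntax_errors(nb_path):
            print(f"{nb_path.name} cell {cell_index}: {message}")
```

Run from the repo root, this prints nothing when every code cell parses.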
To contribute a new example back to the samples repo, open a PR against main. To keep the variant private (customer-specific work, internal POCs), fork this repo into your own org first and push the branch there.
langgraph deploy fails with 403 / permission denied
Your API key is a personal access token. Generate a service key (lsv2_sk_...) in LangSmith Settings → Organizations → Access and Security → API Keys.
Notebook can't find utils / agents
Each module's setup cell prepends the repo root to sys.path. If you moved a notebook, update the Path().resolve().parent line to point at the repo root.
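A more position-independent alternative is to search upward for the repo root instead of hard-coding the number of `.parent` hops. This is a sketch under the assumption that pyproject.toml sits at the repo root, as the layout above shows. `find_repo_root` and `add_repo_root_to_path` are hypothetical helpers, not the code in the notebooks' setup cells.

```python
import sys
from pathlib import Path


def find_repo_root(start=".", marker="pyproject.toml"):
    """Walk upward from start until a directory containing marker is found."""
    start = Path(start).resolve()
    for candidate in [start, *start.parents]:
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(f"no {marker} found at or above {start}")


def add_repo_root_to_path(start="."):
    """Prepend the repo root to sys.path (once) and return it as a string."""
    root = str(find_repo_root(start))
    if root not in sys.path:
        sys.path.insert(0, root)
    return root
```

Calling `add_repo_root_to_path()` at the top of a moved notebook makes `import utils` and `import agents` resolve wherever the notebook lives inside the repo.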
Anthropic API: tool_use ids were found without tool_result blocks immediately after
This appears if you submit a regular message to the deployed agent in Studio while a HITL interrupt is pending. The deployable variant in this repo ships without HITL — but if you re-add interrupt_on={...} to agents/deployable_agents/client_research/agent.py, send the resume command as a Command(resume=...) payload rather than plain text.
Chat (Module 03) unavailable
The in-workspace AI assistant requires a model provider API key configured as a workspace secret in LangSmith Settings. Configure one before invoking Chat with Cmd+I / Ctrl+I.