Commit 222e876

Elvis Saravia authored and Claude committed

**Add Context Engineering Deep Dive and Deep Agents guides**

- Add new Context Engineering Deep Dive article with detailed system prompt engineering techniques
- Add Deep Agents guide covering planning, orchestrator architecture, and verification
- Fix typos in Deep Agents article (orchestator -> orchestrator, underpsecified -> underspecified)
- Add agent architecture images for customer support system
- Update navigation metadata to include new articles

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

1 parent ac4f737 · commit 222e876 · 9 files changed: +365 −1 lines

Added images include img/agents/cs-planning.png, img/agents/cs-subagents.png, and other agent architecture images.
pages/agents/_meta.en.json

Lines changed: 3 additions & 1 deletion

```diff
@@ -2,5 +2,7 @@
   "introduction": "Introduction to Agents",
   "components": "Agent Components",
   "ai-workflows-vs-ai-agents": "AI Workflows vs AI Agents",
-  "context-engineering": "Context Engineering for AI Agents"
+  "context-engineering": "Context Engineering for AI Agents",
+  "context-engineering-deep-dive": "Context Engineering Deep Dive",
+  "deep-agents": "Deep Agents"
 }
```

Lines changed: 284 additions & 0 deletions

@@ -0,0 +1,284 @@
# Context Engineering Deep Dive: Building a Deep Research Agent

import { Callout } from 'nextra/components'

[Context engineering](https://www.promptingguide.ai/guides/context-engineering-guide) requires significant iteration and careful design decisions to build reliable AI agents. This guide takes a deep dive into the practical aspects of context engineering through the development of a basic deep research agent, exploring techniques and design patterns that improve agent reliability and performance.

<Callout type="info" emoji="📚">
This content is based on our new course ["Building Effective AI Agents with n8n"](https://dair-ai.thinkific.com/courses/agents-with-n8n), which provides comprehensive insights, downloadable templates, prompts, and advanced tips for designing and implementing agentic systems.
</Callout>
## The Reality of Context Engineering

Building effective AI agents requires substantial tuning of system prompts and tool definitions. The process involves spending hours iterating on:

- System prompt design and refinement
- Tool definitions and usage instructions
- Agent architecture and communication patterns
- Input/output specifications between agents

Don't underestimate the effort required for context engineering. It's not a one-time task but an iterative process that significantly impacts agent reliability and performance.
## Agent Architecture Design

### The Original Design Problem

![deep-research-agent](../../img/agents/simple-dr-agent.png)

Let's look at a basic deep research agent architecture. The initial architecture connects the web search tool directly to the deep research agent. This design places too much burden on a single agent responsible for:

- Managing tasks (creating, updating, deleting)
- Saving information to memory
- Executing web searches
- Generating final reports

**Consequences of this design:**

- Context grew too long
- Agent forgot to execute web searches
- Task completion updates were missed
- Unreliable behavior across different queries
### The Improved Multi-Agent Architecture

The solution involved separating concerns by introducing a dedicated search worker agent:

**Benefits of the multi-agent design:**

1. **Separation of Concerns**: The parent agent (Deep Research Agent) handles planning and orchestration, while the search worker agent focuses exclusively on executing web searches
2. **Improved Reliability**: Each agent has a clear, focused responsibility, reducing the likelihood of missed tasks or forgotten operations
3. **Model Selection Flexibility**: Different agents can use different language models optimized for their specific tasks
   - Deep Research Agent: Uses Gemini 2.5 Pro for complex planning and reasoning
   - Search Worker Agent: Uses Gemini 2.5 Flash for faster, more cost-effective search execution

If you are using models from other providers like OpenAI, you can pair GPT-5 (for planning and reasoning) with GPT-5-mini (for search execution) for similar performance.

<Callout type="info" emoji="💡">
**Design Principle**: Separating agent responsibilities improves reliability and enables cost-effective model selection for different subtasks.
</Callout>
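To make the division of labor concrete, here is a minimal Python sketch of the planner/worker split described above. Everything here is illustrative: the model names simply mirror the Gemini pairing from the list, and `search_worker` is a stand-in for the real n8n Search Worker Agent tool, not an actual search call.

```python
from dataclasses import dataclass

# Illustrative model assignments mirroring the planner/worker split above.
PLANNER_MODEL = "gemini-2.5-pro"
WORKER_MODEL = "gemini-2.5-flash"

@dataclass
class SearchTask:
    query: str
    status: str = "todo"  # "todo" -> "done"

def search_worker(query: str, model: str = WORKER_MODEL) -> str:
    """Stand-in for the Search Worker Agent: one web search per call."""
    # A real implementation would invoke a search tool / LLM here.
    return f"[{model}] results for: {query}"

def deep_research_agent(user_query: str) -> dict:
    """Parent agent: plans 3 searches, delegates each one, tracks status."""
    plan = [SearchTask(f"{user_query} ({angle})")
            for angle in ("overview", "recent developments", "expert analysis")]
    findings = []
    for task in plan:
        findings.append(search_worker(task.query))  # delegate, don't search inline
        task.status = "done"                        # keep the task sheet updated
    return {"tasks": plan, "report": "\n".join(findings)}
```

The point of the sketch is the separation: the parent function never performs searches itself, so its own "context" stays limited to planning and status tracking.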
## System Prompt Engineering

Here is the full system prompt for the deep research agent we built in n8n:

```md
You are a deep research agent who will help with planning and executing search tasks to generate a deep research report.

## GENERAL INSTRUCTIONS

The user will provide a query, and you will convert that query into a search plan with multiple search tasks (3 web searches). You will execute each search task and maintain the status of those searches in a spreadsheet.

You will then generate a final deep research report for the user.

For context, today's date is: {{ $now.format('yyyy-MM-dd') }}

## TOOL DESCRIPTIONS

Below are some useful instructions for how to use the available tools.

Deleting tasks: Use the delete_task tool to clear all the tasks before starting the search plan.

Planning tasks: You will create a plan with the search tasks (3 web searches) and add them to the Google Sheet using the append_update_task tool. Make sure to keep the status of each task updated after completing each search. Each task begins with a "todo" status and is updated to a "done" status once the search worker returns information regarding the search task.

Executing tasks: Use the Search Worker Agent tool to execute the search plan. The inputs to the agent are the actual search queries, word for word.

Use the tools in the order that makes the most sense to you, but be efficient.
```
Let's break it down into parts and discuss why each section is important.

### High-Level Agent Definition

The system prompt begins with a clear definition of the agent's role:

```md
You are a deep research agent who will help with planning and executing search tasks to generate a deep research report.
```
### General Instructions

Provide explicit instructions about the agent's workflow:

```md
## GENERAL INSTRUCTIONS

The user will provide a query, and you will convert that query into a search plan with multiple search tasks (3 web searches). You will execute each search task and maintain the status of those searches in a spreadsheet.

You will then generate a final deep research report for the user.
```
### Providing Essential Context

**Current Date Information:**

Including the current date is crucial for research agents to retrieve up-to-date information:

```md
For context, today's date is: {{ $now.format('yyyy-MM-dd') }}
```

**Why this matters:**

- LLMs typically have knowledge cutoffs months or years behind the current date
- Without current date context, agents often search for outdated information
- This ensures agents understand temporal context for queries like "latest news" or "recent developments"

In n8n, you can dynamically inject the current date using built-in functions with customizable formats (date only, date with time, specific timezones, etc.).
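Outside n8n, the same date-injection idea can be sketched in plain Python. The `{{today}}` placeholder and the `build_system_prompt` helper are hypothetical names for illustration; inside n8n this is handled by the `$now` expression shown above.

```python
from datetime import datetime, timezone

def build_system_prompt(template: str) -> str:
    """Rough Python analogue of n8n's {{ $now.format('yyyy-MM-dd') }} expression."""
    today = datetime.now(timezone.utc).strftime("%Y-%m-%d")  # e.g. "2025-01-15"
    return template.replace("{{today}}", today)

prompt = build_system_prompt("For context, today's date is: {{today}}")
```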
## Tool Definitions and Usage Instructions

### The Importance of Detailed Tool Descriptions

Tool definitions typically appear in two places:

1. **In the system prompt**: Detailed explanations of what tools do and when to use them
2. **In the actual tool implementation**: Technical specifications and parameters

<Callout type="warning" emoji="🎯">
**Key Insight**: The biggest performance improvements often come from clearly explaining tool usage in the system prompt, not just defining tool parameters.
</Callout>
### Example Tool Instructions

The system prompt also includes detailed instructions for using the available tools:

```md
## TOOL DESCRIPTIONS

Below are some useful instructions for how to use the available tools.

Deleting tasks: Use the delete_task tool to clear all the tasks before starting the search plan.

Planning tasks: You will create a plan with the search tasks (3 web searches) and add them to the Google Sheet using the append_update_task tool. Make sure to keep the status of each task updated after completing each search. Each task begins with a "todo" status and is updated to a "done" status once the search worker returns information regarding the search task.

Executing tasks: Use the Search Worker Agent tool to execute the search plan. The inputs to the agent are the actual search queries, word for word.

Use the tools in the order that makes the most sense to you, but be efficient.
```
Initially, without explicit status definitions, the agent would use different status values across runs:

- Sometimes "pending", sometimes "to-do"
- Sometimes "completed", sometimes "done", sometimes "finished"

Be explicit about allowed values. This eliminates ambiguity and ensures consistent behavior.
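One way to back up explicit prompt instructions is to normalize whatever status string the model emits before writing it to the task sheet. This is an illustrative sketch, not part of the n8n workflow; the synonym table and helper name are assumptions.

```python
ALLOWED_STATUSES = {"todo", "done"}  # the only two values the prompt permits

def normalize_status(raw: str) -> str:
    """Map the model's free-form status strings onto the allowed vocabulary."""
    synonyms = {
        "pending": "todo", "to-do": "todo", "todo": "todo",
        "completed": "done", "finished": "done", "done": "done",
    }
    status = synonyms.get(raw.strip().lower())
    if status is None:
        # Surface the drift instead of silently writing a bad value.
        raise ValueError(f"Unexpected status: {raw!r}")
    return status
```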
Note that the system prompt also includes this instruction:

```md
Use the tools in the order that makes the most sense to you, but be efficient.
```

What's the reasoning behind this decision?

This provides flexibility for the agent to optimize its execution strategy. During testing, the agent might:

- Execute only 2 searches instead of 3 if it determines that's sufficient
- Combine redundant search queries
- Skip searches that overlap significantly

Here is a specific instruction you can use if you require all search tasks to be executed:

```md
You MUST execute a web search for each and every search task you create.
Do NOT skip any tasks, even if they seem redundant.
```
**When to use flexible vs. rigid approaches:**

- **Flexible**: During development and testing, to observe agent decision-making patterns
- **Rigid**: In production, when consistency and completeness are critical
## Context Engineering Iteration Process

### The Iterative Nature of Improving Context

Context engineering is not a one-time effort. The development process involves:

1. **Initial implementation** with basic system prompts
2. **Testing** with diverse queries
3. **Identifying issues** (missed tasks, wrong status values, incomplete searches)
4. **Adding specific instructions** to address each issue
5. **Re-testing** to validate improvements
6. **Repeating** the cycle
### What's Still Missing

Even after multiple iterations, there are opportunities for further improvement:

**Search Task Metadata:**

- Augmented search queries
- Search type (web search, news search, academic search, PDF search)
- Time period filters (today, last week, past month, past year, all time)
- Domain focus (technology, science, health, etc.)
- Priority levels for task execution order

**Enhanced Search Planning:**

- More detailed instructions on how to generate search tasks
- Preferred formats for search queries
- Guidelines for breaking down complex queries
- Examples of good vs. bad search task decomposition

**Date Range Specification:**

- Start date and end date for time-bounded searches
- Format specifications for date parameters
- Logic for inferring date ranges from time period keywords

These recommended improvements make it easy to appreciate that web search for AI agents is a challenging problem that requires a lot of context engineering.
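As a sketch of what richer task metadata could look like, here is a hypothetical task record with the fields suggested above. The type aliases, field names, and defaults are illustrative assumptions, not part of the actual workflow.

```python
from dataclasses import dataclass
from typing import Literal, Optional

SearchType = Literal["web", "news", "academic", "pdf"]
TimePeriod = Literal["today", "last_week", "past_month", "past_year", "all_time"]

@dataclass
class SearchTaskSpec:
    """One row of the task sheet, extended with the metadata suggested above."""
    query: str
    search_type: SearchType = "web"
    time_period: TimePeriod = "all_time"
    domain_focus: Optional[str] = None  # e.g. "technology", "health"
    priority: int = 1                   # lower number = execute earlier
    status: str = "todo"

# Priority levels let the orchestrator order execution explicitly.
tasks = sorted(
    [SearchTaskSpec("LLM agents survey", priority=2),
     SearchTaskSpec("agent frameworks news", search_type="news",
                    time_period="past_month", priority=1)],
    key=lambda t: t.priority,
)
```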
## Advanced Considerations

### Sub-Agent Communication

When designing multi-agent systems, carefully consider:

**What information does the sub-agent need?**

- For the search worker: just the search query text
- Not the full context or task metadata
- Keep sub-agent inputs minimal and focused

**What information should the sub-agent return?**

- Search results and relevant findings
- Error states or failure conditions
- Metadata about the search execution
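A rough sketch of that contract in Python: the worker takes only the query text and returns findings together with an explicit error state and execution metadata. The `WorkerResult` shape and field names are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class WorkerResult:
    """What the search worker hands back: findings, error state, metadata."""
    query: str
    findings: str = ""
    error: Optional[str] = None
    sources_checked: int = 0

def call_search_worker(query: str) -> WorkerResult:
    # Pass only the query text -- no task sheet, no full conversation history.
    if not query.strip():
        return WorkerResult(query=query, error="empty query")
    # Stand-in for the actual search; a real worker would run the tool here.
    return WorkerResult(query=query,
                        findings=f"summary for: {query}",
                        sources_checked=5)
```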
### Context Length Management

As agents execute multiple tasks, context grows:

- Task history accumulates
- Search results add tokens
- Conversation history expands

**Strategies to manage context length:**

- Use separate agents to isolate context
- Implement memory management tools
- Summarize long outputs before adding them to context
- Clear task lists between research queries
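The summarize-then-evict idea from the list above can be sketched as follows. The character budget and the truncating `summarize` placeholder are stand-ins: a real agent would summarize with an LLM call and count tokens rather than characters.

```python
def summarize(text: str, max_chars: int = 200) -> str:
    """Placeholder summarizer: a real agent would use an LLM call here."""
    return text if len(text) <= max_chars else text[: max_chars - 3] + "..."

def add_to_context(context: list, entry: str, budget_chars: int = 1000) -> list:
    """Summarize long outputs before appending, then drop the oldest
    entries once the running total exceeds the budget."""
    context = context + [summarize(entry)]
    while sum(len(e) for e in context) > budget_chars and len(context) > 1:
        context.pop(0)  # evict oldest first
    return context
```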
### Error Handling in System Prompts

Include instructions for failure scenarios:

```text
ERROR HANDLING:
- If search_worker fails, retry once with rephrased query
- If task cannot be completed, mark status as "failed" with reason
- If critical errors occur, notify user and request guidance
- Never proceed silently when operations fail
```
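The retry rule above can also be enforced in the harness around the agent, not just in the prompt. This sketch assumes a hypothetical `search_fn` callable and mirrors the "retry once with a rephrased query, then mark as failed with a reason" behavior.

```python
def run_with_retry(search_fn, query: str, rephrase_fn=None) -> dict:
    """Retry once with a rephrased query; on repeated failure, return a
    'failed' record with a reason instead of failing silently."""
    attempts = [query]
    if rephrase_fn:
        attempts.append(rephrase_fn(query))
    last_error = None
    for q in attempts:
        try:
            return {"status": "done", "query": q, "result": search_fn(q)}
        except Exception as exc:  # broad catch is deliberate in this sketch
            last_error = str(exc)
    return {"status": "failed", "query": query, "reason": last_error}
```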
## Conclusion

Context engineering is a critical practice for building reliable AI agents. It requires:

- **Significant iteration time** spent tuning prompts and tool definitions
- **Careful architectural decisions** about agent separation and communication
- **Explicit instructions** that eliminate assumptions
- **Continuous refinement** based on observed behavior
- **Balance between flexibility and control**

The deep research agent example demonstrates how thoughtful context engineering transforms an unreliable prototype into a robust, production-ready system. By applying these principles (clear role definitions, explicit tool instructions, essential context provision, and iterative improvement), you can build AI agents that consistently deliver high-quality results.

<Callout type="info" emoji="🎓">
Learn how to build production-ready AI agents with hands-on examples and templates. [Join our comprehensive course!](https://dair-ai.thinkific.com/courses/agents-with-n8n)
Use code PROMPTING20 to get an extra 20% off.
</Callout>

pages/agents/deep-agents.en.mdx

Lines changed: 78 additions & 0 deletions

@@ -0,0 +1,78 @@
# Deep Agents

import { Callout } from 'nextra/components'

Most agents today are shallow.

They easily break down on long, multi-step problems (e.g., deep research or agentic coding).

That’s changing fast!

We’re entering the era of "Deep Agents": systems that strategically plan, remember, and delegate intelligently to solve very complex problems.

We at the [DAIR.AI Academy](https://dair-ai.thinkific.com/), along with folks from [LangChain](https://docs.langchain.com/labs/deep-agents/overview) and [Claude Code](https://www.anthropic.com/engineering/building-agents-with-the-claude-agent-sdk), and more recently individuals like [Philipp Schmid](https://www.philschmid.de/agents-2.0-deep-agents), have been documenting this idea.

Here is an example of a deep agent built to power the [DAIR.AI Academy's](https://dair-ai.thinkific.com/) customer support system, intended for students to ask questions about our trainings and courses:

![deep-agent](../../img/agents/customer-support-deep-agent.png)

<Callout type="info" emoji="📚">
This post is based on our new course ["Building Effective AI Agents with n8n"](https://dair-ai.thinkific.com/courses/agents-with-n8n), which provides comprehensive insights, downloadable templates, prompts, and advanced tips for designing and implementing deep agents.
</Callout>
Here’s roughly the core idea behind Deep Agents (based on my own thoughts and on notes I've gathered from others):

## Planning

![cs-planning](../../img/agents/cs-planning.png)

Instead of reasoning ad hoc inside a single context window, Deep Agents maintain structured task plans they can update, retry, and recover from. Think of it as a living to-do list that guides the agent toward its long-term goal. To experience this, just try out planning in Claude Code or Codex; the results are significantly better once you enable it before executing any task.

We have also written recently on the power of brainstorming for longer with Claude Code, which shows the value of planning, expert context, and human-in-the-loop (your expertise gives you an important edge when working with deep agents). Planning will also be critical for long-horizon problems (think agents for scientific discovery, which comes next).
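A minimal sketch of such a living to-do list, with statuses the agent can update and reset on retry. The class names, field names, and status vocabulary are illustrative assumptions, not a specification of any particular framework.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PlanStep:
    description: str
    status: str = "todo"  # illustrative vocabulary: todo -> done

@dataclass
class Plan:
    """A living to-do list the agent updates as it works toward its goal."""
    goal: str
    steps: List[PlanStep] = field(default_factory=list)

    def next_step(self) -> Optional[PlanStep]:
        # The agent always resumes from the first unfinished step.
        return next((s for s in self.steps if s.status != "done"), None)

    def retry(self, step: PlanStep) -> None:
        step.status = "todo"  # recover by re-queuing instead of giving up
```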
## Orchestrator & Sub-agent Architecture

![cs-subagents](../../img/agents/cs-subagents.png)

One big agent (typically with a very long context) is no longer enough. I've seen [arguments](https://cognition.ai/blog/dont-build-multi-agents) against multi-agent systems and in favor of monolithic systems, but I'm skeptical of them.

The orchestrator-sub-agent architecture is one of the most powerful LLM-based agentic architectures you can leverage today, for any domain you can imagine. An orchestrator manages specialized sub-agents such as search agents, coders, KB retrievers, analysts, verifiers, and writers, each with its own clean context and domain focus.

The orchestrator delegates intelligently, and sub-agents execute efficiently. The orchestrator then integrates their outputs into a coherent result. Claude Code popularized this approach for coding, and sub-agents, it turns out, are particularly useful for efficiently managing context (through separation of concerns).

I wrote a few notes on the power of using orchestrators and sub-agents [here](https://x.com/omarsar0/status/1960877597191245974) and [here](https://x.com/omarsar0/status/1971975884077965783).
## Context Retrieval and Agentic Search

![persistent-storage](../../img/agents/cs-persistent-storage.png)

Deep Agents don’t rely on conversation history alone. They store intermediate work in external memory such as files, notes, vectors, or databases, letting them reference what matters without overloading the model’s context. High-quality structured memory is a thing of beauty.

Take a look at recent works like [ReasoningBank](https://arxiv.org/abs/2509.25140) and [Agentic Context Engineering](https://arxiv.org/abs/2510.04618) for some really cool ideas on how to better optimize memory building and retrieval. Building with the orchestrator-sub-agent architecture also means you can leverage hybrid memory techniques (e.g., agentic search + semantic search) and let the agent decide which strategy to use.
## Context Engineering

One of the worst things you can do when interacting with these types of agents is provide underspecified instructions/prompts. Prompt engineering was and is important, but we will use the newer term [context engineering](https://www.promptingguide.ai/guides/context-engineering-guide) to emphasize the importance of building context for agents. The instructions need to be more explicit, detailed, and intentional to define when to plan, when to use a sub-agent, how to name files, and how to collaborate with humans. Part of context engineering also involves efforts around structured outputs, system prompt optimization, compacting context, evaluating context effectiveness, and [optimizing tool definitions](https://www.anthropic.com/engineering/writing-tools-for-agents).

Read our previous guide on context engineering to learn more: [Context Engineering Deep Dive](https://www.promptingguide.ai/guides/context-engineering-guide)
## Verification

![verification agent](../../img/agents/cs-verification-agent.png)

Next to context engineering, verification is one of the most important components of an agentic system (though it is less often discussed). Verification boils down to checking outputs, which can be automated (LLM-as-a-Judge) or done by a human. Because modern LLMs are so effective at generating text (in domains like math and coding), it's easy to forget that they still suffer from hallucination, sycophancy, prompt injection, and a number of other issues. Verification helps make your agents more reliable and more production-ready. You can build good verifiers by leveraging systematic evaluation pipelines.
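An automated verifier can be as simple as a set of named checks run over the final output; in practice each check might itself be an LLM-as-a-Judge call. The check names and the citation requirement below are hypothetical, chosen just to illustrate the pattern.

```python
def judge(answer: str, checks: dict) -> dict:
    """Run each named check over the output and collect failures,
    so bad outputs are flagged rather than shipped silently."""
    failures = [name for name, check in checks.items() if not check(answer)]
    return {"verified": not failures, "failures": failures}

# Hypothetical checks; a real pipeline could replace these with judge-model calls.
checks = {
    "non_empty": lambda a: bool(a.strip()),
    "cites_source": lambda a: "http" in a,
}
```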
## Final Words

This is a huge shift in how we build with AI agents. Deep agents also feel like an important building block for what comes next: personalized, proactive agents that can act on our behalf. I will write more on proactive agents in a future post.

I've been teaching these ideas to agent builders over the past couple of months. If you are interested in more hands-on experience with building deep agents, check out the new course in our academy: https://dair-ai.thinkific.com/courses/agents-with-n8n

The figures you see in this post describe an agentic RAG system that students build for the course's final project.
<Callout type="info" emoji="📚">
This post is based on our new course ["Building Effective AI Agents with n8n"](https://dair-ai.thinkific.com/courses/agents-with-n8n), which provides comprehensive insights, downloadable templates, prompts, and advanced tips for designing and implementing deep agents.
</Callout>

*Written by Elvis Saravia (creator of the Prompt Engineering Guide and co-founder of the DAIR.AI Academy)*
