# Context Engineering Deep Dive: Building a Deep Research Agent

import { Callout } from 'nextra/components'

[Context engineering](https://www.promptingguide.ai/guides/context-engineering-guide) requires significant iteration and careful design decisions to build reliable AI agents. This guide takes a deep dive into the practical aspects of context engineering through the development of a basic deep research agent, exploring some of the techniques and design patterns that improve agent reliability and performance.

<Callout type="info" emoji="📚">
This content is based on our new course ["Building Effective AI Agents with n8n"](https://dair-ai.thinkific.com/courses/agents-with-n8n), which provides comprehensive insights, downloadable templates, prompts, and advanced tips for designing and implementing agentic systems.
</Callout>

## The Reality of Context Engineering

Building effective AI agents requires substantial tuning of system prompts and tool definitions. The process involves spending hours iterating on:

- System prompt design and refinement
- Tool definitions and usage instructions
- Agent architecture and communication patterns
- Input/output specifications between agents

Don't underestimate the effort required for context engineering. It's not a one-time task but an iterative process that significantly impacts agent reliability and performance.

## Agent Architecture Design

### The Original Design Problem

Let's look at a basic deep research agent architecture. The initial architecture connects the web search tool directly to the deep research agent. This design places too much burden on a single agent responsible for:

- Managing tasks (creating, updating, deleting)
- Saving information to memory
- Executing web searches
- Generating final reports

**Consequences of this design:**
- Context grew too long
- Agent forgot to execute web searches
- Task completion updates were missed
- Unreliable behavior across different queries

### The Improved Multi-Agent Architecture

The solution involved separating concerns by introducing a dedicated search worker agent:

**Benefits of the multi-agent design:**

1. **Separation of Concerns**: The parent agent (Deep Research Agent) handles planning and orchestration, while the search worker agent focuses exclusively on executing web searches
2. **Improved Reliability**: Each agent has a clear, focused responsibility, reducing the likelihood of missed tasks or forgotten operations
3. **Model Selection Flexibility**: Different agents can use different language models optimized for their specific tasks
   - Deep Research Agent: Uses Gemini 2.5 Pro for complex planning and reasoning
   - Search Worker Agent: Uses Gemini 2.5 Flash for faster, more cost-effective search execution

If you are using models from other providers like OpenAI, you can leverage GPT-5 (for planning and reasoning) and GPT-5-mini (for search execution) for similar performance.

<Callout type="info" emoji="💡">
**Design Principle**: Separating agent responsibilities improves reliability and enables cost-effective model selection for different subtasks.
</Callout>
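
To make the division of labor concrete, here is a minimal Python sketch of the parent/worker split. The names `plan_searches` and `search_worker` are hypothetical stand-ins for the two n8n agents, and the actual LLM and tool calls are omitted:

```python
# Sketch of the orchestration loop: the parent plans tasks and tracks status,
# while the worker is called with nothing but a single query string.
def deep_research(query: str, plan_searches, search_worker) -> dict:
    tasks = [{"query": q, "status": "todo"} for q in plan_searches(query)]
    findings = []
    for task in tasks:
        # The worker never sees the full plan or task metadata.
        findings.append(search_worker(task["query"]))
        task["status"] = "done"  # mirror of the spreadsheet status update
    return {"tasks": tasks, "findings": findings}
```

The key property is that the worker's context stays small no matter how long the parent's plan and history grow.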

## System Prompt Engineering

Here is the full system prompt for the deep research agent we built in n8n:

```md
You are a deep research agent who will help with planning and executing search tasks to generate a deep research report.

## GENERAL INSTRUCTIONS

The user will provide a query, and you will convert that query into a search plan with multiple search tasks (3 web searches). You will execute each search task and maintain the status of those searches in a spreadsheet.

You will then generate a final deep research report for the user.

For context, today's date is: {{ $now.format('yyyy-MM-dd') }}

## TOOL DESCRIPTIONS

Below are some useful instructions for how to use the available tools.

Deleting tasks: Use the delete_task tool to clear up all the tasks before starting the search plan.

Planning tasks: You will create a plan with the search tasks (3 web searches) and add them to the Google Sheet using the append_update_task tool. Make sure to keep the status of each task updated after completing each search. Each task begins with a todo status and will be updated to a "done" status once the search worker returns information regarding the search task.

Executing tasks: Use the Search Worker Agent tool to execute the search plan. The input to the agent is the actual search query, word for word.

Use the tools in the order that makes the most sense to you but be efficient.
```

Let's break it down into parts and discuss why each section is important:

### High-Level Agent Definition

The system prompt begins with a clear definition of the agent's role:

```md
You are a deep research agent who will help with planning and executing search tasks to generate a deep research report.
```

### General Instructions

Provide explicit instructions about the agent's workflow:

```md
## GENERAL INSTRUCTIONS

The user will provide a query, and you will convert that query into a search plan with multiple search tasks (3 web searches). You will execute each search task and maintain the status of those searches in a spreadsheet.

You will then generate a final deep research report for the user.
```

### Providing Essential Context

**Current Date Information:**

Including the current date is crucial for research agents to get up-to-date information:

```md
For context, today's date is: {{ $now.format('yyyy-MM-dd') }}
```

**Why this matters:**
- LLMs typically have knowledge cutoffs months or years behind the current date
- Without current date context, agents often search for outdated information
- This ensures agents understand temporal context for queries like "latest news" or "recent developments"

In n8n, you can dynamically inject the current date using built-in functions with customizable formats (date only, date with time, specific timezones, etc.).
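
For example, the prompt above uses n8n's `$now` expression with a Luxon-style format string. Variants of the same pattern (the format strings here are illustrative) give you date-only or date-plus-time context:

```md
For context, today's date is: {{ $now.format('yyyy-MM-dd') }}
For context, the current date and time is: {{ $now.format('yyyy-MM-dd HH:mm') }}
```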

## Tool Definitions and Usage Instructions

### The Importance of Detailed Tool Descriptions

Tool definitions typically appear in two places:

1. **In the system prompt**: Detailed explanations of what tools do and when to use them
2. **In the actual tool implementation**: Technical specifications and parameters

<Callout type="warning" emoji="🎯">
**Key Insight**: The biggest performance improvements often come from clearly explaining tool usage in the system prompt, not just defining tool parameters.
</Callout>

### Example Tool Instructions

The system prompt also includes detailed instructions for using the available tools:

```md
## TOOL DESCRIPTIONS

Below are some useful instructions for how to use the available tools.

Deleting tasks: Use the delete_task tool to clear up all the tasks before starting the search plan.

Planning tasks: You will create a plan with the search tasks (3 web searches) and add them to the Google Sheet using the append_update_task tool. Make sure to keep the status of each task updated after completing each search. Each task begins with a todo status and will be updated to a "done" status once the search worker returns information regarding the search task.

Executing tasks: Use the Search Worker Agent tool to execute the search plan. The input to the agent is the actual search query, word for word.

Use the tools in the order that makes the most sense to you but be efficient.
```

Initially, without explicit status definitions, the agent would use different status values across runs:
- Sometimes "pending", sometimes "to-do"
- Sometimes "completed", sometimes "done", sometimes "finished"

Be explicit about allowed values. This eliminates ambiguity and ensures consistent behavior.
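
One way to enforce consistent values is to enumerate them directly in the system prompt. Wording along these lines (illustrative, not part of the original prompt) removes the ambiguity:

```md
Task status values: Each task must use exactly one of the following statuses: "todo" or "done". Never invent any other status value.
```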

Note that the system prompt also includes this instruction:

```md
Use the tools in the order that makes the most sense to you but be efficient.
```

What's the reasoning behind this decision?

This provides flexibility for the agent to optimize its execution strategy. During testing, the agent might:
- Execute only 2 searches instead of 3 if it determines that's sufficient
- Combine redundant search queries
- Skip searches that overlap significantly

Here is a specific instruction you can use if you require all search tasks to be executed:

```md
You MUST execute a web search for each and every search task you create.
Do NOT skip any tasks, even if they seem redundant.
```

**When to use flexible vs. rigid approaches:**
- **Flexible**: During development and testing, to observe agent decision-making patterns
- **Rigid**: In production, when consistency and completeness are critical

## Context Engineering Iteration Process

### The Iterative Nature of Improving Context

Context engineering is not a one-time effort. The development process involves:

1. **Initial implementation** with basic system prompts
2. **Testing** with diverse queries
3. **Identifying issues** (missed tasks, wrong status values, incomplete searches)
4. **Adding specific instructions** to address each issue
5. **Re-testing** to validate improvements
6. **Repeating** the cycle

### What's Still Missing

Even after multiple iterations, there are opportunities for further improvement:

**Search Task Metadata:**
- Augmenting search queries
- Search type (web search, news search, academic search, PDF search)
- Time period filters (today, last week, past month, past year, all time)
- Domain focus (technology, science, health, etc.)
- Priority levels for task execution order

**Enhanced Search Planning:**
- More detailed instructions on how to generate search tasks
- Preferred formats for search queries
- Guidelines for breaking down complex queries
- Examples of good vs. bad search task decomposition

**Date Range Specification:**
- Start date and end date for time-bounded searches
- Format specifications for date parameters
- Logic for inferring date ranges from time period keywords

These potential improvements make it clear that web search for AI agents is a challenging problem that demands substantial context engineering.
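
As one concrete example, the date-range inference mentioned above can be handled deterministically instead of being left to the LLM. This is a minimal sketch; the keyword set and the 30/365-day approximations are assumptions, not part of the original agent:

```python
from datetime import date, timedelta

def infer_date_range(keyword: str, today: date) -> tuple[date, date]:
    # Map loose time-period keywords to explicit (start, end) bounds.
    spans = {
        "today": timedelta(days=0),
        "last week": timedelta(weeks=1),
        "past month": timedelta(days=30),
        "past year": timedelta(days=365),
    }
    delta = spans.get(keyword.lower())
    if delta is None:  # "all time" or an unrecognized keyword: unbounded start
        return (date.min, today)
    return (today - delta, today)
```

Explicit start/end dates can then be passed to a search tool that supports date filters, rather than hoping the model embeds them correctly in the query text.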

## Advanced Considerations

### Sub-Agent Communication

When designing multi-agent systems, carefully consider:

**What information does the sub-agent need?**
- For the search worker: Just the search query text
- Not the full context or task metadata
- Keep sub-agent inputs minimal and focused

**What information should the sub-agent return?**
- Search results and relevant findings
- Error states or failure conditions
- Metadata about the search execution
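
These two questions can be pinned down as an explicit I/O contract between the agents. The field names below are illustrative, not the actual n8n payload shapes:

```python
from dataclasses import dataclass, field

@dataclass
class SearchWorkerInput:
    query: str  # just the search query text, nothing else

@dataclass
class SearchWorkerOutput:
    findings: str                 # search results and relevant findings
    failed: bool = False          # error state / failure condition
    sources: list[str] = field(default_factory=list)  # execution metadata
```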

### Context Length Management

As agents execute multiple tasks, context grows:
- Task history accumulates
- Search results add tokens
- Conversation history expands

**Strategies to manage context length:**
- Use separate agents to isolate context
- Implement memory management tools
- Summarize long outputs before adding to context
- Clear task lists between research queries
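
As a minimal illustration of the summarization strategy, a compaction step can cap how many characters a single tool output contributes to the context. The limit and the truncation marker are arbitrary choices for this sketch:

```python
MAX_CHARS = 2000  # arbitrary per-output budget

def compact_for_context(tool_output: str, limit: int = MAX_CHARS) -> str:
    if len(tool_output) <= limit:
        return tool_output
    # Keep the head and flag the cut; a real system might instead call a
    # cheap LLM to produce a proper summary.
    return tool_output[:limit].rstrip() + "\n[...output truncated...]"
```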

### Error Handling in System Prompts

Include instructions for failure scenarios:

```text
ERROR HANDLING:
- If search_worker fails, retry once with rephrased query
- If task cannot be completed, mark status as "failed" with reason
- If critical errors occur, notify user and request guidance
- Never proceed silently when operations fail
```
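
The first retry rule above translates into a small wrapper. Here `search_worker` and `rephrase` are hypothetical stand-ins for the n8n tool call and the query-rewriting step:

```python
def run_search(query, search_worker, rephrase):
    try:
        return search_worker(query)
    except Exception:
        # Retry once with a rephrased query; after that, surface the
        # failure instead of proceeding silently.
        try:
            return search_worker(rephrase(query))
        except Exception as err:
            return {"status": "failed", "reason": str(err), "query": query}
```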

## Conclusion

Context engineering is a critical practice for building reliable AI agents that requires:

- **Significant iteration time** spent tuning prompts and tool definitions
- **Careful architectural decisions** about agent separation and communication
- **Explicit instructions** that eliminate assumptions
- **Continuous refinement** based on observed behavior
- **Balance between flexibility and control**

The deep research agent example demonstrates how thoughtful context engineering transforms an unreliable prototype into a robust, production-ready system. By applying these principles—clear role definitions, explicit tool instructions, essential context provision, and iterative improvement—you can build AI agents that consistently deliver high-quality results.

<Callout type="info" emoji="🎓">
Learn how to build production-ready AI agents with hands-on examples and templates. [Join our comprehensive course!](https://dair-ai.thinkific.com/courses/agents-with-n8n)
Use code PROMPTING20 to get an extra 20% off.
</Callout>