Description
Proposed Content
The production guide should state that agents deployed in production need some kind of execution limit. This could be a timeout, a token budget, or a cap on agent loop iterations.
Location
Rationale
Instruct readers that agents should have clear execution boundaries and should never be able to run indefinitely.
Content Outline (Optional)
No response
References
related: #189
Implementation Requirements
Based on repository analysis and clarification discussion, this task involves adding documentation about execution limits to the production guide.
Target File
docs/user-guide/deploy/operating-agents-in-production.md
Placement
Integrate into the existing "Performance Optimization" section as a new subsection covering execution limits and safety boundaries.
Content to Add
Execution Limits Subsection
Add conceptual guidance covering the following limit types:
- Agent Loop Iteration Limits
  - Limiting the maximum number of LLM calls per agent invocation
  - Reference: Hook-based approach using BeforeModelCallEvent/AfterModelCallEvent
- Tool Invocation Limits
  - Limiting how many times tools can be called
  - Reference: Existing LimitToolCounts hook example in Hooks - Cookbook
- Token Consumption Budgets
  - Model-level max_tokens configuration for response limits
  - Reference: Agent Loop - Stop Reasons
- Execution Timeouts
  - Wall-clock time limits for agent invocations
  - Hook-based or external wrapper approaches
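To make the iteration-limit and timeout ideas concrete, here is a minimal, framework-agnostic sketch in plain Python. The names ExecutionBudget, ExecutionLimitExceeded, and run_toy_loop are hypothetical and not part of any SDK. In a real agent, before_model_call would presumably be invoked from a hook registered for BeforeModelCallEvent; that wiring is omitted because it depends on the SDK's hook API.

```python
import time


class ExecutionLimitExceeded(RuntimeError):
    """Raised when an agent run crosses one of its configured boundaries."""


class ExecutionBudget:
    """Tracks model calls and elapsed wall-clock time for one agent invocation.

    Hypothetical helper for illustration; not an SDK class.
    """

    def __init__(self, max_model_calls, timeout_seconds, clock=time.monotonic):
        self.max_model_calls = max_model_calls
        self.timeout_seconds = timeout_seconds
        self._clock = clock  # injectable so the timeout can be tested
        self._started_at = clock()
        self.model_calls = 0

    def before_model_call(self):
        """Check both limits, then count the call. Call before each LLM request."""
        elapsed = self._clock() - self._started_at
        if elapsed > self.timeout_seconds:
            raise ExecutionLimitExceeded(
                f"timeout: {elapsed:.1f}s elapsed, limit is {self.timeout_seconds}s"
            )
        if self.model_calls >= self.max_model_calls:
            raise ExecutionLimitExceeded(
                f"model call limit of {self.max_model_calls} reached"
            )
        self.model_calls += 1


def run_toy_loop(budget, steps_needed):
    """Stand-in for an agent loop: one simulated model call per step."""
    for _ in range(steps_needed):
        budget.before_model_call()
    return budget.model_calls
```

Note that a check placed between model calls cannot interrupt a call that is already in flight. A hard wall-clock cutoff would need the external wrapper approach, for example cancelling the invocation from outside.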
Multi-Agent Safety Mechanisms Reference
Include a subsection referencing built-in safety mechanisms for multi-agent patterns:
- Swarm: max_handoffs, max_iterations, execution_timeout, node_timeout
  - Reference: Swarm - Safety Mechanisms
- Graph: set_max_node_executions(), set_execution_timeout(), set_node_timeout()
  - Reference: Graph - GraphBuilder
Code Examples
- Conceptual guidance only - point to existing hook examples rather than creating new code
- Include Python examples with TypeScript equivalents where the feature is available
- Use the {{ ts_not_supported_code() }} macro for features not yet available in TypeScript
Documentation Style
- Follow existing documentation patterns in operating-agents-in-production.md
- Use cross-references to existing documentation sections
- Include rationale for why execution limits are important (cost control, preventing infinite loops, resource management)
Files to Modify
docs/user-guide/deploy/operating-agents-in-production.md - Add execution limits subsection
Acceptance Criteria
- New "Execution Limits" subsection added under "Performance Optimization"
- All four limit types covered with conceptual guidance
- Cross-references to existing hook documentation (hooks.md)
- Cross-references to multi-agent safety mechanisms (swarm.md, graph.md)
- Python examples included
- TypeScript equivalents included where available (or macro used for unavailable features)
- Documentation builds successfully (mkdocs build)
- TypeScript validation passes (npm run test)
Related Issues
- Issue [DOCS] Limit maximum number of LLM calls in the agent loop #189 covers hooks-specific documentation for limiting LLM calls (separate concern, not duplicated here)
Notes for Implementation
- The existing LimitToolCounts hook in hooks.md provides a good pattern to reference
- Multi-agent patterns (Swarm/Graph) have built-in limits that should be highlighted as best practices
- Consider adding a brief note about monitoring token usage (ties into existing "Monitoring and Observability" section)
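As one possible illustration for the token-monitoring note, the sketch below accumulates per-call token usage against a budget. TokenBudget and its fields are hypothetical names. The usage numbers are assumed to come from whatever metrics the model provider or SDK reports, which this example does not model.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    """Accumulates token usage for one agent invocation (hypothetical helper)."""

    max_total_tokens: int
    input_tokens: int = 0
    output_tokens: int = 0

    @property
    def total_tokens(self):
        return self.input_tokens + self.output_tokens

    @property
    def remaining(self):
        # Never report a negative remainder once the budget is overspent.
        return max(0, self.max_total_tokens - self.total_tokens)

    def record(self, input_tokens, output_tokens):
        """Add one call's usage; return True while the run is within budget."""
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens
        return self.total_tokens <= self.max_total_tokens


# Usage: stop the run (or alert) once record() returns False.
budget = TokenBudget(max_total_tokens=1000)
within_budget = budget.record(input_tokens=400, output_tokens=300)
```

A tracker like this complements the model-level max_tokens setting: max_tokens bounds a single response, while the budget bounds the whole invocation.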