Description
Required prerequisites
- I have searched the Issue Tracker and Discussions to confirm that this hasn't already been reported. (+1 or comment there if it has.)
- Consider asking first in a Discussion.
Motivation
Background
This is a follow-up to issue #3376 and the reverted PR #3259. We've been exploring a new approach to handle long tool outputs that consume excessive tokens in the agent's context.
Problem
When agents use tools that return very long outputs (e.g., web page snapshots, large file contents, API responses), these outputs
accumulate in memory and significantly increase token consumption in subsequent interactions. The previous automatic caching
approach was reverted due to concerns about agent performance and accuracy.
Proposed Solution
Instead of automatically caching long outputs, we propose adding a new tool that allows the agent to proactively manage long tool
outputs. When the agent recognizes that a previous tool output is excessively long, it can call this tool to:
- Generate a summary of the previous long tool output
- Offload the original content with a unique ID for potential future retrieval
- Collapse/replace the long output in memory with the summary + offload reference
If the LLM makes parallel tool calls, it can issue this call alongside its other tool calls, so no additional API request is needed.
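The three steps above could be sketched roughly as follows. This is a minimal illustration rather than a proposed implementation: `ToolMessage`, `OffloadStore`, and `summarize_and_offload` are hypothetical names that do not exist in the codebase, the store is a plain in-memory dict, and the sketch assumes the agent supplies the summary text as a tool argument.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToolMessage:
    """Hypothetical stand-in for a tool-result record in agent memory."""
    tool_call_id: str
    content: str


@dataclass
class OffloadStore:
    """Hypothetical in-memory store mapping offload IDs to original outputs."""
    _items: Dict[str, str] = field(default_factory=dict)

    def put(self, content: str) -> str:
        # Generate a unique ID so the original can be retrieved later.
        offload_id = uuid.uuid4().hex
        self._items[offload_id] = content
        return offload_id

    def get(self, offload_id: str) -> str:
        return self._items[offload_id]


def summarize_and_offload(
    memory: List[ToolMessage],
    store: OffloadStore,
    tool_call_id: str,
    summary: str,
) -> str:
    """Replace a long tool output in memory with a summary plus offload ID.

    Assumption: the agent writes `summary` itself when calling the tool.
    """
    for i, msg in enumerate(memory):
        if msg.tool_call_id == tool_call_id:
            # Offload the original content, then collapse the memory entry.
            offload_id = store.put(msg.content)
            memory[i] = ToolMessage(
                tool_call_id, f"[offloaded:{offload_id}] {summary}"
            )
            return offload_id
    raise KeyError(f"No tool output found for call ID {tool_call_id!r}")


# Usage: collapse a long web page snapshot into a one-line summary.
memory = [ToolMessage("call_1", "<html>" + "x" * 10_000 + "</html>")]
store = OffloadStore()
oid = summarize_and_offload(
    memory, store, "call_1", "Landing page; no pricing information."
)
```

After the call, the memory entry holds only the summary and the offload reference, while the full snapshot remains available through `store.get(oid)`.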
Why This Approach
| Aspect | Previous Approach (Auto-offload) | New Approach (Agent-driven Tool) |
|---|---|---|
| Control | Automatic based on threshold | Agent decides when to summarize |
| Information Retention | ID reference | Summary + offload original |
| Intelligence | Rule-based | Context-aware decision by agent |
| Flexibility | Fixed behavior | Agent can specify what to focus on in summary |
Implementation Considerations
- Modify memory management to support replacing/collapsing previous tool outputs
- Add a companion retrieve_offloaded_output tool for accessing original data when needed
- Write a well-designed docstring to guide agents on when to use this tool
Related
- Issue: [Enhance] Revive Tool Output Offloading #3376
- Reverted PR: feat: add tool call caching for chatAgent #3259
Solution
No response
Alternatives
No response
Additional context
No response