Description
This issue was created from Discord post 1465588219184812052:
We hit `context_length_exceeded` because one of our tools returns a large JSON array (600k+ chars). The interesting thing is that we do have `TokenLimiter(128000)` and `ToolCallFilter()` as `inputProcessors`, but those didn't help...
If I understand this correctly, I think the reason why they didn't help is because:
- `processInput` runs once at the start -- it trims history, not current-step tool results
- `processInputStep` runs at each step, but `TokenLimiter` drops older messages to make room -- it won't truncate the content of a single oversized tool-result message (since it's the most recent, it gets kept)
- So the tool result goes straight to the LLM and blows up the context window.
Our workaround: We cap the tool response inside the tool's execute function before returning it. For arrays, we keep whole items that fit within a budget. The budget is derived dynamically from the model's context window.
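The capping logic can be sketched roughly like this (a minimal, framework-agnostic sketch; `capArrayResult`, `charBudgetFor`, and the chars-per-token heuristic are our own names and assumptions, not a Mastra API):

```typescript
// Hypothetical helper: keep whole array items that fit within a character budget.
function capArrayResult<T>(items: T[], budgetChars: number): T[] {
  const kept: T[] = [];
  let used = 2; // account for the surrounding "[]"
  for (const item of items) {
    const len = JSON.stringify(item).length + 1; // +1 for a separating comma
    if (used + len > budgetChars) break; // never split an item mid-way
    kept.push(item);
    used += len;
  }
  return kept;
}

// Derive the budget from the model's context window (rough heuristic:
// ~4 chars per token, and reserve a fraction of the window for the
// conversation itself). Both numbers are assumptions to tune.
function charBudgetFor(contextWindowTokens: number, reservedFraction = 0.5): number {
  const charsPerToken = 4;
  return Math.floor(contextWindowTokens * (1 - reservedFraction) * charsPerToken);
}

// Example: cap a tool result for a 128k-token model before returning it
// from the tool's execute function.
const rows = Array.from({ length: 10_000 }, (_, i) => ({ id: i, data: "x".repeat(100) }));
const capped = capArrayResult(rows, charBudgetFor(128_000));
```

The important property for us is that the truncation keeps whole items, so the model never sees a half-serialized JSON object.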
Is this the right approach? Or is there a built-in Mastra way to handle oversized tool results that I'm missing? Should I be using `processInputStep` differently to truncate individual message content?
(This is a bit of a priority for us since our agent is live and consistently crashing because of this issue)