
Is capping tool response size inside execute the right pattern? #12348

@daneatmastra

Description


This issue was created from Discord post 1465588219184812052:


We hit context_length_exceeded because one of our tools returns a large JSON array (600k+ chars). The interesting thing is that we do have TokenLimiter(128000) and ToolCallFilter() as inputProcessors, but those didn't help...

If I understand this correctly, they didn't help because:

  • processInput runs once at the start -- it trims history, not the current step's tool results
  • processInputStep runs at each step, but TokenLimiter makes room by dropping older messages -- it never truncates the content of a single oversized tool-result message (being the most recent, it is kept)
  • So the tool result goes straight to the LLM and blows up the context window.
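To illustrate the failure mode described above, here is a generic sketch of a drop-oldest token limiter. This is not Mastra's actual implementation; `countTokens` (a naive 4-chars-per-token estimate) and all names are stand-ins:

```typescript
// Generic sketch of a drop-oldest token limiter (not Mastra's API).
type Message = { role: string; content: string };

// Naive estimate: ~4 characters per token.
const countTokens = (m: Message): number => Math.ceil(m.content.length / 4);

// Drops the OLDEST messages until the total fits the budget.
// An oversized message at the end of the array always survives,
// which is exactly why a huge tool result still reaches the LLM.
function dropOldestToFit(messages: Message[], maxTokens: number): Message[] {
  const kept = [...messages];
  let total = kept.reduce((sum, m) => sum + countTokens(m), 0);
  while (kept.length > 1 && total > maxTokens) {
    total -= countTokens(kept.shift()!);
  }
  return kept;
}

const history: Message[] = [
  { role: "user", content: "a".repeat(400) },      // ~100 tokens
  { role: "assistant", content: "b".repeat(400) }, // ~100 tokens
  { role: "tool", content: "c".repeat(600_000) },  // ~150k tokens
];

const trimmed = dropOldestToFit(history, 128_000);
// Everything older gets dropped, but the oversized tool result is the
// most recent message, so it is kept even though it alone blows the budget.
console.log(trimmed.length, trimmed[0].role); // prints: 1 tool
```

The point of the sketch: a limiter that only evicts whole older messages can never shrink the newest one.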

Our workaround: we cap the tool response inside the tool's execute function before returning it. For arrays, we keep only whole items that fit within a budget, which is derived dynamically from the model's context window.
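A minimal sketch of that workaround, with hypothetical names throughout (`capItemsToBudget`, `fetchLargeJsonArray`) and a hard-coded character budget standing in for one derived from the model's context window:

```typescript
// Hypothetical character budget; in practice this would be derived
// from the target model's context window (~4 chars per token).
const CHAR_BUDGET = 100_000;

// Keep only whole items whose serialized size fits within the budget.
function capItemsToBudget<T>(items: T[], budget: number): T[] {
  const kept: T[] = [];
  let used = 0;
  for (const item of items) {
    const size = JSON.stringify(item).length;
    if (used + size > budget) break; // stop at the first item that doesn't fit
    kept.push(item);
    used += size;
  }
  return kept;
}

// Stand-in for the real data source that returns a 600k+ char payload.
async function fetchLargeJsonArray(): Promise<string[]> {
  return Array.from({ length: 2000 }, (_, i) => "row".repeat(100) + i);
}

// Inside the tool definition, cap before returning and flag truncation
// so the model knows the result is partial.
async function execute(): Promise<{ items: unknown[]; truncated: boolean }> {
  const all = await fetchLargeJsonArray();
  const items = capItemsToBudget(all, CHAR_BUDGET);
  return { items, truncated: items.length < all.length };
}
```

Returning a `truncated` flag alongside the capped items lets the model (or a follow-up tool call) know the result is incomplete rather than silently short.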

Is this the right approach? Or is there a built-in Mastra way to handle oversized tool results that I'm missing? Should I be using processInputStep differently to truncate individual message content?
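Whichever layer the truncation ends up living in (the tool or a step processor), the content-level cap itself is simple. A generic helper, independent of any processor API, with arbitrary names and an arbitrary marker:

```typescript
// Truncate a single oversized message's content to a character cap,
// appending a marker so the model can tell the content was cut.
function truncateContent(content: string, maxChars: number): string {
  if (content.length <= maxChars) return content;
  const marker = "…[truncated]";
  return content.slice(0, maxChars - marker.length) + marker;
}

const big = "x".repeat(600_000);
const small = truncateContent(big, 10_000);
console.log(small.length); // prints: 10000
```

For JSON payloads, cutting at a character boundary produces invalid JSON, which is why the whole-items-within-a-budget approach above is usually preferable for arrays.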

(This is a bit of a priority for us since our agent is live and consistently crashing because of this issue)

Metadata


    Labels

    Agents (Issues regarding Mastra's Agent primitive), Tools (Issues with user made tools for Agent tool calling), bug (Something isn't working), discord (For issues created from Discord discussions), effort:medium, impact:high, trio-tb
