Problem
Within a single processQuery() call, two vectors grow without bound:
- std::vector<json> stepResults: accumulates every step's JSON result
- std::vector<std::pair<std::string, json>> toolCallHistory: tracks every tool call and its arguments
maxSteps caps the number of entries but not their size: each step can produce a large JSON payload (e.g., a tool result with verbose command output), so a 50-step query with large tool outputs could consume significant memory.
Additionally, the conversationHistory_ vector is bounded by maxHistoryMessages (good), but the per-query vectors have no equivalent limit.
Current State
- stepResults: grows with every step, no pruning until processQuery() returns
- toolCallHistory: same unbounded growth, used for loop detection
- Loop detection checks the last 4 entries in toolCallHistory, so it only needs recent history, not all history
- No memory budget or cap on total payload size
Proposed Solution
Bounded step results
AgentConfig config;
config.maxStepResultSize = 8192; // truncate individual tool results exceeding this size (bytes)
Large tool outputs are truncated with a "[truncated]" marker before being stored in stepResults. The full output is still passed to the LLM context (up to contextSize), but the stored results are bounded.
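A minimal sketch of the truncation step described above. The helper name truncateStepResult is hypothetical, and the sketch assumes the tool result has already been serialized to a std::string (the real code stores json values). It also assumes the marker counts toward the byte budget, which the proposal does not specify.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: cap a serialized tool result at maxBytes before it is
// stored in stepResults. The "[truncated]" marker is counted inside the budget,
// so the returned string never exceeds maxBytes.
inline std::string truncateStepResult(const std::string& serialized,
                                      std::size_t maxBytes) {
    static const std::string kMarker = "[truncated]";
    if (serialized.size() <= maxBytes) {
        return serialized;  // small enough, store unchanged
    }
    if (maxBytes <= kMarker.size()) {
        // Degenerate budget: only (part of) the marker fits.
        return kMarker.substr(0, maxBytes);
    }
    // Keep the head of the payload and append the marker.
    return serialized.substr(0, maxBytes - kMarker.size()) + kMarker;
}
```

Two caveats with this approach: cutting at a byte offset can split a multi-byte UTF-8 sequence, and a truncated payload is no longer valid JSON, so it would likely need to be stored as a plain string value rather than re-parsed.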
Sliding window for tool call history
Loop detection only needs the last N tool calls. Replace the unbounded vector with a circular buffer or sliding window:
// Only keep last 10 tool calls for loop detection (currently checks last 4)
config.toolCallHistorySize = 10;
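One way the sliding window could look, sketched with std::deque. The class name ToolCallWindow and its methods are hypothetical, arguments are kept as serialized strings to avoid a JSON dependency, and the "last n calls are identical" check is only an illustrative stand-in for the actual loop-detection logic, which is not shown in this issue.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical bounded history: keeps only the most recent `capacity` tool calls.
class ToolCallWindow {
public:
    explicit ToolCallWindow(std::size_t capacity) : capacity_(capacity) {}

    // Record a call, evicting the oldest entry once the window is full.
    void record(std::string toolName, std::string serializedArgs) {
        if (capacity_ == 0) return;
        if (calls_.size() == capacity_) {
            calls_.pop_front();
        }
        calls_.emplace_back(std::move(toolName), std::move(serializedArgs));
    }

    // True when the last `n` recorded calls are identical (same tool, same
    // arguments). Illustrative check only; real loop detection may differ.
    bool lastCallsIdentical(std::size_t n) const {
        if (n == 0 || calls_.size() < n) return false;
        const auto& newest = calls_.back();
        for (std::size_t i = calls_.size() - n; i < calls_.size(); ++i) {
            if (calls_[i] != newest) return false;
        }
        return true;
    }

    std::size_t size() const { return calls_.size(); }

private:
    std::size_t capacity_;
    std::deque<std::pair<std::string, std::string>> calls_;
};
```

With a capacity of 10 and a check over the last 4 entries, memory held by the history stays proportional to the window size regardless of how many steps the query runs. A fixed-size ring buffer would avoid the deque's allocations, at the cost of slightly more index bookkeeping.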
Memory tracking (optional)
// Approximate memory used by the current query (member function of the agent class)
size_t currentQueryMemoryUsage() const;
Sums the sizes of stepResults, toolCallHistory, and active conversation messages. Useful for monitoring in long-running agents.
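The summation could be approximated as below. This is a sketch under assumptions: the free function approximateQueryMemoryUsage is hypothetical, and all three collections are represented as serialized strings, whereas the real members hold json values and message objects.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical estimate: sums the byte lengths of the stored payloads.
// Allocator overhead and container bookkeeping are ignored, so this
// underestimates real usage.
inline std::size_t approximateQueryMemoryUsage(
    const std::vector<std::string>& stepResults,
    const std::vector<std::pair<std::string, std::string>>& toolCallHistory,
    const std::vector<std::string>& conversationMessages) {
    std::size_t total = 0;
    for (const auto& result : stepResults) {
        total += result.size();
    }
    for (const auto& call : toolCallHistory) {
        total += call.first.size() + call.second.size();  // tool name + arguments
    }
    for (const auto& message : conversationMessages) {
        total += message.size();
    }
    return total;
}
```

Because the figure is only approximate, it is better suited to monitoring and logging than to enforcing a hard limit.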
Deliverables
| Item | Description |
|---|---|
| maxStepResultSize config | Truncate oversized tool results before storing in step history |
| Sliding window for loop detection | Bound toolCallHistory to last N entries |
| Memory usage query | Optional method to report approximate memory used by current query |
| Documentation | Document memory management in custom-agent guide |