Feature: History Processors #190
Open
This PR implements a History Processor system in AgentKit. This system provides a flexible and composable way to modify and manage the conversation history that is sent to an LLM during an agent's execution.
The Problem
LLMs are highly sensitive to the context provided in the conversation history. Two primary challenges arise from unmanaged history:
- Tool contamination: `tool_call` and `tool_result` messages left over from previous agents can influence the current agent to call tools it doesn't have access to, or to use tools inappropriately. This was the root cause of "Inference requested a non-existent tool" errors.
- Broken tool pairs: Naive truncation can split `tool_call`/`tool_result` pairs, causing further errors.

The Solution: History Processors
History Processors are a chain of transformations applied to the message history before it's sent to the LLM. They provide a robust, configurable, and extensible way to solve the problems above (open to calling these something else besides processors)
Core Goals:
Relationship to Existing APIs
It's important to note that modifying history before an agent run is already possible using the `onStart` lifecycle hook. For example, a developer could implement filtering logic directly within this hook. While this approach works, it can become verbose and difficult to reuse across multiple agents and networks.

The History Processor system provides a more explicit, declarative, and composable API for the same purpose. It abstracts common patterns like tool filtering and token limiting into reusable classes that can be easily configured and chained together, leading to cleaner and more maintainable code.
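As a minimal sketch of the `onStart` approach, the filtering logic can live in a standalone helper; the commented-out hook wiring below is illustrative and may not match the exact lifecycle signature:

```typescript
// Minimal message shape for illustration; real AgentKit types differ.
type Message = {
  type: "text" | "tool_call" | "tool_result";
  content?: string;
};

// Strip all tool interactions from the history before the agent runs.
const stripToolMessages = (history: Message[]): Message[] =>
  history.filter((m) => m.type !== "tool_call" && m.type !== "tool_result");

// Hypothetical wiring into an agent's onStart lifecycle hook; the hook
// signature shown here is an assumption, not the final API:
//
// const agent = createAgent({
//   name: "support-agent",
//   lifecycle: {
//     onStart: ({ prompt, history = [] }) => ({
//       prompt,
//       history: stripToolMessages(history),
//       stop: false,
//     }),
//   },
// });
```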
2. The `HistoryProcessor` Base Class

The foundation of the system is the `HistoryProcessor` abstract class. All processors extend this class. Processors are applied sequentially using the `applyProcessors` helper function. The output of one processor becomes the input for the next.

3. Core Processors
We should consider supporting two essential core processors for users to pull in from AgentKit directly: `ToolCallFilter` and `TokenLimiter`.
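The base class and chaining helper could be sketched roughly as follows; the class and function names match the proposal, but the signatures are assumptions:

```typescript
// Minimal message shape for illustration; real AgentKit types differ.
type Message = { type: string; content?: string };

// Each processor is a pure transform over the message history.
abstract class HistoryProcessor {
  abstract process(messages: Message[]): Message[];
}

// Applies processors sequentially: the output of one processor becomes
// the input for the next.
function applyProcessors(
  messages: Message[],
  processors: HistoryProcessor[],
): Message[] {
  return processors.reduce((history, p) => p.process(history), messages);
}

// A trivial example processor showing the shape: drop empty messages.
class DropEmptyMessages extends HistoryProcessor {
  process(messages: Message[]): Message[] {
    return messages.filter((m) => (m.content ?? "").length > 0);
  }
}
```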
3.1. `ToolCallFilter`

This processor is designed to selectively remove `tool_call` and `tool_result` messages from the history.

Purpose: To prevent tool contamination and reduce token count by hiding irrelevant tool interactions from the LLM. This does not change the tools an agent has available to it; it only cleans the history.
Configuration:

The constructor accepts an options object with three mutually exclusive modes:

- `exclude: string[]`: Removes only the specified tools.
- `include: string[]`: Keeps only the specified tools, removing all others.
- `persistResults: boolean`: When `true`, instead of silently removing a tool call, it replaces it with a summary message (e.g., "Used search_tool tool").

Usage Examples:
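A hedged sketch of the filter and its usage modes, based on the configuration described above; the implementation details and the `toolName` field are assumptions:

```typescript
// Minimal message shape for illustration; real AgentKit types differ.
type Message = {
  type: "text" | "tool_call" | "tool_result";
  toolName?: string;
  content?: string;
};

class ToolCallFilter {
  constructor(
    private readonly options: {
      exclude?: string[];
      include?: string[];
      persistResults?: boolean;
    } = {},
  ) {}

  process(messages: Message[]): Message[] {
    const { exclude, include, persistResults } = this.options;
    const out: Message[] = [];
    for (const m of messages) {
      if (m.type !== "tool_call" && m.type !== "tool_result") {
        out.push(m);
        continue;
      }
      const name = m.toolName ?? "";
      // Default mode (no options): remove every tool message.
      const remove = exclude
        ? exclude.includes(name)
        : include
          ? !include.includes(name)
          : true;
      if (!remove) {
        out.push(m);
      } else if (persistResults && m.type === "tool_call") {
        // Leave a breadcrumb instead of silently dropping the call.
        out.push({ type: "text", content: `Used ${name} tool` });
      }
    }
    return out;
  }
}

// Usage examples for the three modes:
const removeAllTools = new ToolCallFilter();
const hideDebugTool = new ToolCallFilter({ exclude: ["debug_tool"] });
const keepOnlySearch = new ToolCallFilter({ include: ["search_tool"] });
const summarizeCalls = new ToolCallFilter({ persistResults: true });
```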
3.2. `TokenLimiter`

This processor truncates the history to ensure it fits within a specified token limit.

Purpose: To prevent API errors from exceeding the model's context window and to manage costs.
Key Feature: Segment-based Truncation

A critical design choice in `TokenLimiter` is its handling of tool messages. LLM providers like OpenAI and Anthropic require that a `tool_call` message is always followed by its corresponding `tool_result` message. Simply truncating old messages can break these pairs.

The `TokenLimiter` solves this by grouping messages into "segments". A segment is a set of messages that form a complete interaction (e.g., a `tool_call` with multiple tools and all of their corresponding `tool_result` messages). The limiter then truncates from the oldest complete segments first, ensuring the history remains valid.

Pluggable Tokenizer System:
The `TokenLimiter` features a flexible tokenizer system:

- It uses the `tiktoken` library for token counting if it's installed.
- If `tiktoken` is not available, it gracefully falls back to a fast `ApproximateTokenizer`.

Configuration:
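A hedged sketch of segment-aware truncation and a possible configuration shape; the `limit` and `tokenizer` option names and the chars/4 approximation are assumptions, not the final API:

```typescript
// Minimal message shape for illustration; real AgentKit types differ.
type Message = {
  type: "text" | "tool_call" | "tool_result";
  toolName?: string;
  content?: string;
};

interface Tokenizer {
  count(text: string): number;
}

// Fallback when tiktoken is not installed: a common chars/4 heuristic.
class ApproximateTokenizer implements Tokenizer {
  count(text: string): number {
    return Math.ceil(text.length / 4);
  }
}

class TokenLimiter {
  constructor(
    private readonly options: { limit: number; tokenizer?: Tokenizer },
  ) {}

  process(messages: Message[]): Message[] {
    const tokenizer = this.options.tokenizer ?? new ApproximateTokenizer();
    // Group messages into segments so a tool_call is never separated
    // from its tool_result(s).
    const segments: Message[][] = [];
    for (const m of messages) {
      const last = segments[segments.length - 1];
      if (m.type === "tool_result" && last) last.push(m);
      else segments.push([m]);
    }
    // Drop the oldest complete segments until the history fits.
    const size = (seg: Message[]) =>
      seg.reduce((n, m) => n + tokenizer.count(m.content ?? ""), 0);
    let total = segments.reduce((n, s) => n + size(s), 0);
    while (segments.length > 1 && total > this.options.limit) {
      total -= size(segments.shift()!);
    }
    return segments.flat();
  }
}

// Usage: cap history at ~16k tokens using the fallback tokenizer.
const limiter = new TokenLimiter({ limit: 16000 });
```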
4. API Integration: The Processor Hierarchy
To provide maximum flexibility, processors can be configured at three different levels. They are applied in a specific order, creating a waterfall of policies.
Order of Execution:

1. `HistoryConfig` processors: Applied first. Ideal for global, cross-agent policies that you want bundled in with your history adapter (for persistence).
2. `Network` processors: Applied second. Ideal when you don't want your history adapter to filter messages (keeping the adapter focused on persistence rather than on filtering retrieved messages).
3. `Agent` processors: Applied last. Ideal for filtering out tool calls that you don't want your agent to be polluted with or confused by.

Code Implementation (`agent.ts`):

Example Scenario:
Imagine a multi-tenant customer support application.
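One possible wiring for this scenario; this is a hypothetical configuration fragment in which `createAgent`, `createNetwork`, and the `processors`/`history` option names are assumptions, not the final API:

```typescript
// Hypothetical configuration fragment; option names are illustrative.
const greeterAgent = createAgent({
  name: "greeter",
  system: "You greet customers and route their requests.",
  // Agent processors: applied last, just for this agent.
  processors: [new ToolCallFilter()],
});

const customerSupportNetwork = createNetwork({
  name: "customer-support",
  agents: [greeterAgent],
  // Network processors: applied second.
  processors: [new TokenLimiter({ limit: 16000 })],
  history: {
    // HistoryConfig processors: applied first, bundled with the
    // history adapter alongside persistence.
    processors: [new ToolCallFilter({ exclude: ["debug_tool"] })],
  },
});
```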
In this example, when `greeterAgent` runs within the `customerSupportNetwork`:

1. `ToolCallFilter({ exclude: ['debug_tool'] })` runs first.
2. `TokenLimiter({ limit: 16000, ... })` runs on the output of the first processor.
3. `ToolCallFilter()` runs on the output of the second processor, removing any remaining tool calls just for this agent.

This hierarchical design allows for powerful, reusable, and clearly defined history management policies across an entire AI application.
5. Breaking Changes
There are no breaking changes introduced with the History Processor system. It is a purely additive feature. Existing code that does not define any processors will continue to function exactly as before. The default behavior remains unchanged.