tedjames (Contributor) commented Jul 2, 2025

This PR implements a History Processor system in AgentKit. This system provides a flexible and composable way to modify and manage the conversation history that is sent to an LLM during an agent's execution.

The Problem

LLMs are highly sensitive to the context provided in the conversation history. Two primary challenges arise from unmanaged history:

  1. Tool Contamination: When the history contains tool_call and tool_result messages from previous agents, the current agent might be influenced to call tools it doesn't have access to, or to use tools inappropriately. This was the root cause of "Inference requested a non-existent tool" errors.
  2. Context Window Exceeded: As a conversation grows, the token count can exceed the model's context window, leading to API errors and high costs. Naively truncating this history can break required message structures (like tool_call/tool_result pairs), causing further errors.

The Solution: History Processors

History Processors are a chain of transformations applied to the message history before it is sent to the LLM. They provide a robust, configurable, and extensible way to solve the problems above. (Open to calling these something other than "processors".)

Core Goals:

  • Correctness: Ensure message history is well-formed and compliant with model API requirements.
  • Control: Give developers fine-grained control over what context the LLM sees.
  • Cost Management: Reduce token usage by filtering irrelevant information and limiting history size.
  • Extensibility: Allow developers to create their own custom processors for tasks like PII redaction or message summarization.

Relationship to Existing APIs

It's important to note that modifying history before an agent run is already possible using the onStart lifecycle hook. For example, a developer could implement filtering logic directly within this hook:

const agentWithHook = createAgent({
  // ...
  lifecycle: {
    onStart: async ({ history, prompt, ...rest }) => {
      // custom filtering logic here
      const filteredHistory = history.filter((msg) => msg.type !== "tool_call");
      return { history: filteredHistory, prompt, ...rest, stop: false };
    },
  },
});

While this approach works, it can become verbose and difficult to reuse across multiple agents and networks.

The History Processor system provides a more explicit, declarative, and composable API for the same purpose. It abstracts common patterns like tool filtering and token limiting into reusable classes that can be easily configured and chained together, leading to cleaner and more maintainable code.


2. The HistoryProcessor Base Class

The foundation of the system is the HistoryProcessor abstract class. All processors extend this class.

// in packages/agent-kit/src/processors.ts
export abstract class HistoryProcessor {
  public readonly name: string;

  constructor(options: { name: string }) {
    this.name = options.name;
  }

  /**
   * Process the messages array. Must be side-effect-free and return a new array.
   */
  abstract process(messages: Message[]): MaybePromise<Message[]>;
}

Processors are applied sequentially using the applyProcessors helper function. The output of one processor becomes the input for the next.
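As a minimal sketch of how such a helper could chain processors (the `Message` type below is a simplified stand-in for illustration, not the real AgentKit type):

```typescript
// Simplified stand-ins for illustration; the real types live in @inngest/agent-kit.
type Message = { type: string; content?: string };
type MaybePromise<T> = T | Promise<T>;

interface Processor {
  name: string;
  process(messages: Message[]): MaybePromise<Message[]>;
}

// Run each processor in order; the output of one becomes the input of the next.
async function applyProcessors(
  messages: Message[],
  processors: Processor[]
): Promise<Message[]> {
  let current = messages;
  for (const processor of processors) {
    current = await processor.process(current);
  }
  return current;
}
```

Because each processor receives the previous processor's output, ordering matters: a filter placed before a token limiter reduces the token count the limiter sees.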


3. Core Processors

We should consider supporting two essential core processors that users can pull in from AgentKit directly: ToolCallFilter and TokenLimiter.

3.1. ToolCallFilter

This processor is designed to selectively remove tool_call and tool_result messages from the history.

Purpose: To prevent tool contamination and reduce token count by hiding irrelevant tool interactions from the LLM. This does not change the tools an agent has available to it; it only cleans the history.

Configuration:

The constructor accepts an options object. The first three settings below are mutually exclusive filtering modes; persistResults is a modifier that can be combined with any of them:

  1. Default (Exclude All): If no options are provided, it removes all tool calls and their corresponding results.
  2. exclude: string[]: Removes only the specified tools.
  3. include: string[]: Keeps only the specified tools, removing all others.
  4. persistResults: boolean: When true, instead of silently removing a tool call, it replaces it with a summary message (e.g., "Used search_tool tool").

Usage Examples:

import { ToolCallFilter } from "@inngest/agent-kit";

// Example 1: Exclude all tool calls (default)
const noToolsFilter = new ToolCallFilter();

// Example 2: Exclude a specific debugging tool
const excludeDebugFilter = new ToolCallFilter({ exclude: ["debug_log"] });

// Example 3: Only allow the 'search' tool to appear in history
const includeSearchFilter = new ToolCallFilter({ include: ["search"] });

// Example 4: Exclude a tool but leave a summary of its use
const summaryFilter = new ToolCallFilter({
  exclude: ["internal_analytics_ping"],
  persistResults: true,
});
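A minimal sketch of the filtering logic behind these modes. The `Message` union below is a hypothetical simplified shape, not the real AgentKit type:

```typescript
// Hypothetical simplified message shape for illustration.
type Message =
  | { type: "text"; content: string }
  | { type: "tool_call"; tools: { name: string }[] }
  | { type: "tool_result"; tool: { name: string } };

interface ToolCallFilterOptions {
  exclude?: string[];
  include?: string[];
  persistResults?: boolean;
}

// Decide whether a tool interaction should be removed from the history.
function shouldRemove(name: string, opts: ToolCallFilterOptions): boolean {
  if (opts.include) return !opts.include.includes(name); // keep only listed tools
  if (opts.exclude) return opts.exclude.includes(name); // drop only listed tools
  return true; // default: remove all tool interactions
}

function filterToolCalls(
  messages: Message[],
  opts: ToolCallFilterOptions = {}
): Message[] {
  const out: Message[] = [];
  for (const msg of messages) {
    if (msg.type === "text") {
      out.push(msg);
      continue;
    }
    const name =
      msg.type === "tool_call" ? msg.tools[0]?.name ?? "" : msg.tool.name;
    if (!shouldRemove(name, opts)) {
      out.push(msg);
    } else if (opts.persistResults && msg.type === "tool_call") {
      // Leave a breadcrumb instead of silently dropping the call.
      out.push({ type: "text", content: `Used ${name} tool` });
    }
  }
  return out;
}
```

Note that plain text messages always pass through untouched; only tool_call and tool_result messages are candidates for removal.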

3.2. TokenLimiter

This processor truncates the history to ensure it fits within a specified token limit.

Purpose: To prevent API errors from exceeding the model's context window and to manage costs.

Key Feature: Segment-based Truncation

A critical design choice in TokenLimiter is its handling of tool messages. LLM providers like OpenAI and Anthropic require that a tool_call message is always followed by its corresponding tool_result message. Simply truncating old messages can break these pairs.

The TokenLimiter solves this by grouping messages into "segments". A segment is a set of messages that form a complete interaction (e.g., a tool_call message, including any parallel tool invocations, together with all of its corresponding tool_result messages). The limiter then truncates the oldest complete segments first, ensuring the history remains valid.
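A sketch of what segment-based truncation could look like, under the assumption that tool_result messages always directly follow the tool_call they answer (the `Message` shape and token counter are simplified stand-ins):

```typescript
// Hypothetical simplified message shape for illustration.
type Message = { type: "text" | "tool_call" | "tool_result"; content: string };

// Group messages into segments so tool_call/tool_result pairs are never split.
// Assumption: tool_result messages directly follow their tool_call.
function toSegments(messages: Message[]): Message[][] {
  const segments: Message[][] = [];
  for (const msg of messages) {
    const last = segments[segments.length - 1];
    if (msg.type === "tool_result" && last) {
      last.push(msg); // keep results attached to the segment with their call
    } else {
      segments.push([msg]);
    }
  }
  return segments;
}

// Drop the oldest complete segments until the history fits the token limit.
function truncateBySegments(
  messages: Message[],
  limit: number,
  countTokens: (m: Message) => number
): Message[] {
  const segments = toSegments(messages);
  const costs = segments.map((seg) =>
    seg.reduce((n, m) => n + countTokens(m), 0)
  );
  let total = costs.reduce((a, b) => a + b, 0);
  let start = 0;
  while (total > limit && start < segments.length - 1) {
    total -= costs[start]; // evict the oldest segment as a whole
    start++;
  }
  return segments.slice(start).flat();
}
```

Because eviction happens a whole segment at a time, a surviving history can never contain a tool_result whose tool_call was truncated away, which is what keeps the output compliant with provider APIs.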

Pluggable Tokenizer System:

The TokenLimiter features a flexible tokenizer system:

  • It uses the tiktoken library for token counting if it's installed.
  • If tiktoken is not available, it gracefully falls back to a fast ApproximateTokenizer.
  • Developers can provide their own custom tokenizer implementation.
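The fallback behavior could be sketched roughly as follows. The ~4-characters-per-token heuristic is a common rule of thumb for English text, not necessarily what the real ApproximateTokenizer uses, and the `Tokenizer` interface here is a simplified assumption:

```typescript
interface Tokenizer {
  count(text: string): number;
}

// Hypothetical fallback using the common "~4 characters per token" heuristic.
const approximateTokenizer: Tokenizer = {
  count: (text: string) => Math.ceil(text.length / 4),
};

// Sketch of the fallback: try to load tiktoken, otherwise approximate.
// The module name is passed as a variable so tiktoken stays an optional
// dependency and is only resolved at runtime.
async function resolveTokenizer(moduleName = "tiktoken"): Promise<Tokenizer> {
  try {
    const mod = await import(moduleName);
    const enc = mod.get_encoding("o200k_base");
    return { count: (text: string) => enc.encode(text).length };
  } catch {
    return approximateTokenizer; // tiktoken not installed: degrade gracefully
  }
}
```

The approximation overestimates for some inputs and underestimates for others, so it trades accuracy for zero dependencies and speed; tiktoken gives exact counts for OpenAI encodings when installed.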

Configuration:

import { TokenLimiter } from "@inngest/agent-kit";

// Example 1: Simple limit (uses tiktoken if available, defaults to 'o200k_base' encoding)
const simpleLimiter = new TokenLimiter(8000);

// Example 2: Specify encoding for better accuracy with a specific model
const gpt4Limiter = new TokenLimiter({
  limit: 16000,
  encoding: "cl100k_base", // For GPT-3.5/4
});

// Example 3: Force the use of the fast approximation tokenizer
const fastLimiter = new TokenLimiter({
  limit: 4000,
  useTiktoken: false,
});

// Example 4: Provide a completely custom tokenizer
const myCustomTokenizer: Tokenizer = {
  /* ... */
};
const customLimiter = new TokenLimiter({
  limit: 8000,
  tokenizer: myCustomTokenizer,
});

4. API Integration: The Processor Hierarchy

To provide maximum flexibility, processors can be configured at three different levels. They are applied in a specific order, creating a waterfall of policies.

Order of Execution:

  1. HistoryConfig Processors: Applied first. Ideal for global, cross-agent policies that you want bundled with your history adapter (for persistence).
  2. Network Processors: Applied second. Ideal when you don't want your history adapter to filter messages, keeping the adapter focused on persistence rather than on filtering retrieved messages.
  3. Agent Processors: Applied last. Ideal for filtering out tool calls that you don't want a specific agent to be polluted or confused by.

Code Implementation (agent.ts):

// agent.ts: lines 229-238
const allProcessors = [
  ...(this.history?.processors || []), // 1. History config processors
  ...(network?.processors || []), // 2. Network processors
  ...(this.processors || []), // 3. Agent processors
];

if (allProcessors.length > 0) {
  history = await applyProcessors(history, allProcessors);
}

Example Scenario:

Imagine a multi-tenant customer support application.

// 1. HistoryConfig Level (Global Policy)
// Applied everywhere. No production history should ever contain debug tool calls.
const globalHistoryConfig: HistoryConfig = {
  // ... db connection details ...
  processors: [new ToolCallFilter({ exclude: ["debug_tool"] })],
};

// 2. Network Level
const customerSupportNetwork = createNetwork({
  name: "Pro Support Network",
  agents: [proSupportAgent, proEscalationAgent],
  history: globalHistoryConfig, // Inherits global policy
  processors: [new TokenLimiter({ limit: 16000, encoding: "cl100k_base" })],
});

// 3. Agent Level (Role-Specific Policy)
// This specific agent is a simple greeter and should not see complex tool history.
const greeterAgent = createAgent({
  name: "Greeter Agent",
  system: "You are a friendly greeter.",
  processors: [
    new ToolCallFilter(), // Exclude ALL tool calls from this agent's view
  ],
});

In this example, when greeterAgent runs within the customerSupportNetwork:

  1. ToolCallFilter({ exclude: ['debug_tool'] }) runs first.
  2. TokenLimiter({ limit: 16000, ... }) runs on the output of the first processor.
  3. ToolCallFilter() runs on the output of the second processor, removing any remaining tool calls just for this agent.

This hierarchical design allows for powerful, reusable, and clearly defined history management policies across an entire AI application.


5. Breaking Changes

There are no breaking changes introduced with the History Processor system. It is a purely additive feature. Existing code that does not define any processors will continue to function exactly as before. The default behavior remains unchanged.


changeset-bot bot commented Jul 2, 2025

🦋 Changeset detected

Latest commit: 620da77

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
  • @inngest/agent-kit: Minor


inngest-release-bot (Contributor) commented

🚀 Snapshot Release (alpha)

The latest changes of this PR are available as alpha on npm (based on the declared changesets):

  • @inngest/agent-kit: 0.10.0-alpha-20250702162718-42b01ef6484a86e0ec1af5eb17a803c1aae6ec4f
