fix(memory): count tokens for ToolCallBlock, ThinkingBlock, CitableBlock, CitationBlock by citizen204 · Pull Request #22153 · run-llama/llama_index

citizen204 · 2026-06-26T18:54:59Z

Summary

Memory._estimate_token_count() excluded ToolCallBlock from the scanned blocks entirely (it was filtered out alongside CachePoint), and had no counting branch for ThinkingBlock, CitableBlock, or CitationBlock — even though these three types were admitted into the blocks list.

For tool-using agents with substantial tool_kwargs, the FIFO queue accumulates far more real tokens than Memory believes, leaving the history well above token_limit and eventually surfacing as provider-side "prompt is too long" (HTTP 400) errors.
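To illustrate the failure mode described above, here is a minimal sketch of a FIFO trim driven by an estimator that ignores tool-call arguments, compared with one that counts them. `Msg`, `count`, `trim_fifo`, and both estimators are hypothetical stand-ins invented for this example, and the whitespace tokenizer is an assumption; none of this is the actual `Memory` implementation.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Msg:
    # Hypothetical message type: plain text plus optional tool-call kwargs.
    text: str = ""
    tool_kwargs: Dict[str, str] = field(default_factory=dict)


def count(s: str) -> int:
    # Crude whitespace "tokenizer", an assumption for illustration only.
    return len(s.split())


def estimate_ignoring_tools(m: Msg) -> int:
    # Mirrors the reported bug: tool-call payloads contribute nothing.
    return count(m.text)


def estimate_with_tools(m: Msg) -> int:
    # Counts the serialized kwargs as well as the text.
    extra = count(str(m.tool_kwargs)) if m.tool_kwargs else 0
    return count(m.text) + extra


def trim_fifo(history: List[Msg], token_limit: int,
              estimate: Callable[[Msg], int]) -> List[Msg]:
    # Drop the oldest messages until the estimated total fits the limit.
    q = deque(history)
    while q and sum(estimate(m) for m in q) > token_limit:
        q.popleft()
    return list(q)


history = [
    Msg(text="look up the weather"),
    Msg(tool_kwargs={"query": "weather in Paris today please " * 20}),
    Msg(text="thanks"),
]

kept_old = trim_fifo(history, 10, estimate_ignoring_tools)
kept_new = trim_fifo(history, 10, estimate_with_tools)
```

With the tool-blind estimator nothing is evicted, although the retained history is far over the limit once the kwargs are counted. With the tool-aware estimator the queue is trimmed until the estimate fits.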

Fixes #21950

Changes

llama-index-core/llama_index/core/memory/memory.py:
- Remove ToolCallBlock from the CachePoint-only exclusion so it enters the blocks list.
- Add counting branch for ToolCallBlock: tokenize tool_name + str(tool_kwargs).
- Add counting branch for ThinkingBlock: use num_tokens when available, else tokenize content.
- Add counting branch for CitableBlock: tokenize title + source, then recurse into inner content blocks.
- Add counting branch for CitationBlock: tokenize title + source, then count cited_content.
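The counting branches listed above could look roughly like the sketch below. The dataclasses are simplified, hypothetical stand-ins that reuse the block and field names mentioned in this PR (`tool_name`, `tool_kwargs`, `num_tokens`, `content`, `title`, `source`, `cited_content`); they are not the real `llama_index` classes, and `estimate_tokens` is an illustrative name rather than the actual `Memory._estimate_token_count()` body. The type of `cited_content` and the choice to tokenize each field separately are also assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


# Hypothetical, simplified stand-ins for the block types named in the PR.
@dataclass
class CachePoint:
    pass


@dataclass
class TextBlock:
    text: str


@dataclass
class ToolCallBlock:
    tool_name: str
    tool_kwargs: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ThinkingBlock:
    content: Optional[str] = None
    num_tokens: Optional[int] = None


@dataclass
class CitableBlock:
    title: str
    source: str
    content: List[Any] = field(default_factory=list)


@dataclass
class CitationBlock:
    title: str
    source: str
    cited_content: Any = None  # assumed to be a single inner block


def estimate_tokens(blocks: List[Any],
                    tokenize: Callable[[str], List[Any]]) -> int:
    total = 0
    for block in blocks:
        if isinstance(block, CachePoint):
            continue  # the only block type that is still skipped
        if isinstance(block, TextBlock):
            total += len(tokenize(block.text))
        elif isinstance(block, ToolCallBlock):
            # Tool name plus the stringified arguments.
            total += len(tokenize(block.tool_name))
            total += len(tokenize(str(block.tool_kwargs)))
        elif isinstance(block, ThinkingBlock):
            # Prefer the known token count; fall back to the text.
            if block.num_tokens is not None:
                total += block.num_tokens
            elif block.content:
                total += len(tokenize(block.content))
        elif isinstance(block, CitableBlock):
            total += len(tokenize(block.title)) + len(tokenize(block.source))
            total += estimate_tokens(block.content, tokenize)  # recurse
        elif isinstance(block, CitationBlock):
            total += len(tokenize(block.title)) + len(tokenize(block.source))
            if block.cited_content is not None:
                total += estimate_tokens([block.cited_content], tokenize)
    return total
```

Any callable that maps a string to a list of tokens can be passed as `tokenize`; `str.split` is enough for a quick check, for example `estimate_tokens([ToolCallBlock("search", {"q": "a b c"})], str.split)`.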

…ock, CitationBlock

_estimate_token_count excluded ToolCallBlock from the scanned blocks entirely, and had no counting branch for ThinkingBlock, CitableBlock, or CitationBlock even though they were admitted into the list. For tool-using agents with large tool_kwargs this caused the FIFO queue to stay well above the token_limit, surfacing as provider-side "prompt is too long" (HTTP 400) errors.

Changes:
- Remove ToolCallBlock from the CachePoint-only exclusion so it is included in the blocks list.
- Add counting branches for ToolCallBlock (tool_name + kwargs serialized), ThinkingBlock (num_tokens if known, else content text), CitableBlock (title + source + inner content), and CitationBlock (title + source + cited_content).

Fixes run-llama#21950

gautamvarmadatla · 2026-06-27T22:12:00Z

hi! please see #21951 it has the fix already :)

dosubot (bot) added the size:M label (this PR changes 30-99 lines, ignoring generated files) on Jun 26, 2026

citizen204 wants to merge 1 commit into run-llama:main from citizen204:fix-21950-toolcall-token-counting


2 participants
