Token Optimization — Context Reduction and LLM-Assisted Compression#12259
Closed
JamesRobert20 wants to merge 47 commits into RooCodeInc:main from
…ows across instances
Remove Roo Code Cloud and Router remnants
Roo to zoo upgrade
Add issue templates
feat: support OAuth 2.1 for streamable-http MCP servers
What it does
Every time a tool like read_file or search_files runs, its full output goes into conversation history and gets re-sent to the primary model on every subsequent API call. A single read_file on a 1,000-line file can add 8,000–15,000 tokens that stay in context for the entire task.
This PR adds an invisible compression layer between tool execution and conversation history. Before a large tool result is stored, a cheap secondary model compresses it into a focused summary. The primary model sees less noise, the task runs cheaper, and the user sees nothing different in the UI.
All users benefit from env details diffing, old tool result truncation, and parallel tool calls. LLM-assisted compression is subscription-only.
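The flow described above (check a per-tool threshold, compress with a secondary model, then store the result) can be sketched roughly as follows. This is a minimal illustration and not the PR's code: the type names, the function signatures, the threshold values, and the fall-back-to-raw behavior on failure are all assumptions, and the compressor callback merely stands in for the secondary model call.

```typescript
// Hypothetical names and values throughout; this is not the PR's actual code.
type ToolName =
  | "read_file"
  | "search_files"
  | "codebase_search"
  | "list_files"
  | "execute_command";

// Illustrative per-tool thresholds in characters (assumed, not the PR's defaults).
type Thresholds = Partial<Record<ToolName, number>>;

const exampleThresholds: Thresholds = {
  read_file: 1500,
  search_files: 2000,
  list_files: 2000,
};

// Stands in for the cheap secondary model; null models a user without a subscription.
type Compressor = (toolName: ToolName, text: string) => Promise<string>;

function shouldCompress(toolName: ToolName, text: string, thresholds: Thresholds): boolean {
  const limit = thresholds[toolName];
  // Tools without a configured threshold are never compressed in this sketch.
  return limit !== undefined && text.length > limit;
}

async function compressAndPush(
  toolName: ToolName,
  text: string,
  thresholds: Thresholds,
  compressor: Compressor | null,
  push: (stored: string) => void,
): Promise<void> {
  if (compressor === null || !shouldCompress(toolName, text, thresholds)) {
    push(text); // small result or no compressor available: store unchanged
    return;
  }
  try {
    push(await compressor(toolName, text));
  } catch {
    push(text); // assumed fallback: a failed compression never drops the tool result
  }
}
```

Passing the compressor in as a nullable callback keeps the tool handlers unaware of whether compression is available, which matches the description of a handler that is null for free users.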
What changed
New files:
- src/core/tools/ToolResultProcessor.ts: shouldCompress() + compress() with configurable per-tool thresholds
- src/core/tools/compressAndPush.ts: wrapper that replaces direct pushToolResult calls in tool handlers
- src/core/tools/resolveCompressionHandler.ts: async subscription check via Zoo Code API (1hr cache per key), returns ZooGatewayApiHandler for subscribers or null for free users
- src/core/tools/ToolResultProcessorConfig.ts: config interface + defaults
- src/core/tools/CompletionPostProcessor.ts: optional reformatting of attempt_completion result text
- src/api/providers/zoo-gateway.ts: routes compression calls to /api/proxy/internal/compress using the user's Zoo Code API key
- src/core/context-management/compressToolResults.ts: truncates old tool results in long conversations
- src/core/environment/environmentDiff.ts: only sends changed env detail sections on turns 2+
Modified files:
- src/core/tools/{ReadFileTool,SearchFilesTool,ListFilesTool,CodebaseSearchTool,ExecuteCommandTool}.ts: use compressAndPushToolResult instead of raw pushToolResult
- src/core/task/Task.ts: async handler init on construction, isSubscriber flag, toolResultProcessorSettings from global state, compressOldToolResults in main loop, env diff tracking
- src/core/webview/webviewMessageHandler.ts: clears subscription cache when zooCodeApiKey changes
- webview-ui/src/components/settings/SettingsView.tsx: zooCodeApiKey input field
- packages/types/src/global-settings.ts: zooCodeApiKey + zooCodeBaseUrl in schema, toolResultProcessorSettings
- src/api/index.ts: taskId optional, toolName added to ApiHandlerCreateMessageMetadata
- src/package.json: zooCodeApiKey / zooCodeBaseUrl VS Code settings contributions
Test coverage:
- resolveCompressionHandler.spec.ts: 11 tests
- ToolResultProcessor.spec.ts: 27 tests
- compressAndPush.spec.ts: 6 tests
- CompletionPostProcessor.spec.ts: 7 tests
- compressToolResults.spec.ts: context compression tests
- environmentDiff.spec.ts: env diff tests
How it works
Compression triggers by tool
- read_file
- search_files / codebase_search
- list_files
- execute_command
All thresholds are user-configurable via toolResultProcessorSettings in extension settings.
Optimizations that apply to everyone
- read_file / list_files results older than N turns replaced with [content omitted]
- parallelToolCalls: true in metadata
Testing this manually
- zoo_sk_...)
- read_file on a large file (> 1,500 chars)
Notes
zoo-gateway.ts has no unit test yet; toolName wiring via metadata is the one gap in test coverage
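For reviewers who want a concrete picture of the old-result truncation listed under "Optimizations that apply to everyone", here is a minimal sketch. The HistoryEntry shape, the function name truncateOldToolResults, and the turn-counting rule are hypothetical; only the tool names and the [content omitted] placeholder come from the description above.

```typescript
// Hypothetical history shape; the extension's real message types will differ.
interface HistoryEntry {
  turn: number; // turn index at which the entry was recorded
  toolName?: string; // set only for tool results
  content: string;
}

const OMITTED = "[content omitted]";

// Only these tools' results are truncated, per the PR description.
const TRUNCATABLE = new Set<string>(["read_file", "list_files"]);

// Returns a new history in which truncatable tool results older than
// maxAgeTurns are replaced by the placeholder. The input is not mutated.
function truncateOldToolResults(
  history: HistoryEntry[],
  currentTurn: number,
  maxAgeTurns: number,
): HistoryEntry[] {
  return history.map((entry) => {
    const isOld = currentTurn - entry.turn > maxAgeTurns;
    const isTruncatable = entry.toolName !== undefined && TRUNCATABLE.has(entry.toolName);
    return isOld && isTruncatable ? { ...entry, content: OMITTED } : entry;
  });
}
```

Returning a new array instead of editing entries in place is a design choice of this sketch: it keeps the untruncated history available to anything that still needs it.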