Feature/context management auto compaction #80
Open

nicobailon wants to merge 7 commits into Shopify:main from nicobailon:feature/context-management-auto-compaction

+2,051 −1,779
Conversation
This update introduces a comprehensive context management system to handle token limits in long-running workflows. Key features include:

- Automatic monitoring and compaction of conversation transcripts when approaching token limits.
- Configurable strategies for context compaction: truncation and LLM summarization.
- New configuration options for context management, including thresholds and maximum tokens.
- Integration of context management into existing tools, allowing for token limits on outputs.

Additionally, the README and documentation have been updated to reflect these changes, and new tests have been added to ensure functionality and concurrency safety.

Relevant files:

- `README.md`: Added documentation for automatic context management.
- `docs/INSTRUMENTATION.md`: Updated to include context management events.
- `lib/roast/helpers/content_truncator.rb`: New helper for truncating content based on token limits.
- `lib/roast/workflow/context_manager.rb`: New class for managing context and compaction logic.
- `lib/roast/workflow/model_config.rb`: Added model configuration for token limits.
- `lib/roast/workflow/base_workflow.rb`: Integrated context management into the workflow.
- `test/roast/workflow/context_manager_test.rb`: New tests for context management functionality.
- `test/roast/workflow/context_management_integration_test.rb`: Integration tests for context management with tools.
- `test/roast/workflow/context_management_concurrency_test.rb`: Tests for concurrency in context management.
- `test/roast/workflow/model_config_test.rb`: Tests for model configuration related to token limits.
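To make the truncation strategy concrete, here is a minimal sketch of a content truncator in the spirit of `lib/roast/helpers/content_truncator.rb`. The module name, method names, and the 4-characters-per-token ratio are illustrative assumptions, not the PR's actual implementation:

```ruby
# Hypothetical sketch: estimate tokens via a characters-per-token ratio
# and trim the oldest content so the result fits a token budget.
module ContentTruncator
  CHARS_PER_TOKEN = 4 # rough heuristic for English text, an assumption

  def self.estimate_tokens(text)
    (text.length / CHARS_PER_TOKEN.to_f).ceil
  end

  def self.truncate(text, max_tokens:)
    return text if estimate_tokens(text) <= max_tokens

    # Keep the most recent content (the tail) within the token budget,
    # prefixing a marker so readers know earlier content was dropped.
    max_chars = max_tokens * CHARS_PER_TOKEN
    "[truncated]...#{text[-max_chars..]}"
  end
end
```

The tail-preserving cut reflects the "truncation: fast, preserves recent context" behavior described in the summary; a real implementation would likely cut on message boundaries rather than raw characters.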
## Summary

- Implements intelligent context window management to prevent LLM token limit overflows
- Provides two compaction strategies: truncation and LLM summarization
- Adds modern model support with accurate token counting for 2025 models
- Includes comprehensive configuration options and thread safety

## Key Features

- **Automatic monitoring**: Tracks token usage and triggers compaction at configurable thresholds
- **Smart compaction strategies**:
  - Truncation: Fast, preserves recent context
  - LLM summarization: Intelligent summarization of older content
- **Modern model support**: Updated configurations for GPT-4o, Claude 3.5, Gemini 2.0/2.5
- **Tool integration**: Built-in tools automatically respect max_tokens limits
- **Thread safety**: Mutex synchronization for concurrent workflows
- **Flexible configuration**: Extensive options for fine-tuning behavior

## Technical Implementation

- Context management integrated at the workflow level
- Model-specific token counting (tiktoken for OpenAI, character ratios for others)
- Automatic model selection for summarization tasks
- Comprehensive test coverage including concurrency and integration tests
- Enhanced documentation with concrete examples

This feature enables long-running AI workflows to operate reliably within LLM context windows while preserving conversation quality and coherence.
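The threshold-triggered monitoring plus mutex synchronization described above can be sketched as follows. The class and method names are assumptions for illustration, not the PR's actual `ContextManager` API:

```ruby
# Illustrative sketch: track token usage under a mutex and invoke a
# pluggable compaction strategy once usage crosses threshold * max_tokens.
class ContextManager
  def initialize(max_tokens:, threshold: 0.8, &compactor)
    @max_tokens = max_tokens
    @threshold = threshold
    @compactor = compactor # strategy: e.g. truncation or LLM summarization
    @token_count = 0
    @mutex = Mutex.new     # guards state for concurrent workflows
  end

  def track(tokens)
    @mutex.synchronize do
      @token_count += tokens
      # Compact when usage reaches the configurable threshold.
      if @token_count >= @threshold * @max_tokens
        @token_count = @compactor.call(@token_count)
      end
    end
  end

  def token_count
    @mutex.synchronize { @token_count }
  end
end
```

Passing the strategy as a block keeps the manager agnostic to whether compaction truncates or summarizes, which matches the two-strategy design the summary describes.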
Thanks for your contribution! I know the PR is still in draft, but here's some feedback on the current version:
…upport

- Added support for arm64-darwin-24 in Gemfile.lock.
- Introduced a comprehensive context management guide detailing usage and testing of the automatic context management feature.
- Updated helpers and workflow files to integrate new context management functionalities, including token counting and compaction strategies.
- Enhanced tests to cover new context management features and ensure thread safety.
…ity with main branch while preserving the new context management features
- Merged latest upstream changes including new features and improvements
- Resolved conflicts by preserving all context management features
- Maintained compatibility with upstream error handling improvements
- Added new upstream functionality while keeping our enhancements
…or resource handling in workflow_runner.rb

- Updated prompt.md to reflect the correct variable name for resource contents.
- Refactored resource handling in workflow_runner.rb to create a dedicated method for resource management based on file presence.
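A dedicated method that branches on file presence, as this commit describes, might look like the sketch below. The method name and nil-fallback behavior are hypothetical, not the code in workflow_runner.rb:

```ruby
# Illustrative extraction: resolve resource contents based on whether a
# backing file exists, so callers get one well-defined entry point.
def resource_contents(path)
  if path && File.exist?(path)
    File.read(path)
  else
    nil # no file backing this resource; caller falls back to defaults
  end
end
```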
- Added a new section in README.md detailing automatic context management features, including configuration options and strategies.
- Updated setup instructions for testing with OpenRouter, including API key configuration and example workflows.
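The per-model token limits that `lib/roast/workflow/model_config.rb` is said to add could take a shape like this. The lookup structure, names, and fallback are assumptions; the limits shown are the models' publicly documented context-window sizes, not values taken from the PR:

```ruby
# Hypothetical model-to-context-window lookup with a prefix match so
# versioned model names (e.g. dated releases) resolve to a family entry.
MODEL_TOKEN_LIMITS = {
  "gpt-4o" => 128_000,
  "claude-3-5-sonnet" => 200_000,
  "gemini-2.0-flash" => 1_000_000,
}.freeze

DEFAULT_LIMIT = 8_192 # conservative fallback for unknown models

def max_tokens_for(model)
  key = MODEL_TOKEN_LIMITS.keys.find { |prefix| model.start_with?(prefix) }
  key ? MODEL_TOKEN_LIMITS[key] : DEFAULT_LIMIT
end
```

A compaction threshold (say 0.8) would then be applied against the value this lookup returns for the workflow's configured model.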
Add automatic context management for LLM workflows