
Feature/context management auto compaction #80


Open · wants to merge 7 commits into `main`

Conversation

nicobailon

Add automatic context management for LLM workflows

## Summary

- Implements intelligent context window management to prevent LLM token limit overflows
- Provides two compaction strategies: truncation and LLM summarization
- Adds modern model support with accurate token counting for 2025 models
- Includes comprehensive configuration options and thread safety

## Key Features

- **Automatic monitoring**: Tracks token usage and triggers compaction at configurable thresholds
- **Smart compaction strategies**:
  - Truncation: fast, preserves recent context
  - LLM summarization: intelligent summarization of older content
- **Modern model support**: Updated configurations for GPT-4o, Claude 3.5, Gemini 2.0/2.5
- **Tool integration**: Built-in tools automatically respect `max_tokens` limits
- **Thread safety**: Mutex synchronization for concurrent workflows
- **Flexible configuration**: Extensive options for fine-tuning behavior (see the sketch below)
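For illustration, the feature bullets above might translate into options along these lines. This is a sketch only; the option names and defaults are assumptions, not the PR's actual configuration surface:

```ruby
# Illustrative options hash -- names and defaults are assumptions,
# not the PR's actual API.
context_management = {
  enabled:    true,        # monitor token usage automatically
  strategy:   :summarize,  # :truncate (fast) or :summarize (LLM-based)
  threshold:  0.8,         # compact once usage reaches 80% of the window
  max_tokens: 128_000      # context window budget for the target model
}
```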

## Technical Implementation

- Context management integrated at the workflow level
- Model-specific token counting (tiktoken for OpenAI, character ratios for others; sketched below)
- Automatic model selection for summarization tasks
- Comprehensive test coverage including concurrency and integration tests
- Enhanced documentation with concrete examples
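That token-counting pairing could look roughly like this. A minimal sketch, assuming the `tiktoken_ruby` gem and a common 4-characters-per-token heuristic; neither the helper name nor the ratio is taken from this PR, and `gpt-4o` support depends on the gem version:

```ruby
require "tiktoken_ruby"

CHARS_PER_TOKEN = 4.0 # rough heuristic for models without a local tokenizer

def count_tokens(text, model)
  if model.start_with?("gpt")
    # Exact BPE count via tiktoken (supported OpenAI models only)
    Tiktoken.encoding_for_model(model).encode(text).length
  else
    # Character-ratio estimate for Claude, Gemini, etc.
    (text.length / CHARS_PER_TOKEN).ceil
  end
end

count_tokens("How many tokens am I?", "gpt-4o")            # exact count
count_tokens("How many tokens am I?", "claude-3-5-sonnet") # ~6 (estimate)
```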

This feature enables long-running AI workflows to operate reliably within LLM context windows while preserving conversation quality and coherence.

This update introduces a comprehensive context management system to handle token limits in long-running workflows. Key features include:

- Automatic monitoring and compaction of conversation transcripts when approaching token limits.
- Configurable strategies for context compaction: truncation and LLM summarization.
- New configuration options for context management, including thresholds and maximum tokens.
- Integration of context management into existing tools, allowing for token limits on outputs (illustrated below).

Additionally, the README and documentation have been updated to reflect these changes, and new tests have been added to ensure functionality and concurrency safety.
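As a concrete illustration of the tool-output limits, a truncation helper could look roughly like this. The signature and behavior here are assumptions; the real interface lives in `lib/roast/helpers/content_truncator.rb` and may differ:

```ruby
# Hypothetical helper -- signature and behavior are assumptions, not
# the actual interface of lib/roast/helpers/content_truncator.rb.
def truncate_to_tokens(content, max_tokens:, chars_per_token: 4)
  budget = max_tokens * chars_per_token
  return content if content.length <= budget

  "[output truncated]\n" + content[-budget, budget] # keep the most recent output
end
```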

Relevant files:
- `README.md`: Added documentation for automatic context management.
- `docs/INSTRUMENTATION.md`: Updated to include context management events.
- `lib/roast/helpers/content_truncator.rb`: New helper for truncating content based on token limits.
- `lib/roast/workflow/context_manager.rb`: New class for managing context and compaction logic (see the sketch after this list).
- `lib/roast/workflow/model_config.rb`: Added model configuration for token limits.
- `lib/roast/workflow/base_workflow.rb`: Integrated context management into the workflow.
- `test/roast/workflow/context_manager_test.rb`: New tests for context management functionality.
- `test/roast/workflow/context_management_integration_test.rb`: Integration tests for context management with tools.
- `test/roast/workflow/context_management_concurrency_test.rb`: Tests for concurrency in context management.
- `test/roast/workflow/model_config_test.rb`: Tests for model configuration related to token limits.
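To make the moving parts concrete, here is a rough shape the manager's compaction check could take, including the mutex guard mentioned under thread safety. This is a hypothetical sketch: class, method, and option names are assumptions based on the file list above, not the PR's actual code:

```ruby
# Hypothetical sketch -- names are assumptions, not the PR's code.
class ContextManagerSketch
  def initialize(threshold: 0.8, max_tokens: 128_000)
    @threshold  = threshold
    @max_tokens = max_tokens
    @mutex      = Mutex.new # serializes compaction across workflow threads
  end

  # Called after each message is appended to the transcript (an array of strings).
  def after_message(transcript)
    @mutex.synchronize do
      compact!(transcript) if usage_ratio(transcript) >= @threshold
    end
  end

  private

  def usage_ratio(transcript)
    (transcript.join.length / 4.0) / @max_tokens # crude character-ratio estimate
  end

  def compact!(transcript)
    # Truncation strategy: drop the oldest half, preserving recent context.
    transcript.shift(transcript.size / 2)
  end
end
```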
@obie (Contributor) commented May 26, 2025

Thanks for your contribution!

I know the PR is still in draft, but here's some feedback on the current version:

  1. The PR is based on an old branch.
  2. The new files aren't integrated: the context management files aren't being required by the main library.
  3. Tests are failing.
  4. Implementation issues: the model matching algorithm has a bug where it doesn't correctly match partial model names.
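On point 4, a longest-prefix lookup is one common way to make partial names resolve deterministically. A minimal sketch, with made-up table entries and a made-up helper name, not code from `model_config.rb`:

```ruby
# Illustrative longest-prefix lookup -- entries and helper name are
# assumptions for this sketch, not taken from model_config.rb.
MODEL_LIMITS = {
  "gpt-4o"            => 128_000,
  "gpt-4o-mini"       => 128_000,
  "claude-3-5-sonnet" => 200_000
}.freeze

def limit_for(model)
  # Prefer the longest matching prefix so "gpt-4o-mini-2024-07-18"
  # resolves to "gpt-4o-mini" rather than "gpt-4o".
  key = MODEL_LIMITS.keys
                    .select { |k| model.start_with?(k) }
                    .max_by(&:length)
  key && MODEL_LIMITS[key]
end
```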

…upport

- Added support for arm64-darwin-24 in Gemfile.lock.
- Introduced a comprehensive context management guide detailing usage and testing of the automatic context management feature.
- Updated helpers and workflow files to integrate new context management functionalities, including token counting and compaction strategies.
- Enhanced tests to cover new context management features and ensure thread safety.
…ity with main branch while preserving the new context management features
- Merged latest upstream changes including new features and improvements
- Resolved conflicts by preserving all context management features
- Maintained compatibility with upstream error handling improvements
- Added new upstream functionality while keeping our enhancements
@nicobailon nicobailon marked this pull request as ready for review May 27, 2025 03:12
@nicobailon nicobailon marked this pull request as draft May 27, 2025 03:56
…or resource handling in workflow_runner.rb

- Updated prompt.md to reflect the correct variable name for resource contents.
- Refactored resource handling in workflow_runner.rb to create a dedicated method for resource management based on file presence.
@nicobailon nicobailon marked this pull request as ready for review May 27, 2025 04:19
- Added a new section in README.md detailing automatic context management features, including configuration options and strategies.
- Updated setup instructions for testing with OpenRouter, including API key configuration and example workflows.