↓ dedup_tool_results() # Deduplicate within merged messages
+↓ reduce_context_with_budget() # Budget-aware trimming (if over threshold)
↓
Vec<LLMMessage> # Provider-neutral messages
↓
@@ -177,6 +178,22 @@ The codebase uses **three layers** to prevent invalid message sequences:
2. **Pre-API sanitization** (`sanitize_tool_results`): Dedup and remove orphans from `Vec<ChatMessage>` before every API call
3. **Context manager** (`task_board_context_manager.rs`): Merge consecutive same-role messages and dedup tool_results in the `reduce_context()` pipeline
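
The merge and dedup steps named in the layers above can be sketched as follows. This is a minimal illustration, not the code from `task_board_context_manager.rs`: the `Msg` and `Block` types are hypothetical stand-ins for the real message types, `merge_same_role` is an invented helper name, and "keep the first result per `tool_call_id`" is an assumed dedup policy.

```rust
use std::collections::HashSet;

// Hypothetical, simplified stand-ins for the real message types (assumption).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Block {
    Text(String),
    ToolResult { tool_call_id: String, content: String },
}

#[derive(Debug, Clone, PartialEq)]
struct Msg {
    role: String,
    blocks: Vec<Block>,
}

/// Merge consecutive messages that share a role into a single message.
fn merge_same_role(messages: Vec<Msg>) -> Vec<Msg> {
    let mut out: Vec<Msg> = Vec::new();
    for msg in messages {
        if let Some(prev) = out.last_mut() {
            if prev.role == msg.role {
                prev.blocks.extend(msg.blocks);
                continue;
            }
        }
        out.push(msg);
    }
    out
}

/// Within each message, keep only the first tool result per tool_call_id
/// (assumed policy; the real dedup rule may differ).
fn dedup_tool_results(messages: &mut [Msg]) {
    for msg in messages.iter_mut() {
        let mut seen: HashSet<String> = HashSet::new();
        msg.blocks.retain(|block| match block {
            // `insert` returns false when the id was already seen.
            Block::ToolResult { tool_call_id, .. } => seen.insert(tool_call_id.clone()),
            Block::Text(_) => true,
        });
    }
}
```

Running the merge before the dedup matters in this sketch: duplicates that start out in separate same-role messages only become visible once those messages have been merged.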

+### Context Trimming with Cache Preservation
+
+Long sessions accumulate messages that approach the context window limit. The `TaskBoardContextManager` implements budget-aware trimming:
+
+1. **Lazy trimming**: Only triggers when estimated tokens exceed `context_window × threshold` (default 80%)
+2. **Stable prefix**: Trimmed messages are replaced with `[trimmed]` placeholders, preserving message structure (roles, tool_call_ids) for API validity
+3. **Cache-friendly**: The trimmed prefix produces identical output across turns, so Anthropic's prompt cache stays valid
+4. **Metadata persistence**: Trimming state (`trimmed_up_to_message_index`) is stored in `CheckpointState.metadata` and flows through:
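
A minimal sketch of how items 1 to 3 could fit together, under stated assumptions. The `TrimMsg` type, the `estimate_tokens` heuristic (roughly four characters per token), the `keep_recent` parameter, and the `trim_with_budget` function name are all hypothetical. Only `trimmed_up_to_message_index`, the `[trimmed]` placeholder, and the `context_window × threshold` rule come from the description above.

```rust
// Hypothetical, simplified message type (assumption, not the real struct).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct TrimMsg {
    role: String,
    tool_call_id: Option<String>,
    content: String,
}

/// Rough estimate: about 4 characters per token (assumed heuristic, not a tokenizer).
fn estimate_tokens(messages: &[TrimMsg]) -> usize {
    messages.iter().map(|m| m.content.len() / 4 + 1).sum()
}

/// Trim the oldest messages in place once the estimate exceeds
/// `context_window * threshold`. Returns the new value for
/// `trimmed_up_to_message_index` (exclusive upper bound of trimmed messages).
fn trim_with_budget(
    messages: &mut [TrimMsg],
    context_window: usize,
    threshold: f64,
    trimmed_up_to: usize,
    keep_recent: usize, // hypothetical: number of newest messages never trimmed
) -> usize {
    let mut cut = trimmed_up_to.min(messages.len());
    // Re-apply earlier trimming first so the prefix is identical across turns.
    for m in messages[..cut].iter_mut() {
        m.content = "[trimmed]".to_string();
    }
    let budget = (context_window as f64 * threshold) as usize;
    let limit = messages.len().saturating_sub(keep_recent);
    // Lazy: the loop body never runs while the estimate is within budget.
    while estimate_tokens(messages) > budget && cut < limit {
        // Only the content is replaced; role and tool_call_id are preserved.
        messages[cut].content = "[trimmed]".to_string();
        cut += 1;
    }
    cut
}
```

The caller would persist the returned index and pass it back on the next turn. In this sketch the prefix stays unchanged on any turn where the index does not advance; a turn that advances it rewrites the messages at the new cut point, so a real implementation may prefer to advance the index in larger, less frequent steps.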