Stop automatic memory compaction from firing repeatedly and from running while results are being gathered #5458
Open
andrescera wants to merge 2 commits into
Introduce CompactionConfigSchema (preemptive_threshold, cooldown_ms, enabled) with behavior-preserving defaults (0.78 / 60000ms / true), wire it into the root config schema, and regenerate the JSON schema.
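A minimal sketch of what such a config block could look like, written in plain TypeScript rather than whatever schema library the project uses. The three field names and their defaults come from the commit message; the helper name `resolveCompactionConfig` and the rule that `cooldown_ms` must be non-negative are assumptions made for illustration.

```typescript
// Hypothetical stand-in for CompactionConfigSchema; not the PR's actual code.
interface CompactionConfig {
  preemptive_threshold: number; // usage ratio, expected in [0, 1]
  cooldown_ms: number;          // minimum gap between compactions
  enabled: boolean;
}

// Defaults taken from the commit message (0.78 / 60000ms / true).
const DEFAULT_COMPACTION_CONFIG: CompactionConfig = {
  preemptive_threshold: 0.78,
  cooldown_ms: 60000,
  enabled: true,
};

// Fill missing fields with defaults, then reject out-of-range values.
function resolveCompactionConfig(input: Partial<CompactionConfig> = {}): CompactionConfig {
  const config: CompactionConfig = {
    preemptive_threshold: input.preemptive_threshold ?? DEFAULT_COMPACTION_CONFIG.preemptive_threshold,
    cooldown_ms: input.cooldown_ms ?? DEFAULT_COMPACTION_CONFIG.cooldown_ms,
    enabled: input.enabled ?? DEFAULT_COMPACTION_CONFIG.enabled,
  };
  if (!(config.preemptive_threshold >= 0 && config.preemptive_threshold <= 1)) {
    throw new RangeError("preemptive_threshold must be between 0 and 1");
  }
  // Assumption: a negative or non-finite cooldown is treated as invalid.
  if (!Number.isFinite(config.cooldown_ms) || config.cooldown_ms < 0) {
    throw new RangeError("cooldown_ms must be a non-negative number");
  }
  return config;
}
```

Calling `resolveCompactionConfig()` with no argument yields the defaults, which is what makes the change behavior-preserving for users who never set the block.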
…ing result collection Re-arm a compacted session only after its usage ratio drops below (threshold - REARM_MARGIN) instead of on every assistant message, skip compaction while background_output results are being collected, and read the threshold/cooldown from the new compaction config.
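The re-arm rule is a form of hysteresis, which can be sketched as a small gate. This is an illustration only: `createCompactionGate` and `observe` are hypothetical names, and the value of `REARM_MARGIN` is not stated in the commit message, so `0.05` below is an assumed placeholder.

```typescript
// Assumed placeholder; the PR names REARM_MARGIN but its value is not shown here.
const REARM_MARGIN = 0.05;

interface CompactionGate {
  // Returns true when compaction should run for this observation.
  observe(usageRatio: number, collectingResults: boolean): boolean;
}

function createCompactionGate(threshold: number, margin: number = REARM_MARGIN): CompactionGate {
  let armed = true; // a fresh session may compact as soon as it crosses the threshold
  return {
    observe(usageRatio: number, collectingResults: boolean): boolean {
      // Re-arm only once usage has dropped clearly below the threshold,
      // not merely because another assistant message arrived.
      if (!armed && usageRatio < threshold - margin) {
        armed = true;
      }
      // Hold off while background results are still being collected.
      if (collectingResults) {
        return false;
      }
      if (armed && usageRatio >= threshold) {
        armed = false; // stay disarmed until usage comes back down
        return true;
      }
      return false;
    },
  };
}
```

With a threshold of 0.78, a session sitting at 0.80 fires once and then stays quiet on later turns until its ratio falls below 0.73; under the old behavior it would have been eligible again on the next assistant message.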
The problem. To keep long sessions healthy, the assistant automatically condenses its memory when it gets close to full. In practice this kept happening over and over: it would condense, work for a moment, and condense again almost immediately — and it could even condense in the middle of gathering results from background work, throwing away the very information it had just collected. The point at which condensing kicks in was also fixed in the code with no way to adjust it.
When it started. This automatic condensing was introduced in v3.2.2 (commit 62e1687) and has behaved this way since.
The solution. Condensing now happens once when memory genuinely gets close to full, and does not trigger again until memory has actually come back down, so it no longer repeats every turn. It also holds off while the assistant is still collecting results from background work, so freshly gathered information is not discarded. Finally, the threshold and the cooldown are now configurable, with defaults that match today's behavior, for anyone who wants to tune when condensing happens.
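One way to picture how these conditions combine is a single decision function that reports why compaction was skipped. This is a sketch under assumptions, not the PR's code: the `SessionSnapshot` shape, the `decideCompaction` name, the reason strings, and the order of the checks are all hypothetical; only the three config fields come from the PR.

```typescript
// Config field names follow the PR; everything else here is assumed.
interface CompactionConfig {
  preemptive_threshold: number;
  cooldown_ms: number;
  enabled: boolean;
}

interface SessionSnapshot {
  usageRatio: number;             // fraction of memory currently in use
  armed: boolean;                 // false after a compaction until usage drops again
  collectingResults: boolean;     // true while background results are being gathered
  lastCompactedAt: number | null; // epoch ms of the last compaction, if any
}

type SkipReason = "disabled" | "collecting_results" | "cooldown" | "not_armed" | "below_threshold";
type Decision = { compact: true } | { compact: false; reason: SkipReason };

function decideCompaction(config: CompactionConfig, s: SessionSnapshot, now: number): Decision {
  const skip = (reason: SkipReason): Decision => ({ compact: false, reason });
  if (!config.enabled) return skip("disabled");
  if (s.collectingResults) return skip("collecting_results");
  if (s.lastCompactedAt !== null && now - s.lastCompactedAt < config.cooldown_ms) {
    return skip("cooldown");
  }
  if (!s.armed) return skip("not_armed");
  if (s.usageRatio < config.preemptive_threshold) return skip("below_threshold");
  return { compact: true };
}
```

Returning a reason rather than a bare boolean makes it easier to log or test which guard held compaction back on a given turn.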
Summary by cubic
Prevents auto memory compaction from re-firing each turn and from running while background results are being collected. Adds a configurable compaction block (threshold, cooldown, enabled) with defaults that keep current behavior.
Bug Fixes
Skips compaction while collecting background_output so fresh results aren't discarded.
New Features
compaction config: preemptive_threshold (0–1, default 0.78), cooldown_ms (default 60000), enabled (default true).
Written for commit 5c36808. Summary will update on new commits.