feat(ai): support for server-side compaction#17746

Draft
cdamus wants to merge 3 commits into master from feat/17636-server-side-compaction

Conversation

@cdamus cdamus commented Jun 30, 2026

Contributor

What it does

Fixes #17636.

Adds provider-native (server-side) compaction for AI chat sessions. When a conversation grows past a supporting model's context limit, the provider summarizes older turns on its side so the session keeps working, rather than failing or forcing the user to start over or trim history manually.

Activation is layered, from broad to specific:

  • a global default (on by default), located under AI Features › Chat;
  • a per-provider override (follow the global setting / always on / always off);
  • a per-session override in the session settings dialog, which takes precedence over both.

Whether compaction is available is a model capability: it is currently honored by Anthropic and by OpenAI (Responses API), and providers or models without the capability simply ignore the setting. When compaction occurs it is shown inline in the chat and summarized in the token-usage tooltip (cumulative usage plus a "compacted N×" count), persisted with the session, and replayed on subsequent requests.
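
The layering described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the type names, `foldProviderOverride`, and `isCompactionActive` are hypothetical, and the actual `resolveCompactionDefault` / `resolveServerSideCompaction` signatures may differ.

```typescript
// Hypothetical types for illustration only; not taken from the PR.
type ProviderOverride = 'follow-global' | 'always-on' | 'always-off';

interface CompactionInputs {
    globalDefault: boolean;               // global preference (on by default)
    providerOverride: ProviderOverride;   // per-provider tri-state
    sessionOverride: boolean | undefined; // undefined = inherit from provider/global
    modelSupportsCompaction: boolean;     // model capability
}

// Fold the global preference with the per-provider override.
function foldProviderOverride(globalDefault: boolean, providerOverride: ProviderOverride): boolean {
    switch (providerOverride) {
        case 'always-on': return true;
        case 'always-off': return false;
        default: return globalDefault;
    }
}

// The session override wins over both layers; a model without the
// capability ignores the setting entirely.
function isCompactionActive(inputs: CompactionInputs): boolean {
    if (!inputs.modelSupportsCompaction) {
        return false;
    }
    const fallback = foldProviderOverride(inputs.globalDefault, inputs.providerOverride);
    return inputs.sessionOverride ?? fallback;
}
```

In this sketch, a session override of `true` enables compaction even when the provider is set to always off, while a non-supporting model never compacts regardless of any setting.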

How to test

  1. Configure a supporting model (a recent Anthropic Claude model, or an OpenAI model using the Responses API) and leave the global AI Features › Chat: Server-Side Compaction setting on.
  2. Hold a conversation long enough to pass the model's context threshold. Confirm the session keeps responding, an inline compaction marker appears, and the token-usage tooltip reports a "compacted N×" count.
  3. Change the per-provider override and the per-session override and confirm each takes effect, with the session override winning.
  4. Repeat with a non-supporting model/provider and confirm the setting is ignored and behavior is unchanged.
  5. Reload the workspace and confirm the compaction marker and token usage are restored from the persisted session.

Follow-ups

  • Extend the capability to additional providers as their APIs gain support.
  • Make the compaction threshold user-configurable (the Anthropic default is 150,000 input tokens).

Breaking changes

  • This PR introduces breaking changes and requires careful review. If yes, the breaking changes section in the changelog has been updated.

Attribution

Review checklist

Reminder for reviewers

cdamus added 3 commits June 30, 2026 15:37
Provider-agnostic infrastructure for server-side compaction:

- ai-core: `serverSideCompactionSupport` capability on model metadata
  (propagated to the frontend); `CompactionSettings` carried verbatim on
  the request; `resolveCompactionDefault` (global preference folded with
  the per-provider override) and the capability-gated
  `resolveServerSideCompaction`; the global
  `ai-features.chat.serverSideCompaction` preference; opaque
  `CompactionResponsePart` / `CompactionMessage` marker types.
- ai-chat: persisted `CompactionChatResponseContent` (+ deserializer),
  agent stream-to-content mapping, per-session
  `commonSettings.compaction` copied verbatim onto the request (the
  agent reads no preference).
- ai-chat-ui: inline compaction marker renderer, token-usage tooltip
  (cumulative usage + "compacted Nx"), and the per-session tri-state
  control.
- ai-ide / ai-copilot / ai-ollama: tolerate the compaction marker
  (ignore foreign-provider markers in Chat Completions / Ollama
  conversion).

Signed-off-by: Christian W. Damus <cdamus@eclipsesource.com>

Declare the capability (= useResponseApi), fold the global and
per-provider preferences into the model's default enablement, enable
`context_management` compaction on Responses requests when active,
capture the streamed compaction item, and replay it via transcript
prefix-drop. Chat Completions ignores it.

Signed-off-by: Christian W. Damus <cdamus@eclipsesource.com>

Declare the capability (Opus/Sonnet 4.6+ heuristic), fold the global and
per-provider preferences into the model's default enablement, route
active requests through the Beta Messages API with the
compact-2026-01-12 beta and the compact_20260112 edit, capture the
streamed compaction block, and replay it while keeping surrounding
history. Default path unchanged.

Signed-off-by: Christian W. Damus <cdamus@eclipsesource.com>
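
For readers unfamiliar with what an "Opus/Sonnet 4.6+ heuristic" might look like, here is a rough sketch. It is an assumption, not the code in this PR: the function name `looksLikeCompactionCapable` is hypothetical, and it assumes model IDs of the form `claude-<family>-<major>[-<minor>][-<date>]`.

```typescript
// Hypothetical heuristic; the PR's actual check may be written differently.
// Assumes IDs shaped like "claude-opus-4-6" or "claude-sonnet-4-5-20250929".
function looksLikeCompactionCapable(modelId: string): boolean {
    // Family must be opus or sonnet; major and minor are 1-2 digits so that
    // an 8-digit date suffix is never mistaken for a version component.
    const match = /^claude-(opus|sonnet)-(\d{1,2})(?:-(\d{1,2}))?(?:-|$)/.exec(modelId);
    if (!match) {
        return false;
    }
    const major = Number(match[2]);
    const minor = Number(match[3] ?? '0'); // a missing minor counts as 0
    return major > 4 || (major === 4 && minor >= 6);
}
```

A name-based check like this is brittle by nature, which is one reason a capability lookup against the provider's models endpoint is attractive as a later refinement.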
@cdamus cdamus requested a review from eneufeld June 30, 2026 19:40
@github-project-automation github-project-automation Bot moved this to Waiting on reviewers in PR Backlog Jun 30, 2026
@cdamus cdamus changed the title from "Feat/17636 server side compaction" to "feat(ai): support for server-side compaction" Jun 30, 2026

@eneufeld eneufeld left a comment

Contributor

We should add a follow-up to allow changing the thresholds.
Other than that I have some nitpicks, and then we should merge this.

public serverSideCompactionEnabledByDefault: boolean = false
) { }

get serverSideCompactionSupport(): boolean {

Contributor

We should move this to packages/ai-anthropic/src/node/anthropic-language-models-manager-impl.ts#122 resolveMetadata.
There we might also want to use:
https://platform.claude.com/docs/en/api/typescript/beta/models/list
but reading support from the models endpoint can be done in a follow-up.

}
const betaParams = params as T & Anthropic.Beta.Messages.MessageCreateParams;
betaParams.betas = ['compact-2026-01-12'];
betaParams.context_management = { edits: [{ type: 'compact_20260112' }] };

Contributor

This defaults to 150k.
In a follow-up we should make this configurable.

public serverSideCompactionEnabledByDefault: boolean = false
) { }

get serverSideCompactionSupport(): boolean {

Contributor

Same as for Anthropic: this should be in the models-manager-impl.

export const PREFERENCE_NAME_MAX_RETRIES = 'ai-features.modelSettings.maxRetries';
export const PREFERENCE_NAME_DEFAULT_NOTIFICATION_TYPE = 'ai-features.notifications.default';
export const PREFERENCE_NAME_SKILL_DIRECTORIES = 'ai-features.skills.skillDirectories';
export const PREFERENCE_NAME_SERVER_SIDE_COMPACTION = 'ai-features.chat.serverSideCompaction';

Contributor

This preference is added to the schema below but not to AICoreConfiguration, so unlike its siblings it can't be read through the AICorePreferences proxy. Add [PREFERENCE_NAME_SERVER_SIDE_COMPACTION]: boolean | undefined; to that interface so access stays type-safe and consistent.
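
A minimal sketch of the suggested change, assuming a trimmed-down `AICoreConfiguration` (the real interface has more keys) and a hypothetical reader function standing in for access through the `AICorePreferences` proxy:

```typescript
const PREFERENCE_NAME_SERVER_SIDE_COMPACTION = 'ai-features.chat.serverSideCompaction';

// Trimmed-down stand-in for the real interface; only the new key is shown.
interface AICoreConfiguration {
    [PREFERENCE_NAME_SERVER_SIDE_COMPACTION]: boolean | undefined;
}

// Hypothetical helper: with the key declared on the interface, the lookup
// below is type-checked instead of falling back to an untyped string access.
function readServerSideCompaction(prefs: AICoreConfiguration): boolean {
    // Treating an unset value as enabled is an assumption that mirrors the
    // "on by default" global setting described in the PR.
    return prefs[PREFERENCE_NAME_SERVER_SIDE_COMPACTION] ?? true;
}
```

Because the constant has a string-literal type, it can be used as a computed key in the interface, so a typo in the preference name becomes a compile-time error.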

Labels

None yet

Projects

Status: Waiting on reviewers

Development

Successfully merging this pull request may close these issues.

Support provider-native server-side compaction for AI chat sessions

2 participants