bug: auto-compact fires every turn for 3P models not in context window table (8k fallback causes infinite loop) #635

@Vasanthdev2004

Description

Summary

Auto-compact fires on every single message when using a 3P provider (OpenAI, Gemini, GitHub, Mistral) with a model not in the context window lookup table, making the CLI unusable.

Root Cause

In src/utils/context.ts, when a 3P provider model isn't found in openaiContextWindows.ts, getContextWindowForModel() falls back to 8,000 tokens:

// Unknown models get a conservative 8k default so auto-compact triggers
// before hitting a hard context_window_exceeded error.
return 8_000
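
A minimal sketch of that lookup path, assuming an exact-then-prefix match against the table (the function name comes from the issue; the table entries and matching logic here are illustrative, not the real file contents):

```typescript
// Sketch of the lookup in src/utils/context.ts. Table entries are
// illustrative; only MiniMax-M2.7 is assumed present, per the issue.
const openaiContextWindows: Record<string, number> = {
  "MiniMax-M2.7": 204_800,
  "gpt-4o": 128_000,
};

function getContextWindowForModel(model: string): number {
  // Exact match first.
  if (model in openaiContextWindows) return openaiContextWindows[model];
  // Prefix match: "gpt-4o-mini" hits "gpt-4o", but
  // "MiniMax-M2.5" does NOT start with "MiniMax-M2.7".
  for (const [known, window] of Object.entries(openaiContextWindows)) {
    if (model.startsWith(known)) return window;
  }
  // Unknown models fall back to the conservative 8k default.
  return 8_000;
}
```

Under these assumptions, `MiniMax-M2.5` misses both the exact and prefix checks and lands on the 8k fallback.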

The auto-compact threshold math in src/services/compact/autoCompact.ts then produces a negative effective context window:

contextWindow = 8,000
reservedTokensForSummary = min(maxOutput, 20,000) = 8,000–20,000
effectiveContext = 8,000 - 8,000 = 0 (or negative)
autoCompactThreshold = 0 - 13,000 = -13,000

Since any message (even 1 token) exceeds -13,000, auto-compact triggers every turn.
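
The arithmetic above can be worked through directly (variable names follow the issue; `maxOutput = 8,000` is an assumption for illustration):

```typescript
// Worked example of the threshold math from autoCompact.ts as
// described in the issue, using the 8k fallback context window.
const contextWindow = 8_000;                                   // fallback
const maxOutput = 8_000;                                       // assumed
const reservedTokensForSummary = Math.min(maxOutput, 20_000);  // 8_000
const effectiveContext = contextWindow - reservedTokensForSummary; // 0
const autoCompactThreshold = effectiveContext - 13_000;            // -13_000

// Any positive token count exceeds a negative threshold, so
// auto-compact fires on every turn.
const usedTokens = 1;
const shouldCompact = usedTokens > autoCompactThreshold;       // true
```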

Affected Models

Any model not in src/utils/model/openaiContextWindows.ts that doesn't prefix-match an existing entry. Confirmed examples:

  • MiniMax-M2.5, MiniMax-M2.5-highspeed, MiniMax-M2.1, MiniMax-M2.1-highspeed — all have 204,800 context per MiniMax docs, but only MiniMax-M2.7 is in the table, and prefix matching can't help (MiniMax-M2.5 doesn't start with MiniMax-M2.7)
  • gemini-3-flash, gemini-3.1-pro (until PR #602, "feat(models): update Gemini model context windows and output limits", merges)
  • Any custom/Ollama/LiteLLM model name that doesn't match the table

Fix Options

Quick fix

Add missing model entries to openaiContextWindows.ts (e.g., all MiniMax variants at 204,800 context / 131,072 max output).
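
The quick fix amounts to adding rows like these. The values come from the MiniMax numbers quoted above; the actual shape of the table in openaiContextWindows.ts may differ:

```typescript
// Hypothetical additions to openaiContextWindows.ts. The entry shape
// is assumed; the context/output values are from the MiniMax docs
// cited in this issue.
const miniMaxEntries: Record<string, { context: number; maxOutput: number }> = {
  "MiniMax-M2.5":           { context: 204_800, maxOutput: 131_072 },
  "MiniMax-M2.5-highspeed": { context: 204_800, maxOutput: 131_072 },
  "MiniMax-M2.1":           { context: 204_800, maxOutput: 131_072 },
  "MiniMax-M2.1-highspeed": { context: 204_800, maxOutput: 131_072 },
};
```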

Real fix

The 8,000 fallback in context.ts is too aggressive. It was intended to prevent context_window_exceeded errors, but it creates a worse problem — infinite compaction loops. Options:

  1. Raise the fallback to 128,000 (reasonable default for modern 3P models)
  2. Add a floor in getEffectiveContextWindowSize() so the result is always ≥ a usable minimum (e.g., Math.max(result, 50_000))
  3. Guard in autoCompact — skip auto-compact when the effective context is ≤ 0 (treat as "unknown, don't auto-compact")

Option 3 is the safest — it prevents the infinite loop without guessing the right context window for unknown models. The error-level and warning-level thresholds can still fire to inform the user.
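
A sketch of the option 3 guard, with hypothetical names; the real autoCompact.ts signatures may differ:

```typescript
// Hypothetical guard for autoCompact.ts: skip auto-compact entirely
// when the effective context window is not positive, instead of
// comparing against a negative threshold.
function shouldAutoCompact(usedTokens: number, effectiveContext: number): boolean {
  // An unknown or tiny context window goes <= 0 after reserving
  // summary tokens; treat that as "unknown, don't auto-compact".
  if (effectiveContext <= 0) return false;
  const autoCompactThreshold = effectiveContext - 13_000;
  return usedTokens > autoCompactThreshold;
}
```

With the 8k fallback this returns false instead of compacting every turn, while a model with a known large window still compacts near its limit.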

Reproduction

  1. Set CLAUDE_CODE_USE_OPENAI=1 (or CLAUDE_CODE_USE_GEMINI=1, etc.)
  2. Configure a model not in the table, e.g., MiniMax-M2.5
  3. Start a conversation
  4. Observe: auto-compact triggers after every message

Environment

  • OpenClaude version: 0.1.8+ (current main)
  • Provider: Any 3P provider with a model not in the context window table
