
fix(query): auto-recover from context-overflow errors #1169

Open
0xfandom wants to merge 3 commits into Gitlawb:main from 0xfandom:fix/1105-context-overflow-auto-recover

Conversation

@0xfandom
Contributor

Summary

  • Detect "context window exceeded" assistant messages from all three sources (Anthropic prompt-too-long, OpenAI-shim context_overflow category, Anthropic 500-with-context-keywords) via a new isContextOverflowMessage helper and an apiError: 'context_overflow' tag on the assistant message.
  • Wire a one-shot auto-compact + retry path in the query loop so the conversation recovers automatically instead of surfacing the error and dropping the current task. Covers external builds (no reactiveCompact / contextCollapse) and OpenAI-shim providers like Codex / GPT-5.5 that surface the limit through a 500 rather than the Anthropic PTL path.

Closes #1105.
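The single-predicate design described above can be pictured as a tag check plus a content-prefix fallback. A minimal TypeScript sketch — the message shape and the prefix strings are assumptions drawn from this discussion, not the repo's actual types or matchers:

```typescript
// Hypothetical message shape; the real assistant-message type is richer.
type AssistantMessage = {
  apiError?: string; // 'context_overflow' is set at all three emit sites
  content: string;
};

// Fallback prefixes for older emit sites that never received the tag.
// These strings are illustrative, not the repo's actual matchers.
const OVERFLOW_PREFIXES = [
  'Prompt is too long',                        // Anthropic PTL
  'exceeds the context window of this model',  // OpenAI-shim / Anthropic 500
];

function isContextOverflowMessage(message: AssistantMessage): boolean {
  if (message.apiError === 'context_overflow') return true;
  return OVERFLOW_PREFIXES.some((p) => message.content.startsWith(p));
}
```

All three sources then funnel through this one predicate, so the query-loop recovery branch needs no provider-specific string matching.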

Impact

  • user-facing impact: when a request fails with "exceeds the context window of this model" (Codex / GPT-5.5 today, plus the Anthropic PTL and Anthropic 500-with-context-keywords paths), the loop now silently compacts and retries the same intent. Previously the task halted and the user had to run /compact or /new manually.
  • developer/maintainer impact: new State.hasAttemptedContextOverflowRecovery field carried through every continue site in queryLoop; resets at the same boundaries as hasAttemptedReactiveCompact (new tool round, continuation nudge, token-budget continuation). One-shot per turn; the existing autocompact 3-strike circuit breaker (autoCompact.ts:274) handles deeper recursion if the post-compact retry overflows again. The new branch sits AFTER the existing reactiveCompact / contextCollapse branches, so internal builds keep their existing recovery and only fall through to this path if neither matched.

Testing

  • bun run build
  • bun run smoke — fails on main too (missing optional @orama/orama), not caused by this change.
  • focused tests: bun test src/services/api/errors.test.ts (new), bun test src/services/api/ src/services/compact/ (529 pass / 0 fail), bun test src/__tests__ (51 pass / 0 fail).
  • bun run typecheck — no new errors in the touched files; remaining errors are pre-existing on main.

Notes

  • provider/model path tested: I verified the Anthropic PTL and Anthropic 500-context paths via unit tests; couldn't reproduce the original Codex / GPT-5.5 1M-context overflow against a live account, so the OpenAI-shim path is exercised through the classifier's context_overflow category test only.
  • follow-up work or known limitations:
    • Doesn't implement option 1 from the issue (pre-flight token estimate + compact before send). That's the larger change @gnanam1990 flagged; it requires a request-body token estimator that doesn't exist yet. Option 2 (retry after failure) is what this PR ships.
    • On the post-compact retry the request still goes back through the provider; if a single tool result is so large that even the summary plus that tool result blows the budget, the loop surfaces the second context-overflow without a further retry (one-shot guard). The autocompact circuit breaker exists to stop pathological repeats of that cycle.

0xfandom added 2 commits May 14, 2026 18:12
Tag the three places that surface a 'context window exceeded' assistant
message (Anthropic PTL, OpenAI-shim context_overflow category, Anthropic
500 with context keywords) with apiError: 'context_overflow' and add
isContextOverflowMessage helper. Lets the query-loop recovery branch in
the follow-up commit catch all three via a single predicate instead of
duplicating string matchers, and keeps the content-prefix fallback so
older sites that didn't get the tag are still recognised.

Refs Gitlawb#1105
When a request fails because the conversation exceeds the provider
context window, run a single auto-compact + retry instead of surfacing
the error and stopping the turn. Covers external builds (no
reactiveCompact / contextCollapse compiled in) and OpenAI-shim providers
like Codex / GPT-5.5 that surface the limit through a 500 with
context-overflow keywords rather than the Anthropic prompt-too-long
path.

Withholds the error in the streaming loop (parallel to the existing
prompt-too-long withholding), runs compactConversation, replaces
messagesForQuery with the post-compact summary, and continues the loop.
Gated by hasAttemptedContextOverflowRecovery so a single turn cannot
loop compact -> error -> compact forever, and the autocompact 3-strike
circuit breaker in autoCompact.ts handles deeper recursion if the
post-compact retry overflows again. Resets on each fresh tool round at
the next_turn site so subsequent turns get a clean recovery attempt.

Closes Gitlawb#1105
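The withhold-compact-retry flow this commit describes can be sketched as a loop. Everything here except hasAttemptedContextOverflowRecovery and compactConversation is a hypothetical stand-in, and the real queryLoop streams responses and carries far more state:

```typescript
type Message = { role: 'user' | 'assistant'; content: string; apiError?: string };

// One-shot auto-compact + retry: a simplified model of the recovery branch.
// The guard is local per turn here, mirroring the reset on each fresh tool round.
async function runTurn(
  messages: Message[],
  sendRequest: (msgs: Message[]) => Promise<Message>,
  compactConversation: (msgs: Message[]) => Promise<Message[]>,
): Promise<Message> {
  let hasAttemptedContextOverflowRecovery = false;
  let messagesForQuery = messages;
  for (;;) {
    const reply = await sendRequest(messagesForQuery);
    if (reply.apiError === 'context_overflow' && !hasAttemptedContextOverflowRecovery) {
      // Withhold the error, compact once, and retry the same intent.
      hasAttemptedContextOverflowRecovery = true;
      messagesForQuery = await compactConversation(messagesForQuery);
      continue;
    }
    // Success, or a second overflow after the one-shot guard: surface it.
    return reply;
  }
}
```

A second context_overflow on the retry falls through to the normal error path, which is where the 3-strike circuit breaker in autoCompact.ts takes over.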
gnanam1990 previously approved these changes May 14, 2026
Collaborator

@gnanam1990 left a comment


Local checks on febcf78:

  • bun run build && node dist/cli.mjs --version — ✅ builds, prints 0.10.0 (OpenClaude)
  • bun test src/services/api/errors.test.ts — ✅ 6 pass
  • bun test src/services/compact src/services/api — ✅ 528 pass when isolated (the 1 fail in openaiShim.test.ts reproduces only under multi-file ordering; in isolation: 94/0 — pre-existing test-state-leak, unrelated)
  • tsc --noEmit — no new errors introduced (the 10 pre-existing failures on main remain unchanged)

Code review:

  • The one-shot hasAttemptedContextOverflowRecovery guard correctly mirrors hasAttemptedReactiveCompact — gated at the message-withhold site and the recovery branch, and reset at the same fresh-tool-round sites. No infinite-loop surface.
  • Reusing compactConversation(..., isAutoCompact=true) is a nice call — gets the existing 3-strike circuit breaker for free if the post-compact retry also overflows.
  • The isContextOverflowMessage fallback by content-prefix is a sensible safety net for older emit sites that didn't carry the apiError: 'context_overflow' tag. Tests cover all three sources (PTL, OpenAI-shim, Anthropic-500) plus the rejection cases.
  • No red flags (no tengu_*, no USER_TYPE === 'ant', no new network calls, no new deps, no CI diff).

LGTM.

Collaborator

@jatmn left a comment


Findings

  • [P1] The new overflow recovery path also applies to compact/session-memory forks
    src/query.ts:1300
    This branch currently runs even when querySource is 'compact' or 'session_memory'. Those flows already have specialized oversized-context handling, and the compact worker has its own prompt-too-long retry path. If a compact fork hits a non-PTL context_overflow, this code can re-enter compactConversation() from inside queryLoop using the forked compact prompt as messagesForQuery / forkContextMessages instead of the original oversized conversation. That risks compacting the compact worker's own prompt rather than the real conversation payload, and it bypasses the dedicated compaction retry logic. Please guard this branch the same way the existing oversized-context logic guards compact/session-memory sources, and let those specialized callers handle their own recovery.

Collaborator

@techbrewboss left a comment


Review summary

The context-overflow retry is valuable for the main user loop, but I think this needs one guard before merge: the new recovery path should not run for compact/session-memory fork queries. Those flows already have specialized oversized-context handling, and this branch can recursively invoke compaction from inside the compact worker with the fork prompt rather than the original conversation.

Findings

  • src/query.ts:1300 - Guard context-overflow recovery away from compact/session-memory query sources.
    Impact: isWithheldContextOverflow currently applies regardless of querySource, so a querySource === 'compact' or 'session_memory' fork that hits a non-PTL context overflow will enter this branch and call compactConversation(messagesForQuery, ..., forkContextMessages: messagesForQuery). In those forked flows, messagesForQuery is the compact/session-memory worker prompt, not the original oversized conversation payload. That can compact the worker's prompt, bypass the dedicated compact retry behavior, and produce a misleading post-compact retry instead of letting the specialized caller handle the failure.
    Suggested fix: mirror the existing compact/session-memory exclusion used by the pre-flight blocking-limit path and/or the specialized recovery paths, e.g. require querySource !== 'compact' && querySource !== 'session_memory' before setting isWithheldContextOverflow true.
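The suggested exclusion could take the following shape (the querySource values 'compact' and 'session_memory' come from this discussion; 'user' is a placeholder for any non-fork source, and the real condition in src/query.ts has more clauses):

```typescript
// 'user' is a hypothetical stand-in for non-fork query sources.
type QuerySource = 'user' | 'compact' | 'session_memory';

// Mirror of the pre-flight blocking-limit exclusion: compact and
// session-memory fork workers never enter overflow recovery, so their
// specialized callers keep handling oversized contexts themselves.
function eligibleForOverflowRecovery(
  querySource: QuerySource,
  hasAttemptedContextOverflowRecovery: boolean,
): boolean {
  return (
    !hasAttemptedContextOverflowRecovery &&
    querySource !== 'compact' &&
    querySource !== 'session_memory'
  );
}
```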

Validation

I reviewed the PR diff and checked the surrounding query-loop logic locally. The existing blocking-limit path explicitly skips compact/session-memory sources because those are forked agents that inherit the full conversation and need their dedicated handlers; the new context-overflow recovery branch currently lacks the same guard.

Per @jatmn and @techbrewboss review on Gitlawb#1169: the new
isWithheldContextOverflow branch was running regardless of querySource.
Compact and session_memory forks pass the worker prompt as
messagesForQuery, so recovering here would call compactConversation()
with the worker prompt as forkContextMessages — bypassing the dedicated
compact retry path and producing a misleading post-compact retry of the
worker prompt rather than the real conversation.

Mirror the existing pre-flight blocking-limit exclusion (query.ts:~691)
and let the specialized fork callers handle their own oversized-context
recovery.
@0xfandom
Contributor Author

Pushed 2d21f65 — added the querySource !== 'compact' && querySource !== 'session_memory' guard on isWithheldContextOverflow, mirroring the pre-flight blocking-limit exclusion at src/query.ts:~691. Compact/session-memory forks now fall through to their specialized handlers instead of re-entering compactConversation() with the worker prompt as forkContextMessages.

bun run build → green
bun test src/services/api/errors.test.ts src/services/compact → 14 pass / 0 fail

No new query-loop test added — there's no query.test.ts harness in the repo today and the pre-existing blocking-limit guard at line ~691 also relies on errors/compact unit coverage + manual repro. Happy to add one if you'd prefer, but it'd be the first of its kind.

Thanks @jatmn @techbrewboss for the catch.

Collaborator

@jatmn left a comment


Thanks for following up on the earlier review. The recovery branch now skips compact and session_memory sources, but I found one remaining issue in the companion withhold path.

Findings

  • [P1] Don't withhold context-overflow errors for compact/session-memory forks
    src/query.ts:923
    The recovery branch now has the querySource !== 'compact' && querySource !== 'session_memory' guard, but the streaming withhold condition above it still withholds any isContextOverflowMessage(message) regardless of querySource. For a compact or session-memory fork that returns a context_overflow API error, this hides the message from the stream, then the guarded recovery branch correctly skips it, and the generic API-error early return later exits with reason: 'completed' without yielding the original error. That means the specialized compact/session-memory caller does not get the diagnostic/retry path it needs. Please apply the same query-source exclusion to the withhold condition, or otherwise ensure the skipped recovery path surfaces the original error.

@Vasanthdev2004
Collaborator

Blockers

  1. Withhold path not guarded — The streaming withhold condition at src/query.ts:923 still withholds isContextOverflowMessage(message) regardless of querySource. For compact/session-memory forks, this hides the error from the stream, then the guarded recovery branch skips it, and the generic API-error early return exits without yielding the original error. The specialized caller doesn't get the diagnostic/retry path it needs.

Non-Blocking

  • Contributor has addressed the main recovery branch guard, but the withhold path still needs the same exclusion.
  • No query-loop test harness exists in the repo — relying on unit tests and manual repro.

Looks Good

  • Valuable feature — auto-recovers from context-overflow errors instead of dropping the task
  • One-shot guard prevents infinite loops
  • Reuses existing compactConversation with 3-strike circuit breaker
  • Covers all three overflow sources (Anthropic PTL, OpenAI-shim, Anthropic 500)
  • Good test coverage for the classifier

Verdict: Changes Requested — withhold path needs the same query-source exclusion as the recovery branch.

Collaborator

@techbrewboss left a comment


Review summary

The auto-recovery path is useful and the new context-overflow classifier looks good, but there is still one guard mismatch that should block merge. The recovery branch now skips compact/session-memory fork queries, while the earlier streaming withhold path still hides the same errors for those fork sources.

Findings

  • src/query.ts:922 - Don’t withhold context-overflow errors for compact/session-memory forks.
    Impact: The recovery branch now correctly skips querySource === 'compact' and 'session_memory', but the streaming withhold condition still hides any isContextOverflowMessage(message) before that branch runs. For compact/session-memory forks, that means the original context_overflow API error is withheld, the guarded recovery branch skips it, and the later generic API-error return exits without yielding the diagnostic to the specialized caller.
    Suggested fix: Add the same querySource !== 'compact' && querySource !== 'session_memory' guard to the withhold condition, or otherwise ensure the skipped recovery path surfaces the original error.

Validation

Reviewed the PR metadata, full diff, existing reviews, references/openclaude.md, and the surrounding query-loop logic. I also ran bun test src/services/api/errors.test.ts in a detached PR worktree using the repo’s installed dependencies: 6 pass / 0 fail. No malicious or suspicious behavior found.

@gnanam1990
Collaborator

Confirmed @jatmn's and @techbrewboss's finding against the code — they've independently landed on the same real issue. The recovery branch correctly carries the querySource !== 'compact' && querySource !== 'session_memory' guard, but the streaming withhold condition above it (!hasAttemptedContextOverflowRecovery && isContextOverflowMessage(message)) does not. So for a compact/session-memory fork that returns a context_overflow error: the message is withheld from the stream, the guarded recovery branch correctly skips it, and the generic API-error return then exits with completed without ever yielding the original error — the specialized caller loses its diagnostic/retry path. Applying the same query-source guard to the withhold condition (or surfacing the original error when the guarded recovery is skipped) resolves it. The rest of the auto-recovery design looks sound. Thanks — happy to re-review promptly once that guard is mirrored.



Development

Successfully merging this pull request may close these issues.

Auto-recover from context-window errors after large tool results
