fix(runtime-fallback): defer toast until dispatch succeeds, block restore of exhausted primary #5457
Open
EvangelosMoschou wants to merge 2 commits into
…persist fallback model

Three fixes for code-yeongyu#5435:

Fix 1: Defer toast until dispatch succeeds (fallback-retry-dispatcher.ts)
Toast was shown BEFORE autoRetryWithFallback was called, leading users to see 'Switching to...' even when the fallback dispatch failed or was gated. Now autoRetryWithFallback returns { accepted: boolean } and the toast fires only after a successful dispatch (success toast) or on failure (error toast).

Fix 2: Block restoring exhausted primary (chat-message-handler.ts)
When runtime_fallback switches to a fallback model and the original model exceeds quota, chat.message was restoring the exhausted primary after the 60s cooldown expired. This caused sessions to loop back to the failing model. Now the original model stays in failedModels and is never auto-restored once it has failed.

Fix 3: Persist fallback model on session record (auto-retry-dispatch.ts)
After a successful fallback dispatch, update the OpenCode session record with the new model so the core loop and subsequent streams pick up the fallback provider immediately, reducing the race condition against OpenCode's internal same-model retry.

TDD: 230 existing runtime-fallback tests all pass (no regressions).
- Remove session.update call (not available in type definition)
- Fix callback type in auto-retry.ts to handle Promise<{ accepted: boolean }>
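As a rough illustration of the first fix and the callback-type change listed above, the dispatcher could await the retry callback and choose the toast only from its result. This is a minimal sketch, not the PR's code: only `autoRetryWithFallback` and `{ accepted: boolean }` come from the commits, while `RetryCallback`, `Toast`, `showToast`, `dispatchWithDeferredToast`, and the toast messages are hypothetical names chosen for the example.

```typescript
// Hypothetical sketch. Only autoRetryWithFallback and { accepted: boolean }
// are taken from the PR; every other name here is an assumption.
type DispatchResult = { accepted: boolean };

// Callback type that resolves to the dispatch result instead of void.
type RetryCallback = (sessionID: string, model: string) => Promise<DispatchResult>;

type Toast = { variant: "success" | "error"; message: string };

async function dispatchWithDeferredToast(
  sessionID: string,
  fallbackModel: string,
  autoRetryWithFallback: RetryCallback,
  showToast: (toast: Toast) => void,
): Promise<DispatchResult> {
  let result: DispatchResult;
  try {
    // Dispatch first; no toast has been shown at this point.
    result = await autoRetryWithFallback(sessionID, fallbackModel);
  } catch {
    // In this sketch a thrown error is treated the same as a rejected dispatch.
    result = { accepted: false };
  }

  // The toast is chosen only after the outcome is known.
  showToast(
    result.accepted
      ? { variant: "success", message: `Switched to ${fallbackModel}` }
      : { variant: "error", message: `Could not switch to ${fallbackModel}` },
  );
  return result;
}
```

Returning the result to the caller as well keeps the toast decision and any follow-up logic keyed off the same `accepted` flag.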
Problem

With `runtime_fallback.enabled: true`, when the primary model hits quota errors:

- `chat.message` restores the exhausted primary after the 60s cooldown expires

Full root cause analysis: #5435
Fix 1: Defer toast until dispatch succeeds (`fallback-retry-dispatcher.ts`)

The toast was shown before `autoRetryWithFallback` was called. Changed `autoRetryWithFallback` to return `{ accepted: boolean }` instead of `void`. The dispatcher now shows:

- a success toast after a successful dispatch
- an error toast on failure

Fix 2: Block restoring exhausted primary (`chat-message-handler.ts`)

Added a `!state.failedModels.has(state.originalModel)` check before restoring the primary model. Once a model fails (e.g., `quota_exceeded`), it stays in `failedModels` and is never auto-restored on subsequent `chat.message` calls, even after the 60s cooldown expires. The user must explicitly switch models to restore it.

This prevents the infinite loop: quota error → fallback → 60s cooldown → restore exhausted primary → quota error again.
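The restore guard can be sketched as a small predicate. This is an illustration under assumptions, not the handler's actual code: `failedModels` and `originalModel` are named in the description, but the `FallbackState` shape, the use of a `Set`, the `currentModel` and `cooldownUntil` fields, and the `shouldRestorePrimary` helper are all hypothetical.

```typescript
// Hypothetical state shape. Only failedModels and originalModel appear in
// the PR description; the remaining fields are assumptions for this sketch.
interface FallbackState {
  originalModel: string;
  currentModel: string;
  failedModels: Set<string>; // assumed to be a Set; the real type may differ
  cooldownUntil: number; // epoch milliseconds; hypothetical field
}

// Returns true only when the session is on a fallback, the cooldown has
// expired, and the original model has not been recorded as failed.
function shouldRestorePrimary(state: FallbackState, now: number): boolean {
  const onFallback = state.currentModel !== state.originalModel;
  const cooldownExpired = now >= state.cooldownUntil;
  return (
    onFallback &&
    cooldownExpired &&
    !state.failedModels.has(state.originalModel) // the added guard
  );
}
```

With this guard, an expired cooldown alone is no longer enough to switch back: a primary that is still listed in `failedModels` stays off until something outside the automatic path removes it.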
Fix 3: (Attempted) Persist fallback model on session record

This was attempted via `ctx.client.session.update()`, but that API does not exist on the OpenCode SDK client types, so it was removed from the final patch. The other two fixes are sufficient to break the infinite retry loop.

TDD Evidence
230 existing runtime-fallback tests all pass (no regressions).

- … `failedModels` contains the original model
- … `{ accepted: boolean }`)

Files

- `packages/omo-opencode/src/hooks/runtime-fallback/auto-retry-dispatch.ts` (+10/-2): return `{ accepted: boolean }`
- `packages/omo-opencode/src/hooks/runtime-fallback/fallback-retry-dispatcher.ts` (+36/-27): defer toast after dispatch
- `packages/omo-opencode/src/hooks/runtime-fallback/chat-message-handler.ts` (+1/-0): block restore of failed primary

Fixes #5435