
feat: add runtime model fallback policy #52

Open
StartupBros wants to merge 4 commits into nicobailon:main from StartupBros:feat/model-fallback-policy

@StartupBros StartupBros commented Mar 12, 2026

Summary

  • add a shared runtime-model-fallback.ts policy module for candidate ordering, normalization, conservative failure classification, and session-scoped cooldowns
  • unify sync single, sync chain, sync parallel, async single, and async chain around the same fallback behavior and observability fields
  • fix async single-agent background launches so runtime model overrides are preserved and surfaced in status/log/result artifacts

Why

Today, delegated runs resolve a model once and then either succeed or fail. That one-shot resolution has drifted across execution paths and has previously caused parity regressions around model overrides and thinking-level handling in async mode.

This PR makes model choice an execution policy instead of a one-shot resolution, while staying intentionally narrow for v1:

  • explicit override → agent model → current session model → fallbackModels
  • retry only classified runtime/provider failures
  • fail fast on deterministic task/path/input errors
  • keep child pi invocations on --models
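
The candidate order above can be sketched roughly as follows. This is a minimal illustration, not the code in runtime-model-fallback.ts: the `CandidateSources` shape and the `buildCandidates` name are hypothetical, and the model ids in the usage note are placeholders.

```typescript
// Hypothetical input shape; the real types in the PR are not shown here.
interface CandidateSources {
  explicitOverride?: string;
  agentModel?: string;
  sessionModel?: string;
  fallbackModels?: string[];
}

// Order candidates as: explicit override, agent model, session model,
// then fallbackModels. Empty entries are dropped and duplicates keep
// their first (highest-priority) position.
function buildCandidates(sources: CandidateSources): string[] {
  const ordered: Array<string | undefined> = [
    sources.explicitOverride,
    sources.agentModel,
    sources.sessionModel,
    ...(sources.fallbackModels ?? []),
  ];
  const seen = new Set<string>();
  const result: string[] = [];
  for (const model of ordered) {
    const trimmed = model?.trim();
    if (!trimmed || seen.has(trimmed)) continue;
    seen.add(trimmed);
    result.push(trimmed);
  }
  return result;
}
```

For example, an agent model of `provider-a/model-x`, a session model of `provider-b/model-y`, and a fallback list that repeats `provider-a/model-x` would yield two candidates, with the agent model first.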

Implementation Notes

  • Added runtime-model-fallback.ts for:
    • candidate construction + dedupe
    • bare model normalization via registry metadata
    • retryable vs deterministic failure classification
    • session-scoped file-backed cooldowns
    • attempt/fallback summaries
  • Extended sync and async result/status types with:
    • requestedModel
    • finalModel
    • modelAttempts
    • fallbackSummary / lastFallbackReason
  • Routed execution.ts, chain-execution.ts, async-execution.ts, and subagent-runner.ts through the same policy context
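
To make the new observability fields concrete, here is a rough sketch of a fallback loop that fills them in. It assumes a keyword-based classifier and a simplified attempt record; `classifyFailure`, `runWithFallback`, and the `ModelAttempt` shape are hypothetical stand-ins, and only the field names `requestedModel`, `finalModel`, `modelAttempts`, and `lastFallbackReason` come from this PR.

```typescript
type FailureKind = "retryable" | "deterministic";

// Hypothetical attempt record; the PR's real modelAttempts entries may differ.
interface ModelAttempt {
  model: string;
  success: boolean;
  failureKind?: FailureKind;
  error?: string;
}

interface FallbackResult<T> {
  requestedModel?: string | undefined;
  finalModel?: string;
  modelAttempts: ModelAttempt[];
  lastFallbackReason?: string;
  value?: T;
}

// Conservative assumption: only messages that look like provider/runtime
// trouble are retryable; everything else is treated as deterministic.
function classifyFailure(message: string): FailureKind {
  return /rate limit|overloaded|timeout|unavailable/i.test(message)
    ? "retryable"
    : "deterministic";
}

async function runWithFallback<T>(
  candidates: string[],
  run: (model: string) => Promise<T>,
): Promise<FallbackResult<T>> {
  const result: FallbackResult<T> = {
    requestedModel: candidates[0],
    modelAttempts: [],
  };
  for (const model of candidates) {
    try {
      result.value = await run(model);
      result.finalModel = model;
      result.modelAttempts.push({ model, success: true });
      return result;
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      const failureKind = classifyFailure(error);
      result.modelAttempts.push({ model, success: false, failureKind, error });
      // Deterministic failures stop immediately; retryable ones advance.
      if (failureKind === "deterministic") break;
      result.lastFallbackReason = error;
    }
  }
  return result;
}
```

Under these assumptions, a run that fails on the first candidate with a rate-limit error and succeeds on the second would record two attempts, with `finalModel` set to the second candidate.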

Testing

  • pnpm test:all
  • Live Pi smoke tests after switching local runtime to this branch:
    • async single with explicit model override preserved in status.json
    • async single fallback from failing Anthropic agent model to session OpenAI model
    • async chain showed per-step attempt history and step-local fallback behavior
    • sync single and sync chain completed successfully under the shared fallback policy

Post-Deploy Monitoring & Validation

  • What to monitor/search
    • Logs:
      • Inspect async run artifacts under /tmp/pi-async-subagent-runs/<id>/
      • Search for fallback: messages in run logs
    • Metrics/Dashboards:
      • No centralized production dashboard for this OSS extension
  • Validation checks (queries/commands)
    • Run a background single-agent task and inspect:
      • subagent_status({ id: "<id>" })
      • /tmp/pi-async-subagent-runs/<id>/status.json
    • Verify fields:
      • requestedModel
      • finalModel
      • modelAttempts
      • lastFallbackReason
    • Run a chain where a later step uses an agent with a known-bad provider/model on the current machine and verify earlier successful steps are not replayed
  • Expected healthy behavior
    • retryable provider/runtime failures advance to the next candidate
    • deterministic task/path/input failures stop immediately
    • sync/async artifacts agree on attempt history and final model choice
  • Failure signal(s) / rollback trigger
    • async single drops runtime model overrides
    • fallback triggers for deterministic task/tool mistakes
    • chain retries replay already completed earlier steps
    • status/log/result artifacts disagree on model attempts
  • Validation window & owner
    • Window: first local dogfood session after install/update
    • Owner: maintainer/reviewer validating the extension locally
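
The field checks above could be scripted along these lines. This sketch works on an already-parsed status object (for example, the contents of status.json) and assumes all four fields are present on every run; a run with no fallback might legitimately omit `lastFallbackReason`. The helper names `missingFallbackFields` and `artifactsAgree` are hypothetical.

```typescript
const REQUIRED_FIELDS = [
  "requestedModel",
  "finalModel",
  "modelAttempts",
  "lastFallbackReason",
] as const;

type StatusLike = Record<string, unknown>;

// Return the names of any expected fallback fields missing from a status object.
function missingFallbackFields(status: StatusLike): string[] {
  return REQUIRED_FIELDS.filter((field) => !(field in status));
}

// Check that two artifacts (e.g. status and result) report the same
// attempt history and final model. Uses a JSON comparison, so it is
// sensitive to key order inside the attempt records.
function artifactsAgree(a: StatusLike, b: StatusLike): boolean {
  return (
    a["finalModel"] === b["finalModel"] &&
    JSON.stringify(a["modelAttempts"]) === JSON.stringify(b["modelAttempts"])
  );
}
```

An empty array from `missingFallbackFields` and `true` from `artifactsAgree` would correspond to the healthy behavior listed above; a missing field or a mismatch would match the rollback triggers.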

Compound Engineered


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c00ebec772


const expiresAt = now + cooldownMinutes * 60 * 1000;
if (classification.cooldownScope === "provider") {
  const provider = getProviderFromModel(candidate.normalizedModel ?? candidate.model);
  if (!provider) return;


P1: Return the current store when provider parsing fails

When a retryable provider-scoped failure occurs on a bare model ID (for example gpt-4.1 without a provider prefix), updateCooldownStore hits if (!provider) return; and returns undefined. executeWithRuntimeModelFallback then stores that into cooldownStore and calls getCooldownSkipReason(next, cooldownStore) for remaining candidates, which crashes with Cannot read properties of undefined (reading 'models'). This converts a recoverable fallback path into a runtime exception whenever additional candidates are present.


@StartupBros
Author

Addressed the Codex review note about provider-scoped cooldowns on bare model ids. It was a valid edge case: could return when provider parsing failed, which could poison the in-memory cooldown store during fallback iteration. Fixed in by returning the current store instead, and added a regression test covering that path.

@StartupBros
Author

Correction to my previous note: the valid edge case was in updateCooldownStore() returning undefined when provider parsing failed for a provider-scoped cooldown on a bare model id. Fixed in 6ca81da by returning the current store instead, and added a regression test for that path.
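
A minimal sketch of the fix described here, for readers following along. It is not the code from 6ca81da: the `CooldownStore` shape (beyond having a `models` property, which the reported error message implies), the `providers` map, the function signature, and the assumption that a provider prefix is separated by `/` are all hypothetical.

```typescript
// Hypothetical store shape; only the `models` key is implied by the review.
interface CooldownStore {
  models: Record<string, number>;
  providers: Record<string, number>;
}

// Assumed format: "provider/model". A bare id such as "gpt-4.1" has no provider.
function getProviderFromModel(model: string): string | undefined {
  const slash = model.indexOf("/");
  return slash > 0 ? model.slice(0, slash) : undefined;
}

function updateCooldownStore(
  store: CooldownStore,
  model: string,
  scope: "provider" | "model",
  expiresAt: number,
): CooldownStore {
  if (scope === "provider") {
    const provider = getProviderFromModel(model);
    // The fix: return the current store unchanged instead of undefined,
    // so later cooldown lookups still receive a valid object.
    if (!provider) return store;
    return { ...store, providers: { ...store.providers, [provider]: expiresAt } };
  }
  return { ...store, models: { ...store.models, [model]: expiresAt } };
}
```

With this shape, a provider-scoped update for a bare model id is a no-op that hands back the same store, so the fallback loop can keep checking the remaining candidates.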
