fix(codex): respect non-Claude model selection and OAuth, fix demo summary #2086

Merged: chernistry merged 1 commit into main from fix/2075-codex-oauth-routing on Jun 25, 2026

Conversation

@chernistry

@chernistry chernistry commented Jun 25, 2026

Collaborator

Summary

Fixes the three defects in #2075 that made the Codex adapter unusable with a ChatGPT OAuth login.

1. Model routing handed Claude tier names to Codex

The batch/heuristic selector emits opus/sonnet/haiku with no adapter awareness, so a high-stakes role (manager/architect/security) produced codex exec -m opus, which Codex rejects. The spawner now substitutes the adapter's default model for an unpinned Claude tier name when the run-level adapter is non-Claude, so the model recorded for the run matches what actually runs. The Claude path is byte-identical (gated on a non-Claude adapter). CodexAdapter also maps any residual tier name to its default as a last-resort net.
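
A minimal sketch of the coercion described above, not the merged code. `ModelConfig` here is a stand-in dataclass, its field defaults are hypothetical, and the `claude_compatible` set is an assumption made for illustration; only the tier names, the three core arguments, and the `gpt-5.4` default come from this PR.

```python
from __future__ import annotations

from dataclasses import dataclass, replace

# Tier names the heuristic selector emits (from the PR description).
_CLAUDE_TIER_MODELS = frozenset({"opus", "sonnet", "haiku"})


@dataclass(frozen=True)
class ModelConfig:
    """Stand-in for the project's ModelConfig (hypothetical defaults)."""

    model: str
    effort: str = "high"
    max_tokens: int = 4096
    is_batch: bool = False


def coerce_model_for_non_claude_adapter(
    model_config: ModelConfig,
    adapter_name: str,
    adapter_default_model: str | None,
    claude_compatible: frozenset[str] = frozenset({"claude"}),  # assumption
) -> ModelConfig:
    """Swap an unpinned Claude tier name for the adapter's default model."""
    if adapter_name in claude_compatible:
        return model_config  # Claude path stays untouched
    if model_config.model not in _CLAUDE_TIER_MODELS:
        return model_config  # already a non-tier model name
    if not adapter_default_model:
        return model_config  # no default known, nothing safe to substitute
    return replace(model_config, model=adapter_default_model)


coerced = coerce_model_for_non_claude_adapter(ModelConfig("opus"), "codex", "gpt-5.4")
print(coerced.model)
```

The three early returns mirror the conditions the summary lists: a Claude-compatible adapter, a model that is not a tier name, and an adapter with no known default all leave the selection as it was.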

2. Spurious OPENAI_API_KEY warning despite OAuth

The adapter warned on every spawn when OPENAI_API_KEY was absent, even with a valid ChatGPT OAuth session. It now detects ~/.codex/auth.json (written by codex login) and only warns when neither an API key nor an OAuth session is present.
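
The credential check can be sketched roughly as follows. This is an illustration under stated assumptions rather than the adapter's actual helper: the `env` and `auth_file` parameters are added here so the function is easy to exercise, and the sketch only tests that the file exists, not that the session inside it is still valid.

```python
from __future__ import annotations

import os
from collections.abc import Mapping
from pathlib import Path

# Path from the PR description; written by `codex login`.
_CODEX_AUTH_FILE = Path.home() / ".codex" / "auth.json"


def has_codex_auth(
    env: Mapping[str, str] | None = None,
    auth_file: Path = _CODEX_AUTH_FILE,
) -> bool:
    """Return True if an API key or an OAuth session file is available."""
    env = os.environ if env is None else env
    if env.get("OPENAI_API_KEY"):
        return True  # a non-empty API key is enough
    return auth_file.is_file()  # otherwise fall back to the OAuth session file


if not has_codex_auth():
    print("warning: no OPENAI_API_KEY and no Codex OAuth session found")
```

With a check of this shape, the warning fires only when both credential sources are missing.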

3. bernstein demo --real crash

_print_demo_summary read the /status tasks field as a list, but the endpoint returns {"count", "items"}. Iterating the dict yielded its string keys and raised AttributeError: 'str' object has no attribute 'get'. It now unwraps the items list and keeps only dict rows.
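
To make the failure mode concrete, here is a small self-contained sketch. The helper name `extract_task_rows` and the `status` field values are hypothetical; the payload shape and the error message are the ones described above.

```python
from typing import Any


def extract_task_rows(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Return task rows from either the {"count", "items"} dict or a bare list."""
    raw_tasks = payload.get("tasks", [])
    if isinstance(raw_tasks, dict):
        raw_tasks = raw_tasks.get("items", [])
    if not isinstance(raw_tasks, list):
        return []
    return [t for t in raw_tasks if isinstance(t, dict)]


status = {"tasks": {"count": 2, "items": [{"status": "done"}, {"status": "open"}]}}

# Pre-fix behaviour: iterating the dict yields the keys "count" and "items".
try:
    [t.get("status") for t in status["tasks"]]
except AttributeError as exc:
    print(exc)  # 'str' object has no attribute 'get'

print([t["status"] for t in extract_task_rows(status)])
```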

Tests

  • New tests/unit/test_model_coercion.py for the adapter-aware model coercion (Claude path unchanged).
  • New cases in tests/unit/test_adapter_codex.py: OAuth session suppresses the warning; a Claude tier name maps to the Codex default in argv.
  • New case in tests/unit/test_cli_demo.py: the demo summary handles the real /status shape without crashing (reproduces the reported AttributeError before the fix).
  • Local runs green across model-coercion, codex adapter, demo, router_core, warm_pool, cascade_router, non-claude adapter, registry, spawner, and adapter conformance suites (700+ tests). ruff check and ruff format clean.

Fixes #2075

Summary by Sourcery

Ensure Codex adapter works correctly with ChatGPT OAuth and non-Claude model selections, and prevent the demo status summary from crashing on the current API response shape.

Bug Fixes:

  • Prevent Codex spawns from failing or recording invalid models when the selector returns Claude tier names for non-Claude adapters.
  • Avoid emitting spurious OPENAI_API_KEY warnings when a valid Codex OAuth session is present.
  • Handle the /status tasks payload returned as a {count, items} dict so the demo summary no longer crashes.

Enhancements:

  • Introduce adapter-aware model coercion in the spawner so unpinned Claude tier names are mapped to the target adapter’s default model, keeping recorded and executed models aligned.
  • Expose a default_model on CodexAdapter and add a defensive model mapping helper to normalize any residual Claude tier names.

Tests:

  • Add unit tests for adapter-aware model coercion across Claude and non-Claude adapters.
  • Extend Codex adapter tests to cover OAuth-based authentication behavior and model mapping to the default Codex model.
  • Add a demo CLI test to verify the real /status response shape is summarized correctly without raising errors.

Summary by CodeRabbit

  • New Features

    • Improved model selection: when a Claude-tier model name is chosen with a non-Claude adapter, it’s automatically normalized to the adapter’s compatible default.
    • Enhanced Codex support by selecting the correct model and detecting either API-key or existing Codex OAuth credentials.
  • Bug Fixes

    • Made demo status rendering resilient to multiple task payload shapes from the status endpoint.
  • Testing

    • Added/expanded unit tests covering Codex credential warnings, worker model arguments, model coercion behavior, and the updated demo summary rendering.

@sourcery-ai

sourcery-ai Bot commented Jun 25, 2026

Reviewer's Guide

Codex adapter and spawner now correctly handle Claude tier model names for non-Claude runs, respect ChatGPT OAuth-based Codex auth, and the demo summary is hardened against the real /status tasks payload shape, with tests added around these behaviors.

Sequence diagram for non-Claude model coercion and Codex OAuth-aware spawn

sequenceDiagram
    participant SpawnerCore as SpawnerCore
    participant BanditRouter as BanditRouter
    participant CodexAdapter as CodexAdapter
    participant Env as Env
    participant CodexCLI as CodexCLI

    SpawnerCore->>BanditRouter: router_applicable(adapter_name)
    BanditRouter-->>SpawnerCore: is_claude_compatible
    SpawnerCore->>SpawnerCore: _coerce_model_for_non_claude_adapter(model_config, adapter_name, adapter_default_model)
    SpawnerCore->>CodexAdapter: spawn(model_config)

    CodexAdapter->>Env: _has_codex_auth()
    Env-->>CodexAdapter: OPENAI_API_KEY or ~/.codex/auth.json
    CodexAdapter->>CodexAdapter: _codex_model(model_config.model)
    CodexAdapter->>CodexCLI: codex exec -m model
    CodexCLI-->>CodexAdapter: session result
    CodexAdapter-->>SpawnerCore: spawned session info

File-Level Changes

Change: Codex adapter now has explicit defaults, model coercion, and unified auth detection for API key vs OAuth.
Details:
  • Introduce _CODEX_AUTH_FILE constant and _has_codex_auth helper to treat either OPENAI_API_KEY or ~/.codex/auth.json as valid Codex credentials.
  • Add _DEFAULT_CODEX_MODEL and _CLAUDE_TIER_MODELS plus _codex_model helper to map Claude tier names to a valid Codex model with a warning.
  • Expose default_model on CodexAdapter so the spawner can substitute it when needed.
  • Update spawn() to use _has_codex_auth for warnings and pass the coerced model through to the CLI command and metadata.
Files:
  • src/bernstein/adapters/codex.py

Change: Spawner normalizes heuristic/batch-selected Claude tier names when a non-Claude adapter is used and no model is operator-pinned.
Details:
  • Introduce _CLAUDE_TIER_MODELS and _coerce_model_for_non_claude_adapter to replace unpinned Claude tier names with the adapter default unless the adapter is Claude-compatible or no default is known.
  • Invoke _coerce_model_for_non_claude_adapter in _spawn_for_tasks_internal when provider_name is None and neither task nor role policy specifies a model, using the adapter's default_model attribute if present.
Files:
  • src/bernstein/core/agents/spawner_warm_pool.py
  • src/bernstein/core/agents/spawner_core.py

Change: Demo summary now correctly handles /status.tasks as a dict containing items and is robust to type variations.
Details:
  • Change _print_demo_summary to unwrap tasks from {"count", "items"} when tasks is a dict, tolerate a bare list, filter non-dict entries, and coerce total_cost_usd to a float safely.
  • Add a regression test that mocks /status to return the dict-shaped tasks payload and asserts the summary renders without raising and shows the expected counts and cost.
Files:
  • src/bernstein/cli/run_confirm.py
  • tests/unit/test_cli_demo.py

Change: New unit tests cover model coercion for non-Claude adapters and Codex-specific behaviors around OAuth and model selection.
Details:
  • Add tests/unit/test_model_coercion.py to verify Claude tier replacement with adapter defaults, unchanged behavior for Claude adapters, pass-through for non-tier models, and no-default behavior.
  • Extend tests/unit/test_adapter_codex.py to: rename and refine the missing-auth warning test, add a no-warning-with-OAuth-session case by patching _CODEX_AUTH_FILE, and assert that a Claude tier name reaching CodexAdapter.spawn is mapped to gpt-5.4 in argv.
Files:
  • tests/unit/test_model_coercion.py
  • tests/unit/test_adapter_codex.py

Assessment against linked issues

Issue Objective Addressed Explanation
#2075 Ensure that when the Codex adapter is used, Claude-specific tier names (opus/sonnet/haiku) are not passed as the model to codex exec, and instead a Codex-compatible model is used (respecting user configuration or falling back to a sensible Codex default).
#2075 Update the Codex adapter’s authentication handling so that it does not warn about a missing OPENAI_API_KEY when a valid ChatGPT OAuth session exists (as indicated by ~/.codex/auth.json).
#2075 Fix the bernstein demo --real crash at the summary stage by correctly handling the /status response shape for tasks and avoiding the AttributeError during summary rendering.

@github-actions

Contributor

Sonar insights (advisory, no merge-block)

Snapshot of bernstein on the configured Sonar instance:

Metric Value
Coverage 80.1
Code smells 0
Bugs 0
Vulnerabilities 0
Security hotspots 0

Run bernstein doctor sonar locally for the full surface.

This comment is a soft signal. The Sonar scan runs on push to main; the PR check itself never fails on smells.

@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Walkthrough

The PR updates Codex model selection to detect OAuth auth, remap Claude-tier model names for non-Claude adapters, and coerce spawned models to adapter defaults when needed. It also makes demo summary parsing accept multiple /status task shapes.

Changes

Codex model selection

Layer: Codex auth and model helpers
File(s): src/bernstein/adapters/codex.py
Summary: Codex adapter imports Path, defines auth/model helpers, and adds a default_model class attribute.

Layer: Codex spawn model mapping
File(s): src/bernstein/adapters/codex.py, tests/unit/test_adapter_codex.py
Summary: Codex spawn warns when both API-key and OAuth-session checks fail, maps the requested model before building the worker command, and passes the mapped model into build_worker_cmd; the adapter tests cover the warning and mapping behavior.

Layer: Non-Claude model coercion
File(s): src/bernstein/core/agents/spawner_warm_pool.py, src/bernstein/core/agents/spawner_core.py, tests/unit/test_model_coercion.py
Summary: The warm-pool helper rewrites Claude-tier model names to adapter defaults for non-Claude adapters, and AgentSpawner._spawn_for_tasks_internal applies that coercion when no model is pinned; the coercion tests cover the helper behavior.

Demo summary parsing

Layer: Status task parsing
File(s): src/bernstein/cli/run_confirm.py, tests/unit/test_cli_demo.py
Summary: _print_demo_summary handles tasks as either a dict with items or a bare list, filters dict task rows, and the regression test covers the dict-shaped payload.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 58.82%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

  • Title check: ✅ Passed. The title matches the main changes: Codex model routing/OAuth handling plus the demo summary fix.
  • Description check: ✅ Passed. The description covers what, why, how, and tests well enough, even though the checklist section is not filled out.
  • Linked Issues check: ✅ Passed. The changes address #2075 by fixing Codex model coercion, OAuth credential detection, and the demo summary crash.
  • Out of Scope Changes check: ✅ Passed. The changes stay focused on the Codex/OAuth and demo crash fixes, with supporting tests and no obvious unrelated edits.

…mmary (#2075)

Three defects reported in #2075 made the Codex adapter unusable with a
ChatGPT OAuth login.

1. Model routing handed Claude tier names to Codex. The batch/heuristic
   selector emits opus/sonnet/haiku with no adapter awareness, so a
   high-stakes role (manager/architect/security) produced `codex exec -m
   opus`, which Codex rejects. The spawner now substitutes the adapter's
   default model for an unpinned Claude tier name when the run-level adapter
   is non-Claude, so the model recorded for the run matches what actually
   runs. The Claude path is unchanged. CodexAdapter also maps any residual
   tier name to its default as a last-resort net.

2. Spurious OPENAI_API_KEY warning. The adapter warned on every spawn when
   the env var was absent, even with a valid ChatGPT OAuth session. It now
   detects ~/.codex/auth.json (written by `codex login`) and only warns when
   neither an API key nor an OAuth session is present.

3. `bernstein demo --real` crash. _print_demo_summary read the /status
   `tasks` field as a list, but the endpoint returns {"count", "items"}.
   Iterating the dict yielded its string keys and raised AttributeError on
   `.get`. It now unwraps the items list and keeps only dict rows.

Fixes #2075
@github-actions

github-actions Bot commented Jun 25, 2026

Contributor

Review-bot acknowledgement summary

  • Must-address findings: 0 (0 acknowledged, 0 open)
  • Informational findings: 5

All must-address findings are resolved or acknowledged.

@github-actions

Contributor

bernstein doctor observe for PR #2086 (fix/2075-codex-oauth-routing): ok=1, warn=0, fail=1, error=0, skipped=2

sonar -- OK (project bernstein)

metric value delta threshold status
coverage_pct 80.1% new 80.0% ok
code_smells 0 new 50 ok
bugs 0 new 0 ok
vulnerabilities 0 new 0 ok
security_hotspots 0 new 0 ok

code-scanning -- FAIL (39 open alert(s))

metric value delta threshold status
open_alerts 39 new 0 fail
critical_alerts 1 new 0 fail
high_alerts 18 new 0 fail
medium_alerts 3 new - ok
low_alerts 0 new - ok
Skipped backends (credentials not configured)
  • glitchtip: BERNSTEIN_GLITCHTIP_TOKEN not set
  • dt: DTRACK_URL/TOKEN/PROJECT not set

See docs/observability/unified-doctor.md for backend setup notes.

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've found 1 issue, and left some high level feedback:

  • The Claude tier model set is now duplicated in both bernstein.adapters.codex and spawner_warm_pool; consider centralizing _CLAUDE_TIER_MODELS to avoid divergence if tier names change.
  • The Codex default model string "gpt-5.4" is hardcoded in the adapter and tests; it may be safer to expose this as a single configurable constant or setting so changes to the default don’t require code edits in multiple places.

## Individual Comments

### Comment 1
<location path="src/bernstein/core/agents/spawner_warm_pool.py" line_range="93-97" />
<code_context>
+        return model_config
+    if not adapter_default_model:
+        return model_config
+    return ModelConfig(
+        model=adapter_default_model,
+        effort=model_config.effort,
+        max_tokens=model_config.max_tokens,
+        is_batch=model_config.is_batch,
+    )
+
</code_context>
<issue_to_address>
**suggestion:** Consider constructing the coerced ModelConfig in a way that preserves future fields automatically.

Reconstructing `ModelConfig` with a hardcoded subset of fields will drop any new or optional attributes added later. If `ModelConfig` is a dataclass, consider using `dataclasses.replace(model_config, model=adapter_default_model)` (or an equivalent pattern) so only `model` changes and other fields are preserved automatically.

Suggested implementation:

```python
    if not adapter_default_model:
        return model_config
    # Preserve all existing and future fields on ModelConfig, only overriding `model`
    return dataclasses.replace(model_config, model=adapter_default_model)

```

To support `dataclasses.replace`, ensure this file imports `dataclasses` (or `replace` directly) near the top, for example:
- `import dataclasses`
or
- `from dataclasses import replace` and then use `replace(model_config, model=adapter_default_model)` instead of `dataclasses.replace(...)`.

If `ModelConfig` is not a dataclass but a Pydantic model or similar, replace the final line with an appropriate copy/update method, e.g.:
- `return model_config.model_copy(update={"model": adapter_default_model})`
</issue_to_address>
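
The difference between the two construction styles can be shown with a small sketch. It assumes `ModelConfig` is a dataclass, which the comment above leaves open, and the `cache_hint` field is hypothetical, standing in for any attribute added later.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModelConfig:
    """Stand-in; `cache_hint` plays the role of a field added later (hypothetical)."""

    model: str
    effort: str = "high"
    max_tokens: int = 4096
    is_batch: bool = False
    cache_hint: str = "none"


original = ModelConfig(model="opus", cache_hint="warm")

# Field-by-field reconstruction silently resets the newer field to its default.
rebuilt = ModelConfig(
    model="gpt-5.4",
    effort=original.effort,
    max_tokens=original.max_tokens,
    is_batch=original.is_batch,
)

# replace() copies every field and overrides only `model`.
copied = replace(original, model="gpt-5.4")

print(rebuilt.cache_hint, copied.cache_hint)  # none warm
```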

@chernistry chernistry force-pushed the fix/2075-codex-oauth-routing branch from b507986 to bed1d64 Compare June 25, 2026 09:38

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/bernstein/cli/run_confirm.py`:
- Around line 459-464: The cost extraction in run_confirm.py’s payload handling
only reads the top-level total_cost_usd alias, so update the summary-building
logic to also fall back to summary.cost_usd and costs.spent_usd when computing
total_cost. Keep the task parsing flow intact, but adjust the total_cost
assignment path in the same block so the demo summary reflects the actual
/status spend even when the alias is missing.

In `@src/bernstein/core/agents/spawner_core.py`:
- Around line 1807-1812: The fallback in spawner_core.py only runs when
provider_name is None, so non-Claude providers like codex keep the Claude model
selection recorded in session.model_config and the initial trace. Update the
guard around _coerce_model_for_non_claude_adapter in the agent spawner flow so
it also applies when _resolve_routing() selects a non-Claude provider_name,
ensuring the recorded model matches the provider’s actual default before tracing
or persisting selection.
- Around line 1810-1811: The caching path is missing the adapter fallback model
because CachingAdapter does not expose default_model, so spawner_core’s
getattr(self._adapter, "default_model", None) returns None. Update
CachingAdapter in caching_adapter.py to proxy default_model from self._inner (or
otherwise expose it), so _coerce_model_for_non_claude_adapter can read the real
adapter’s fallback model when caching is enabled.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 663ec9a0-8eb3-4860-9998-0ec05f8d994b

📥 Commits

Reviewing files that changed from the base of the PR and between 11798c3 and b507986.

📒 Files selected for processing (7)
  • src/bernstein/adapters/codex.py
  • src/bernstein/cli/run_confirm.py
  • src/bernstein/core/agents/spawner_core.py
  • src/bernstein/core/agents/spawner_warm_pool.py
  • tests/unit/test_adapter_codex.py
  • tests/unit/test_cli_demo.py
  • tests/unit/test_model_coercion.py

Comment on lines +459 to +464
raw_tasks = payload.get("tasks", [])
if isinstance(raw_tasks, dict):
raw_tasks = raw_tasks.get("items", [])
if isinstance(raw_tasks, list):
tasks_data = [t for t in raw_tasks if isinstance(t, dict)]
total_cost = float(payload.get("total_cost_usd", 0.0) or 0.0)

🎯 Functional Correctness | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Look for where total_cost_usd is set on the /status payload and how status.py derives spend.
rg -nP -C3 '\btotal_cost_usd\b' src/bernstein/core/routes/
rg -nP -C3 'cost_usd|spent_usd' src/bernstein/core/routes/status_dashboard.py

Repository: sipyourdrink-ltd/bernstein

Length of output: 15089


🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '820,860p' src/bernstein/core/routes/status_dashboard.py
printf '\n---\n'
sed -n '1,260p' src/bernstein/cli/status.py
printf '\n---\n'
sed -n '440,490p' src/bernstein/cli/run_confirm.py

Repository: sipyourdrink-ltd/bernstein

Length of output: 12828


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n "def _extract_spent_cost|_extract_spent_cost\(" src/bernstein/cli/status.py src/bernstein/cli/run_confirm.py
printf '\n---\n'
sed -n '260,360p' src/bernstein/cli/status.py
printf '\n---\n'
sed -n '360,460p' src/bernstein/cli/status.py

Repository: sipyourdrink-ltd/bernstein

Length of output: 8313


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n -C4 'live_costs|spent_usd|total_cost_usd|summary"\] =|payload\["costs"\]' src/bernstein/core/routes/status_dashboard.py

Repository: sipyourdrink-ltd/bernstein

Length of output: 3905


Read the /status cost from summary.cost_usd or costs.spent_usd
src/bernstein/cli/run_confirm.py:459-464 only checks payload["total_cost_usd"], but /status exposes spend under summary.cost_usd and costs.spent_usd. That leaves the demo summary at $0.0000 when the top-level alias is absent.

Suggested fix
-            total_cost = float(payload.get("total_cost_usd", 0.0) or 0.0)
+            total_cost = float(payload.get("total_cost_usd", 0.0) or 0.0)
+            if total_cost <= 0.0:
+                summary = payload.get("summary", {})
+                if isinstance(summary, dict):
+                    total_cost = float(summary.get("cost_usd", 0.0) or 0.0)
+            if total_cost <= 0.0:
+                costs = payload.get("costs", {})
+                if isinstance(costs, dict):
+                    total_cost = float(costs.get("spent_usd", 0.0) or 0.0)
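
As a standalone illustration of the fallback order in the suggestion, the logic could be factored into a helper like the one below. The names `extract_total_cost` and `_as_float` are hypothetical, and the extra guard against non-numeric values is an addition of this sketch, not part of the suggested diff.

```python
from typing import Any


def _as_float(value: Any) -> float:
    """Coerce a possibly missing or malformed cost value to a float."""
    try:
        return float(value or 0.0)
    except (TypeError, ValueError):
        return 0.0


def extract_total_cost(payload: dict[str, Any]) -> float:
    """Try total_cost_usd, then summary.cost_usd, then costs.spent_usd."""
    total = _as_float(payload.get("total_cost_usd"))
    if total > 0.0:
        return total
    for section, key in (("summary", "cost_usd"), ("costs", "spent_usd")):
        nested = payload.get(section)
        if isinstance(nested, dict):
            total = _as_float(nested.get(key))
            if total > 0.0:
                return total
    return 0.0


print(extract_total_cost({"summary": {"cost_usd": 1.25}}))
```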

Comment thread src/bernstein/core/agents/spawner_core.py
Comment on lines +1810 to +1811
adapter_name=self._adapter.name(),
adapter_default_model=getattr(self._adapter, "default_model", None),

🎯 Functional Correctness | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
ast-grep outline src/bernstein/adapters/caching_adapter.py
rg -nP 'default_model|__getattr__' src/bernstein/adapters/caching_adapter.py

Repository: sipyourdrink-ltd/bernstein

Length of output: 438


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo '== caching_adapter outline =='
ast-grep outline src/bernstein/adapters/caching_adapter.py

echo
echo '== caching_adapter relevant lines =='
sed -n '1,220p' src/bernstein/adapters/caching_adapter.py

echo
echo '== spawner_core around the referenced lines =='
sed -n '1790,1825p' src/bernstein/core/agents/spawner_core.py

echo
echo '== search for default_model handling around adapters =='
rg -n 'default_model|__getattr__|CachingAdapter' src/bernstein/adapters src/bernstein/core/agents/spawner_core.py

Repository: sipyourdrink-ltd/bernstein

Length of output: 11721


CachingAdapter needs to expose default_model. At src/bernstein/core/agents/spawner_core.py:1810-1811, getattr(self._adapter, "default_model", None) stays None when caching is enabled, because src/bernstein/adapters/caching_adapter.py only forwards name(), is_alive(), kill(), and detect_tier() and keeps the real adapter on self._inner. That leaves _coerce_model_for_non_claude_adapter without the adapter fallback model. Read it from self._inner or proxy default_model on CachingAdapter.
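
The lookup problem and one possible fix can be sketched with stand-in classes. None of these classes are the project's real ones; they only model a wrapper that keeps the real adapter on `self._inner` and forwards a fixed set of methods.

```python
class InnerAdapter:
    """Stand-in for a concrete adapter such as CodexAdapter (hypothetical)."""

    default_model = "gpt-5.4"

    def name(self) -> str:
        return "codex"


class WrapperWithoutProxy:
    """Forwards only selected methods, as the review describes."""

    def __init__(self, inner: InnerAdapter) -> None:
        self._inner = inner

    def name(self) -> str:
        return self._inner.name()


class WrapperWithProxy(WrapperWithoutProxy):
    """Adds a read-only property that delegates to the wrapped adapter."""

    @property
    def default_model(self):
        return getattr(self._inner, "default_model", None)


inner = InnerAdapter()
print(getattr(WrapperWithoutProxy(inner), "default_model", None))  # None
print(getattr(WrapperWithProxy(inner), "default_model", None))  # gpt-5.4
```

A property keeps the delegation explicit; a blanket `__getattr__` forwarder would also work but exposes every attribute of the wrapped adapter, which may be more than intended.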

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/unit/test_cli_demo.py`:
- Around line 53-57: The assertions in the demo CLI test are too broad and can
pass even when the “Bugs fixed” summary row is incorrect. Update the checks
around the rendered output in test_cli_demo to assert the full row pattern
directly from the relevant formatter output, using the existing buf.getvalue()
content and the summary row text produced by the CLI, so the test verifies the
exact “1 / 3” style row instead of matching isolated substrings.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b3e4cf9d-3b67-4cb6-b208-768d41683887

📥 Commits

Reviewing files that changed from the base of the PR and between b507986 and bed1d64.

📒 Files selected for processing (7)
  • src/bernstein/adapters/codex.py
  • src/bernstein/cli/run_confirm.py
  • src/bernstein/core/agents/spawner_core.py
  • src/bernstein/core/agents/spawner_warm_pool.py
  • tests/unit/test_adapter_codex.py
  • tests/unit/test_cli_demo.py
  • tests/unit/test_model_coercion.py

Comment on lines +53 to +57
out = buf.getvalue()
# 1 done out of 3 total, rendered without raising.
assert "1" in out
assert "/ 3" in out
assert "$0.5000" in out

📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Strengthen summary assertions to avoid false positives.

Lines 55-56 can pass even if the “Bugs fixed” row is wrong. Assert the row pattern directly.

Suggested diff
+import re
@@
-    assert "1" in out
-    assert "/ 3" in out
+    assert re.search(r"Bugs fixed\s+1\s*/\s*3", out)
     assert "$0.5000" in out
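
A short sketch of why the substring checks are weak. The rendered output here is invented for illustration (the "Agents used" row is hypothetical); only the "Bugs fixed" label and the regex come from the suggestion.

```python
import re

# Hypothetical rendered summary where the "Bugs fixed" row is wrong (0 / 3),
# but another row still contains a "1" and the row itself contains "/ 3".
out = "Agents used   1\nBugs fixed    0 / 3\nTotal cost    $0.5000\n"

# The loose substring checks both pass on the wrong output.
loose_passes = "1" in out and "/ 3" in out

# The anchored pattern only matches a correct "Bugs fixed  1 / 3" row.
strict_passes = re.search(r"Bugs fixed\s+1\s*/\s*3", out) is not None

print(loose_passes, strict_passes)  # True False
```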

@chernistry chernistry merged commit 3c6f8d8 into main Jun 25, 2026
96 of 97 checks passed
@chernistry chernistry deleted the fix/2075-codex-oauth-routing branch June 25, 2026 09:50
Development

Successfully merging this pull request may close these issues.

The Codex adapter is unusable with a ChatGPT subscription (OAuth login)
