fix(codex): respect non-Claude model selection and OAuth, fix demo summary by chernistry · Pull Request #2086 · sipyourdrink-ltd/bernstein

chernistry · 2026-06-25T09:35:28Z

Summary

Fixes the three defects in #2075 that made the Codex adapter unusable with a ChatGPT OAuth login.

1. Model routing handed Claude tier names to Codex

The batch/heuristic selector emits opus/sonnet/haiku with no adapter awareness, so a high-stakes role (manager/architect/security) produced codex exec -m opus, which Codex rejects. The spawner now substitutes the adapter's default model for an unpinned Claude tier name when the run-level adapter is non-Claude, so the model recorded for the run matches what actually runs. The Claude path is byte-identical (gated on a non-Claude adapter). CodexAdapter also maps any residual tier name to its default as a last-resort net.
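The coercion rule described above can be sketched as follows. This is a minimal illustration, not the merged code: the `ModelConfig` stand-in, the `CLAUDE_COMPATIBLE_ADAPTERS` set, and the function name `coerce_model` are assumptions made for the example, while the tier names and the `gpt-5.4` default are taken from this PR.

```python
from __future__ import annotations

from dataclasses import dataclass

# Tier names the batch/heuristic selector emits (from the PR description).
CLAUDE_TIER_MODELS = frozenset({"opus", "sonnet", "haiku"})
# Hypothetical stand-in for the router's Claude-compatibility check.
CLAUDE_COMPATIBLE_ADAPTERS = frozenset({"claude"})


@dataclass(frozen=True)
class ModelConfig:
    """Illustrative stand-in; field names follow the review snippet."""

    model: str
    effort: str = "high"
    max_tokens: int = 4096
    is_batch: bool = False


def coerce_model(
    config: ModelConfig,
    adapter_name: str,
    adapter_default_model: str | None,
) -> ModelConfig:
    """Swap an unpinned Claude tier name for the adapter's default model."""
    if adapter_name in CLAUDE_COMPATIBLE_ADAPTERS:
        return config  # Claude path: leave the selection untouched.
    if config.model not in CLAUDE_TIER_MODELS:
        return config  # Already a non-tier model name: pass it through.
    if not adapter_default_model:
        return config  # No known default: nothing safe to substitute.
    return ModelConfig(
        model=adapter_default_model,
        effort=config.effort,
        max_tokens=config.max_tokens,
        is_batch=config.is_batch,
    )
```

Under these assumptions, `coerce_model(ModelConfig("opus"), "codex", "gpt-5.4")` yields a config whose model is `gpt-5.4`, while the same call with a Claude-compatible adapter returns the input unchanged.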

2. Spurious `OPENAI_API_KEY` warning despite OAuth

The adapter warned on every spawn when OPENAI_API_KEY was absent, even with a valid ChatGPT OAuth session. It now detects ~/.codex/auth.json (written by codex login) and only warns when neither an API key nor an OAuth session is present.
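A minimal sketch of the credential check, assuming the rule is simply "a non-empty `OPENAI_API_KEY` or an existing `~/.codex/auth.json`". The function name `has_codex_auth` and its injectable parameters are hypothetical and exist only to make the example testable; the sketch does not validate the contents of the auth file.

```python
from __future__ import annotations

import os
from collections.abc import Mapping
from pathlib import Path

# Location written by `codex login`, as stated in the PR description.
CODEX_AUTH_FILE = Path.home() / ".codex" / "auth.json"


def has_codex_auth(
    env: Mapping[str, str] | None = None,
    auth_file: Path = CODEX_AUTH_FILE,
) -> bool:
    """Return True when either an API key or an OAuth session file is present."""
    env = os.environ if env is None else env
    if env.get("OPENAI_API_KEY"):
        return True  # A non-empty API key is enough on its own.
    return auth_file.is_file()  # Otherwise fall back to the OAuth session file.
```

A caller would emit the missing-credentials warning only when `has_codex_auth()` returns `False`.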

3. `bernstein demo --real` crash

_print_demo_summary read the /status tasks field as a list, but the endpoint returns {"count", "items"}. Iterating the dict yielded its string keys and raised AttributeError: 'str' object has no attribute 'get'. It now unwraps the items list and keeps only dict rows.
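The failure mode and the unwrapping step can be reproduced in isolation. The sketch below is an assumption-laden illustration: `extract_task_rows` is a hypothetical helper name, and the `status` field with a `done` value is invented sample data, not the documented /status schema.

```python
from typing import Any


def extract_task_rows(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Accept either {"count", "items"} or a bare list and keep only dict rows."""
    raw = payload.get("tasks", [])
    if isinstance(raw, dict):
        raw = raw.get("items", [])
    if not isinstance(raw, list):
        return []
    return [row for row in raw if isinstance(row, dict)]


# Hypothetical sample payload in the dict shape the endpoint returns.
status = {"tasks": {"count": 2, "items": [{"status": "done"}, {"status": "open"}]}}

# Iterating the wrapper dict yields its string keys, which is what made
# the old code call .get() on a str.
keys = list(status["tasks"])

rows = extract_task_rows(status)
done = sum(1 for row in rows if row.get("status") == "done")
```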

Tests

New tests/unit/test_model_coercion.py for the adapter-aware model coercion (Claude path unchanged).
New cases in tests/unit/test_adapter_codex.py: OAuth session suppresses the warning; a Claude tier name maps to the Codex default in argv.
New case in tests/unit/test_cli_demo.py: the demo summary handles the real /status shape without crashing (reproduces the reported AttributeError before the fix).
Local runs green across model-coercion, codex adapter, demo, router_core, warm_pool, cascade_router, non-claude adapter, registry, spawner, and adapter conformance suites (700+ tests). ruff check and ruff format clean.

Fixes #2075

Summary by Sourcery

Ensure Codex adapter works correctly with ChatGPT OAuth and non-Claude model selections, and prevent the demo status summary from crashing on the current API response shape.

Bug Fixes:

Prevent Codex spawns from failing or recording invalid models when the selector returns Claude tier names for non-Claude adapters.
Avoid emitting spurious OPENAI_API_KEY warnings when a valid Codex OAuth session is present.
Handle the /status tasks payload returned as a {count, items} dict so the demo summary no longer crashes.

Enhancements:

Introduce adapter-aware model coercion in the spawner so unpinned Claude tier names are mapped to the target adapter’s default model, keeping recorded and executed models aligned.
Expose a default_model on CodexAdapter and add a defensive model mapping helper to normalize any residual Claude tier names.

Tests:

Add unit tests for adapter-aware model coercion across Claude and non-Claude adapters.
Extend Codex adapter tests to cover OAuth-based authentication behavior and model mapping to the default Codex model.
Add a demo CLI test to verify the real /status response shape is summarized correctly without raising errors.

Summary by CodeRabbit

New Features
- Improved model selection: when a Claude-tier model name is chosen with a non-Claude adapter, it’s automatically normalized to the adapter’s compatible default.
- Enhanced Codex support by selecting the correct model and detecting either API-key or existing Codex OAuth credentials.
Bug Fixes
- Made demo status rendering resilient to multiple task payload shapes from the status endpoint.
Testing
- Added/expanded unit tests covering Codex credential warnings, worker model arguments, model coercion behavior, and the updated demo summary rendering.

sourcery-ai · 2026-06-25T09:35:39Z

Reviewer's Guide

Codex adapter and spawner now correctly handle Claude tier model names for non-Claude runs, respect ChatGPT OAuth-based Codex auth, and the demo summary is hardened against the real /status tasks payload shape, with tests added around these behaviors.

Sequence diagram for non-Claude model coercion and Codex OAuth-aware spawn

sequenceDiagram
    participant SpawnerCore as SpawnerCore
    participant BanditRouter as BanditRouter
    participant CodexAdapter as CodexAdapter
    participant Env as Env
    participant CodexCLI as CodexCLI

    SpawnerCore->>BanditRouter: router_applicable(adapter_name)
    BanditRouter-->>SpawnerCore: is_claude_compatible
    SpawnerCore->>SpawnerCore: _coerce_model_for_non_claude_adapter(model_config, adapter_name, adapter_default_model)
    SpawnerCore->>CodexAdapter: spawn(model_config)

    CodexAdapter->>Env: _has_codex_auth()
    Env-->>CodexAdapter: OPENAI_API_KEY or ~/.codex/auth.json
    CodexAdapter->>CodexAdapter: _codex_model(model_config.model)
    CodexAdapter->>CodexCLI: codex exec -m model
    CodexCLI-->>CodexAdapter: session result
    CodexAdapter-->>SpawnerCore: spawned session info

File-Level Changes

Change	Details	Files
Codex adapter now has explicit defaults, model coercion, and unified auth detection for API key vs OAuth.	Introduce _CODEX_AUTH_FILE constant and _has_codex_auth helper to treat either OPENAI_API_KEY or ~/.codex/auth.json as valid Codex credentials. Add _DEFAULT_CODEX_MODEL and _CLAUDE_TIER_MODELS plus _codex_model helper to map Claude tier names to a valid Codex model with a warning. Expose default_model on CodexAdapter so the spawner can substitute it when needed. Update spawn() to use _has_codex_auth for warnings and pass the coerced model through to the CLI command and metadata.	`src/bernstein/adapters/codex.py`
Spawner normalizes heuristic/batch-selected Claude tier names when a non-Claude adapter is used and no model is operator-pinned.	Introduce _CLAUDE_TIER_MODELS and _coerce_model_for_non_claude_adapter to replace unpinned Claude tier names with the adapter default unless the adapter is Claude-compatible or no default is known. Invoke _coerce_model_for_non_claude_adapter in _spawn_for_tasks_internal when provider_name is None and neither task nor role policy specifies a model, using the adapter's default_model attribute if present.	`src/bernstein/core/agents/spawner_warm_pool.py` `src/bernstein/core/agents/spawner_core.py`
Demo summary now correctly handles /status.tasks as a dict containing items and is robust to type variations.	Change _print_demo_summary to unwrap tasks from {"count", "items"} when tasks is a dict, tolerate a bare list, filter non-dict entries, and coerce total_cost_usd to a float safely. Add a regression test that mocks /status to return the dict-shaped tasks payload and asserts the summary renders without raising and shows the expected counts and cost.	`src/bernstein/cli/run_confirm.py` `tests/unit/test_cli_demo.py`
New unit tests cover model coercion for non-Claude adapters and Codex-specific behaviors around OAuth and model selection.	Add tests/unit/test_model_coercion.py to verify Claude tier replacement with adapter defaults, unchanged behavior for Claude adapters, pass-through for non-tier models, and no-default behavior. Extend tests/unit/test_adapter_codex.py to: rename and refine the missing-auth warning test, add a no-warning-with-OAuth-session case by patching _CODEX_AUTH_FILE, and assert that a Claude tier name reaching CodexAdapter.spawn is mapped to gpt-5.4 in argv.	`tests/unit/test_model_coercion.py` `tests/unit/test_adapter_codex.py`

Assessment against linked issues

Issue	Objective	Addressed
#2075	Ensure that when the Codex adapter is used, Claude-specific tier names (opus/sonnet/haiku) are not passed as the model to `codex exec`, and instead a Codex-compatible model is used (respecting user configuration or falling back to a sensible Codex default).	✅
#2075	Update the Codex adapter’s authentication handling so that it does not warn about a missing OPENAI_API_KEY when a valid ChatGPT OAuth session exists (as indicated by `~/.codex/auth.json`).	✅
#2075	Fix the `bernstein demo --real` crash at the summary stage by correctly handling the `/status` response shape for tasks and avoiding the AttributeError during summary rendering.	✅

Possibly linked issues

The Codex adapter is unusable with a ChatGPT subscription (OAuth login) #2075: PR adjusts Codex model routing, OAuth detection, and /status tasks parsing, resolving all failures reported in the issue.


github-actions · 2026-06-25T09:35:43Z

Sonar insights (advisory, no merge-block)

Snapshot of bernstein on the configured Sonar instance:

Metric	Value
Coverage	80.1
Code smells	0
Bugs	0
Vulnerabilities	0
Security hotspots	0

Run bernstein doctor sonar locally for the full surface.

This comment is a soft signal. The Sonar scan runs on push to main; the PR check itself never fails on smells.

coderabbitai · 2026-06-25T09:35:47Z

📝 Walkthrough

Walkthrough

The PR teaches the Codex adapter to detect an existing OAuth session, remaps Claude-tier model names for non-Claude adapters, and coerces spawned models to adapter defaults when needed. It also makes the demo summary parser accept multiple /status task shapes.

Changes

Codex model selection

Layer / File(s)	Summary
Codex auth and model helpers `src/bernstein/adapters/codex.py`	Codex adapter imports `Path`, defines auth/model helpers, and adds a `default_model` class attribute.
Codex spawn model mapping `src/bernstein/adapters/codex.py`, `tests/unit/test_adapter_codex.py`	Codex `spawn` warns when both API-key and OAuth-session checks fail, maps the requested model before building the worker command, and passes the mapped model into `build_worker_cmd`; the adapter tests cover the warning and mapping behavior.
Non-Claude model coercion `src/bernstein/core/agents/spawner_warm_pool.py`, `src/bernstein/core/agents/spawner_core.py`, `tests/unit/test_model_coercion.py`	The warm-pool helper rewrites Claude-tier model names to adapter defaults for non-Claude adapters, and `AgentSpawner._spawn_for_tasks_internal` applies that coercion when no model is pinned; the coercion tests cover the helper behavior.

Demo summary parsing

Layer / File(s)	Summary
Status task parsing `src/bernstein/cli/run_confirm.py`, `tests/unit/test_cli_demo.py`	`_print_demo_summary` handles `tasks` as either a dict with `items` or a bare list, filters dict task rows, and the regression test covers the dict-shaped payload.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 58.82% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title matches the main changes: Codex model routing/OAuth handling plus the demo summary fix.
Description check	✅ Passed	The description covers what, why, how, and tests well enough, even though the checklist section is not filled out.
Linked Issues check	✅ Passed	The changes address `#2075` by fixing Codex model coercion, OAuth credential detection, and the demo summary crash.
Out of Scope Changes check	✅ Passed	The changes stay focused on the Codex/OAuth and demo crash fixes, with supporting tests and no obvious unrelated edits.


…mmary (#2075)

Three defects reported in #2075 made the Codex adapter unusable with a ChatGPT OAuth login.

1. Model routing handed Claude tier names to Codex. The batch/heuristic selector emits opus/sonnet/haiku with no adapter awareness, so a high-stakes role (manager/architect/security) produced `codex exec -m opus`, which Codex rejects. The spawner now substitutes the adapter's default model for an unpinned Claude tier name when the run-level adapter is non-Claude, so the model recorded for the run matches what actually runs. The Claude path is unchanged. CodexAdapter also maps any residual tier name to its default as a last-resort net.

2. Spurious OPENAI_API_KEY warning. The adapter warned on every spawn when the env var was absent, even with a valid ChatGPT OAuth session. It now detects ~/.codex/auth.json (written by `codex login`) and only warns when neither an API key nor an OAuth session is present.

3. `bernstein demo --real` crash. _print_demo_summary read the /status `tasks` field as a list, but the endpoint returns {"count", "items"}. Iterating the dict yielded its string keys and raised AttributeError on `.get`. It now unwraps the items list and keeps only dict rows.

Fixes #2075

github-actions · 2026-06-25T09:36:34Z

Review-bot acknowledgement summary

Must-address findings: 0 (0 acknowledged, 0 open)
Informational findings: 5

All must-address findings are resolved or acknowledged.

github-actions · 2026-06-25T09:36:35Z

bernstein doctor observe for PR #2086 (fix/2075-codex-oauth-routing): ok=1, warn=0, fail=1, error=0, skipped=2

sonar -- OK (project bernstein)

metric	value	delta	threshold	status
coverage_pct	80.1%	new	80.0%	ok
code_smells	0	new	50	ok
bugs	0	new	0	ok
vulnerabilities	0	new	0	ok
security_hotspots	0	new	0	ok

code-scanning -- FAIL (39 open alert(s))

metric	value	delta	threshold	status
open_alerts	39	new	0	fail
critical_alerts	1	new	0	fail
high_alerts	18	new	0	fail
medium_alerts	3	new	-	ok
low_alerts	0	new	-	ok

Skipped backends (credentials not configured)

glitchtip: BERNSTEIN_GLITCHTIP_TOKEN not set
dt: DTRACK_URL/TOKEN/PROJECT not set

See docs/observability/unified-doctor.md for backend setup notes.

sourcery-ai

Hey - I've found 1 issue, and left some high level feedback:

The Claude tier model set is now duplicated in both bernstein.adapters.codex and spawner_warm_pool; consider centralizing _CLAUDE_TIER_MODELS to avoid divergence if tier names change.
The Codex default model string "gpt-5.4" is hardcoded in the adapter and tests; it may be safer to expose this as a single configurable constant or setting so changes to the default don’t require code edits in multiple places.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The Claude tier model set is now duplicated in both `bernstein.adapters.codex` and `spawner_warm_pool`; consider centralizing `_CLAUDE_TIER_MODELS` to avoid divergence if tier names change.
- The Codex default model string `"gpt-5.4"` is hardcoded in the adapter and tests; it may be safer to expose this as a single configurable constant or setting so changes to the default don’t require code edits in multiple places.

## Individual Comments

### Comment 1
<location path="src/bernstein/core/agents/spawner_warm_pool.py" line_range="93-97" />
<code_context>
+        return model_config
+    if not adapter_default_model:
+        return model_config
+    return ModelConfig(
+        model=adapter_default_model,
+        effort=model_config.effort,
+        max_tokens=model_config.max_tokens,
+        is_batch=model_config.is_batch,
+    )
+
</code_context>
<issue_to_address>
**suggestion:** Consider constructing the coerced ModelConfig in a way that preserves future fields automatically.

Reconstructing `ModelConfig` with a hardcoded subset of fields will drop any new or optional attributes added later. If `ModelConfig` is a dataclass, consider using `dataclasses.replace(model_config, model=adapter_default_model)` (or an equivalent pattern) so only `model` changes and other fields are preserved automatically.

Suggested implementation:

```python
    if not adapter_default_model:
        return model_config
    # Preserve all existing and future fields on ModelConfig, only overriding `model`
    return dataclasses.replace(model_config, model=adapter_default_model)

```

To support `dataclasses.replace`, ensure this file imports `dataclasses` (or `replace` directly) near the top, for example:
- `import dataclasses`
or
- `from dataclasses import replace` and then use `replace(model_config, model=adapter_default_model)` instead of `dataclasses.replace(...)`.

If `ModelConfig` is not a dataclass but a Pydantic model or similar, replace the final line with an appropriate copy/update method, e.g.:
- `return model_config.model_copy(update={"model": adapter_default_model})`
</issue_to_address>


sourcery-ai · 2026-06-25T09:37:22Z

+    return ModelConfig(
+        model=adapter_default_model,
+        effort=model_config.effort,
+        max_tokens=model_config.max_tokens,
+        is_batch=model_config.is_batch,


suggestion: Consider constructing the coerced ModelConfig in a way that preserves future fields automatically.

Reconstructing ModelConfig with a hardcoded subset of fields will drop any new or optional attributes added later. If ModelConfig is a dataclass, consider using dataclasses.replace(model_config, model=adapter_default_model) (or an equivalent pattern) so only model changes and other fields are preserved automatically.

Suggested implementation:

    if not adapter_default_model:
        return model_config
    # Preserve all existing and future fields on ModelConfig, only overriding `model`
    return dataclasses.replace(model_config, model=adapter_default_model)

To support dataclasses.replace, ensure this file imports dataclasses (or replace directly) near the top, for example:

import dataclasses
or

from dataclasses import replace and then use replace(model_config, model=adapter_default_model) instead of dataclasses.replace(...).

If ModelConfig is not a dataclass but a Pydantic model or similar, replace the final line with an appropriate copy/update method, e.g.:

return model_config.model_copy(update={"model": adapter_default_model})
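To make the reviewer's concern concrete, here is a small sketch of how a hand-written copy drops a field that `dataclasses.replace` preserves. It assumes `ModelConfig` is a dataclass, which the review itself leaves open, and the `temperature` field is hypothetical, added only to stand in for "a field introduced later".

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModelConfig:
    """Illustrative stand-in, not the project's real class."""

    model: str
    effort: str
    max_tokens: int
    is_batch: bool = False
    # Hypothetical field added after the coercion helper was written.
    temperature: float = 0.2


original = ModelConfig(model="opus", effort="high", max_tokens=8000, temperature=0.7)

# Hand-written copy: lists four fields, so `temperature` silently resets to 0.2.
manual = ModelConfig(
    model="gpt-5.4",
    effort=original.effort,
    max_tokens=original.max_tokens,
    is_batch=original.is_batch,
)

# dataclasses.replace copies every field and overrides only `model`.
copied = replace(original, model="gpt-5.4")
```

Both copies leave `original` untouched; only the `replace` version carries the non-default `temperature` across.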

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/bernstein/cli/run_confirm.py`:
- Around line 459-464: The cost extraction in run_confirm.py’s payload handling
only reads the top-level total_cost_usd alias, so update the summary-building
logic to also fall back to summary.cost_usd and costs.spent_usd when computing
total_cost. Keep the task parsing flow intact, but adjust the total_cost
assignment path in the same block so the demo summary reflects the actual
/status spend even when the alias is missing.

In `@src/bernstein/core/agents/spawner_core.py`:
- Around line 1807-1812: The fallback in spawner_core.py only runs when
provider_name is None, so non-Claude providers like codex keep the Claude model
selection recorded in session.model_config and the initial trace. Update the
guard around _coerce_model_for_non_claude_adapter in the agent spawner flow so
it also applies when _resolve_routing() selects a non-Claude provider_name,
ensuring the recorded model matches the provider’s actual default before tracing
or persisting selection.
- Around line 1810-1811: The caching path is missing the adapter fallback model
because CachingAdapter does not expose default_model, so spawner_core’s
getattr(self._adapter, "default_model", None) returns None. Update
CachingAdapter in caching_adapter.py to proxy default_model from self._inner (or
otherwise expose it), so _coerce_model_for_non_claude_adapter can read the real
adapter’s fallback model when caching is enabled.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 663ec9a0-8eb3-4860-9998-0ec05f8d994b

📥 Commits

Reviewing files that changed from the base of the PR and between 11798c3 and b507986.

📒 Files selected for processing (7)

src/bernstein/adapters/codex.py
src/bernstein/cli/run_confirm.py
src/bernstein/core/agents/spawner_core.py
src/bernstein/core/agents/spawner_warm_pool.py
tests/unit/test_adapter_codex.py
tests/unit/test_cli_demo.py
tests/unit/test_model_coercion.py

coderabbitai · 2026-06-25T09:41:18Z

+            raw_tasks = payload.get("tasks", [])
+            if isinstance(raw_tasks, dict):
+                raw_tasks = raw_tasks.get("items", [])
+            if isinstance(raw_tasks, list):
+                tasks_data = [t for t in raw_tasks if isinstance(t, dict)]
+            total_cost = float(payload.get("total_cost_usd", 0.0) or 0.0)


🎯 Functional Correctness | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Look for where total_cost_usd is set on the /status payload and how status.py derives spend.
rg -nP -C3 '\btotal_cost_usd\b' src/bernstein/core/routes/
rg -nP -C3 'cost_usd|spent_usd' src/bernstein/core/routes/status_dashboard.py

Repository: sipyourdrink-ltd/bernstein

Length of output: 15089

🏁 Script executed:

#!/bin/bash
set -euo pipefail
sed -n '820,860p' src/bernstein/core/routes/status_dashboard.py
printf '\n---\n'
sed -n '1,260p' src/bernstein/cli/status.py
printf '\n---\n'
sed -n '440,490p' src/bernstein/cli/run_confirm.py

Repository: sipyourdrink-ltd/bernstein

Length of output: 12828

🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "def _extract_spent_cost|_extract_spent_cost\(" src/bernstein/cli/status.py src/bernstein/cli/run_confirm.py
printf '\n---\n'
sed -n '260,360p' src/bernstein/cli/status.py
printf '\n---\n'
sed -n '360,460p' src/bernstein/cli/status.py

Repository: sipyourdrink-ltd/bernstein

Length of output: 8313

🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n -C4 'live_costs|spent_usd|total_cost_usd|summary"\] =|payload\["costs"\]' src/bernstein/core/routes/status_dashboard.py

Repository: sipyourdrink-ltd/bernstein

Length of output: 3905

Read the /status cost from summary.cost_usd or costs.spent_usd
src/bernstein/cli/run_confirm.py:459-464 only checks payload["total_cost_usd"], but /status exposes spend under summary.cost_usd and costs.spent_usd. That leaves the demo summary at $0.0000 when the top-level alias is absent.

Suggested fix

-            total_cost = float(payload.get("total_cost_usd", 0.0) or 0.0)
+            total_cost = float(payload.get("total_cost_usd", 0.0) or 0.0)
+            if total_cost <= 0.0:
+                summary = payload.get("summary", {})
+                if isinstance(summary, dict):
+                    total_cost = float(summary.get("cost_usd", 0.0) or 0.0)
+            if total_cost <= 0.0:
+                costs = payload.get("costs", {})
+                if isinstance(costs, dict):
+                    total_cost = float(costs.get("spent_usd", 0.0) or 0.0)
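The same fallback order can also be written as a loop over candidate locations. This is only a sketch under stated assumptions: `extract_spent_usd` and its helpers are hypothetical names, the three key paths are the ones named in this comment, and the tolerant float conversion goes slightly beyond what the suggested diff does.

```python
from typing import Any


def _as_float(value: Any) -> float:
    """Convert to float, treating missing or malformed values as 0.0."""
    try:
        return float(value or 0.0)
    except (TypeError, ValueError):
        return 0.0


def _nested(payload: dict[str, Any], section: str, key: str) -> Any:
    """Read payload[section][key] only when the section is a dict."""
    sub = payload.get(section)
    return sub.get(key) if isinstance(sub, dict) else None


def extract_spent_usd(payload: dict[str, Any]) -> float:
    """Return the first positive spend among the known /status locations."""
    candidates = (
        payload.get("total_cost_usd"),
        _nested(payload, "summary", "cost_usd"),
        _nested(payload, "costs", "spent_usd"),
    )
    for value in candidates:
        cost = _as_float(value)
        if cost > 0.0:
            return cost
    return 0.0
```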

coderabbitai · 2026-06-25T09:41:18Z

+                adapter_name=self._adapter.name(),
+                adapter_default_model=getattr(self._adapter, "default_model", None),


🎯 Functional Correctness | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
ast-grep outline src/bernstein/adapters/caching_adapter.py
rg -nP 'default_model|__getattr__' src/bernstein/adapters/caching_adapter.py

Repository: sipyourdrink-ltd/bernstein

Length of output: 438

🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo '== caching_adapter outline =='
ast-grep outline src/bernstein/adapters/caching_adapter.py
echo
echo '== caching_adapter relevant lines =='
sed -n '1,220p' src/bernstein/adapters/caching_adapter.py
echo
echo '== spawner_core around the referenced lines =='
sed -n '1790,1825p' src/bernstein/core/agents/spawner_core.py
echo
echo '== search for default_model handling around adapters =='
rg -n 'default_model|__getattr__|CachingAdapter' src/bernstein/adapters src/bernstein/core/agents/spawner_core.py

Repository: sipyourdrink-ltd/bernstein

Length of output: 11721

CachingAdapter needs to expose default_model
At src/bernstein/core/agents/spawner_core.py:1810-1811, getattr(self._adapter, "default_model", None) stays None when caching is enabled, because src/bernstein/adapters/caching_adapter.py only forwards name(), is_alive(), kill(), and detect_tier() and keeps the real adapter on self._inner. That leaves _coerce_model_for_non_claude_adapter without the adapter fallback model. Read it from self._inner or proxy default_model on CachingAdapter.
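A toy reproduction of the wrapper problem, assuming the wrapper holds the real adapter on `self._inner` as described. All three classes are hypothetical stand-ins and share nothing with the project's `CachingAdapter` beyond the attribute names mentioned in this comment.

```python
class FakeCodexAdapter:
    """Stand-in for an adapter that declares a default model."""

    default_model = "gpt-5.4"

    def name(self) -> str:
        return "codex"


class NonForwardingWrapper:
    """Forwards name() only, so default_model is invisible from outside."""

    def __init__(self, inner: object) -> None:
        self._inner = inner

    def name(self) -> str:
        return self._inner.name()


class ForwardingWrapper(NonForwardingWrapper):
    """Same wrapper, with default_model proxied from the inner adapter."""

    @property
    def default_model(self) -> "str | None":
        return getattr(self._inner, "default_model", None)


inner = FakeCodexAdapter()
# Mirrors the spawner's lookup: getattr(adapter, "default_model", None).
hidden = getattr(NonForwardingWrapper(inner), "default_model", None)
visible = getattr(ForwardingWrapper(inner), "default_model", None)
```

A read-only property keeps the proxy in step with the inner adapter without copying the value at construction time.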

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/unit/test_cli_demo.py`:
- Around line 53-57: The assertions in the demo CLI test are too broad and can
pass even when the “Bugs fixed” summary row is incorrect. Update the checks
around the rendered output in test_cli_demo to assert the full row pattern
directly from the relevant formatter output, using the existing buf.getvalue()
content and the summary row text produced by the CLI, so the test verifies the
exact “1 / 3” style row instead of matching isolated substrings.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b3e4cf9d-3b67-4cb6-b208-768d41683887

📥 Commits

Reviewing files that changed from the base of the PR and between b507986 and bed1d64.

📒 Files selected for processing (7)

src/bernstein/adapters/codex.py
src/bernstein/cli/run_confirm.py
src/bernstein/core/agents/spawner_core.py
src/bernstein/core/agents/spawner_warm_pool.py
tests/unit/test_adapter_codex.py
tests/unit/test_cli_demo.py
tests/unit/test_model_coercion.py

coderabbitai · 2026-06-25T09:45:35Z

+    out = buf.getvalue()
+    # 1 done out of 3 total, rendered without raising.
+    assert "1" in out
+    assert "/ 3" in out
+    assert "$0.5000" in out


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Strengthen summary assertions to avoid false positives.

Lines 55-56 can pass even if the “Bugs fixed” row is wrong. Assert the row pattern directly.

Suggested diff

+import re
@@
-    assert "1" in out
-    assert "/ 3" in out
+    assert re.search(r"Bugs fixed\s+1\s*/\s*3", out)
     assert "$0.5000" in out

Suggested change

-    out = buf.getvalue()
-    # 1 done out of 3 total, rendered without raising.
-    assert "1" in out
-    assert "/ 3" in out
-    assert "$0.5000" in out
+import re
+
+...
+
+    out = buf.getvalue()
+    # 1 done out of 3 total, rendered without raising.
+    assert re.search(r"Bugs fixed\s+1\s*/\s*3", out)
+    assert "$0.5000" in out

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/test_cli_demo.py` around lines 53 - 57, The assertions in the demo CLI test are too broad and can pass even when the “Bugs fixed” summary row is incorrect. Update the checks around the rendered output in test_cli_demo to assert the full row pattern directly from the relevant formatter output, using the existing buf.getvalue() content and the summary row text produced by the CLI, so the test verifies the exact “1 / 3” style row instead of matching isolated substrings.
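
To illustrate the false positive this comment describes, here is a small sketch comparing the substring checks with the row-level regex from the suggestion. The two sample outputs are made up for the example (the real CLI rendering may differ, for instance if the table uses border characters), and `loose` and `strict` are hypothetical helper names:

```python
import re

# Row pattern taken from the review suggestion.
ROW = re.compile(r"Bugs fixed\s+1\s*/\s*3")

# Hypothetical renderings: one correct, one with a wrong "Bugs fixed" row.
good = "Bugs fixed    1 / 3\nCost    $0.5000\n"
bad = "Bugs fixed    0 / 3\nTasks    1\nCost    $0.5000\n"


def loose(out: str) -> bool:
    # Original style: isolated substrings anywhere in the output.
    return "1" in out and "/ 3" in out


def strict(out: str) -> bool:
    # Suggested style: the whole row must match.
    return ROW.search(out) is not None


results = {
    "loose_good": loose(good),
    "loose_bad": loose(bad),
    "strict_good": strict(good),
    "strict_bad": strict(bad),
}
```

In this sketch the loose check accepts the wrong output because a stray `1` appears on another row, while the regex only accepts the output whose "Bugs fixed" row reads `1 / 3`.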

github-actions Bot added the core, cli, tests, and adapters labels Jun 25, 2026

github-actions Bot added the size/m label Jun 25, 2026

sourcery-ai Bot reviewed Jun 25, 2026

chernistry force-pushed the fix/2075-codex-oauth-routing branch from b507986 to bed1d64 Compare June 25, 2026 09:38

coderabbitai Bot reviewed Jun 25, 2026

chernistry merged commit 3c6f8d8 into main Jun 25, 2026
96 of 97 checks passed

chernistry deleted the fix/2075-codex-oauth-routing branch June 25, 2026 09:50

This was referenced Jun 25, 2026

chore(release): 2.7.1 #2090

Closed

chore(release): 2.8.0 #2095

Merged

chore(release): 2.8.1 #2097

Merged

		adapter_name=self._adapter.name(),
		adapter_default_model=getattr(self._adapter, "default_model", None),
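
For context, a rough sketch of how these two arguments could drive the coercion described in the PR summary: an unpinned Claude tier name is swapped for the adapter's default only when the adapter is not Claude. The function name `coerce_model`, the `pinned` flag, the `"claude"` adapter name, and the placeholder model string are assumptions for illustration, not the project's actual `_coerce_model_for_non_claude_adapter`:

```python
from typing import Optional

# Tier names the PR summary says the selector emits.
CLAUDE_TIERS = frozenset({"opus", "sonnet", "haiku"})


def coerce_model(
    model: str,
    adapter_name: str,
    adapter_default_model: Optional[str],
    pinned: bool = False,
) -> str:
    """Return the model to run, under the assumptions stated above."""
    if adapter_name == "claude":
        # Claude path is left untouched.
        return model
    if pinned or model not in CLAUDE_TIERS:
        # Explicit pins and non-tier names pass through unchanged.
        return model
    # Without a known default there is nothing to substitute.
    return adapter_default_model or model


coerced = coerce_model("opus", "codex", "codex-default-model")
unchanged = coerce_model("opus", "codex", None)
```

The `unchanged` case shows why the wrapper issue matters: if `default_model` resolves to `None`, a sketch like this has nothing to substitute and the tier name is passed along as-is.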

github-actions Bot commented Jun 25, 2026

Sonar insights (advisory, no merge-block)

coderabbitai Bot commented Jun 25, 2026 • edited

❌ Failed checks (1 warning)

github-actions Bot commented Jun 25, 2026 • edited

Review-bot acknowledgement summary

github-actions Bot commented Jun 25, 2026

sonar -- OK (project bernstein)

code-scanning -- FAIL (39 open alert(s))
