fix(acp): require tokens key before selecting chatgpt auth method#3628
Conversation
Python API breakage checks — ✅ PASSEDResult: ✅ PASSED |
REST API breakage checks (OpenAPI) — ✅ PASSEDResult: ✅ PASSED |
Coverage Report •
|
||||||||||||||||||||
When codex-acp runs in apikey mode it rewrites $CODEX_HOME/auth.json with
{"auth_mode": "apikey", "OPENAI_API_KEY": "..."}. On the next launch
_select_auth_method saw the file and returned 'chatgpt' — but the payload
is in apikey format, so codex hung forever waiting for browser-based OAuth.
Gate chatgpt selection on the presence of the 'tokens' key (the ChatGPT
subscription token blob) rather than mere file existence.
Co-authored-by: openhands <openhands@all-hands.dev>
42e8d04 to
4e17202
Compare
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
Code Review: fix(acp): require tokens key before selecting chatgpt auth method
Taste Rating: 🟢 Good taste
Clean, well-scoped fix that solves a real bug without over-engineering.
Summary
This PR fixes a bug (issue #3627) where codex-acp would incorrectly select chatgpt auth method after detecting the presence of auth.json, even when the file was written in apikey format by codex itself. This caused the agent to hang indefinitely waiting for browser-based OAuth.
Analysis
[openhands-sdk/openhands/sdk/agent/acp_agent.py]
The change is minimal and focused:
-
_codex_auth_file_is_chatgpt()(lines 287-308) - New helper that validates the auth file contains the ChatGPT subscription marker (tokenskey) before claiming it's usable forchatgptauth. The docstring clearly explains the root cause and whytokensis the distinguishing key. -
_select_auth_method()(line 332) - One-line change to use the stricter validation instead of file presence check.
[tests/sdk/agent/test_acp_agent.py]
- Added
_CHATGPT_AUTH_JSONconstant with proper fixture data containingtokenskey - Updated three existing tests to use the proper chatgpt-format payload instead of
{} - Added two regression tests:
test_apikey_format_auth_file_falls_back_to_api_key- reproduces the exact bugtest_malformed_auth_file_falls_back_to_api_key- defensive coverage for unreadable/non-JSON files
Verification
- The fix targets the exact failure mode described in the issue
- The existing tests that were silently passing with
{}are now fixed to use proper fixtures - Two new tests explicitly cover the bug scenario and defensive edge cases
- The production cloud path is unaffected (canvas always sends
CODEX_AUTH_JSONwith proper format)
[RISK ASSESSMENT]
- Overall PR
⚠️ Risk Assessment: 🟢 LOW - The change is contained to a single file pair (implementation + tests)
- No API changes or breaking changes
- Tests are real behavioral tests, not just mock assertions
- Well-documented with clear references to the original issue
VERDICT: ✅ Worth merging
KEY INSIGHT: The original bug had a subtle race condition where a valid file format (apikey) was being misidentified due to only checking file presence. The fix correctly validates the actual content marker (tokens key) rather than just existence.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
The ACP auth selector now falls back to API-key auth for Codex-written apikey/malformed auth.json files while preserving ChatGPT-token selection.
Does this PR achieve its stated goal?
Yes. I reproduced the PR's stated failure mode against the exact PR base commit (cabe776b): with CODEX_HOME/auth.json containing Codex's apikey rewrite, the SDK selected chatgpt. Re-running the same SDK call at PR head (4e17202a) selected openai-api-key for apikey-format and malformed files, while a real ChatGPT token blob still selected chatgpt.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed and installed the uv environment. |
| CI Status | PR Description Check), 2 in progress, 17 skipped. |
| Functional Verification | ✅ Before/after SDK execution confirms the auth-method selection behavior changed as intended. |
Functional Verification
Test 1: Codex apikey rewrite should not force ChatGPT auth after restart
The QA script imported the SDK auth selector, advertised chatgpt and openai-api-key auth methods, then created temporary CODEX_HOME/auth.json files for three realistic cases: Codex's apikey rewrite, malformed JSON, and a ChatGPT token blob.
Step 1 — Reproduce baseline without the fix:
Ran git checkout cabe776b404a729770b903923c8c2b5d935d2597 && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/acp_auth_qa.py:
commit: cabe776b
codex apikey rewrite: chatgpt
malformed auth file: chatgpt
chatgpt token blob: chatgpt
baseline exit: 0
This confirms the bug on the PR base: any present auth.json, including Codex's apikey-format file, caused chatgpt selection.
Step 2 — Apply the PR's changes:
Checked out PR head 4e17202a131eac18a31c5ad4e89feb73d49c5574.
Step 3 — Re-run with the fix in place:
Ran git checkout 4e17202a131eac18a31c5ad4e89feb73d49c5574 && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/acp_auth_qa.py:
commit: 4e17202a
codex apikey rewrite: openai-api-key
malformed auth file: openai-api-key
chatgpt token blob: chatgpt
pr exit: 0
This shows the fix works for the reported restart scenario: apikey-format and malformed files no longer trigger ChatGPT OAuth, and the intended ChatGPT-token path still works.
Unable to Verify
I did not run a live codex-acp conversation against real OpenAI/Codex credentials; no such credentials or browser OAuth session were available in this QA environment. The exercised behavior is the SDK auth-selection gate that determines whether codex-acp would enter the problematic ChatGPT OAuth path.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
HUMAN:
When acp_isolate_data_dir=True and codex-acp is driven with API-key auth
(OPENAI_API_KEY only, no CODEX_AUTH_JSON), the first session/new
succeeds, and during that session codex rewrites $CODEX_HOME/auth.json
with {"auth_mode": "apikey", "OPENAI_API_KEY": "..."}.
On the next launch (pod recycle / agent-server restart) _select_auth_method
in acp_agent.py saw the file at _codex_auth_file(env) and returned
"chatgpt", because the check only verified file presence:
if "chatgpt" in method_ids and _codex_auth_file(env).is_file():
return "chatgpt"
AGENT:
Why
When
acp_isolate_data_dir=Trueand codex-acp is driven with API-key auth(
OPENAI_API_KEYonly, noCODEX_AUTH_JSON), the firstsession/newsucceeds, and during that session codex rewrites
$CODEX_HOME/auth.jsonwith
{"auth_mode": "apikey", "OPENAI_API_KEY": "..."}.On the next launch (pod recycle / agent-server restart)
_select_auth_methodin
acp_agent.pysaw the file at_codex_auth_file(env)and returned"chatgpt", because the check only verified file presence:codex-acp then read the apikey-format file as if it were a ChatGPT token
blob and hung indefinitely waiting for browser-based OAuth.
The production cloud path is unaffected — canvas always sends
CODEX_AUTH_JSONso the SDK materialises a proper chatgpt-formatauth.jsonbefore codex starts and codex never overwrites it. The bugonly manifested in test harnesses / non-standard deployments running with
API-key auth and
acp_isolate_data_dir=True(discovered during the ACPcloud pivot local validation, canvas#1290).
Summary
_codex_auth_file_is_chatgpt(env)which parsesauth.jsonandrequires a top-level
"tokens"key — the marker that distinguishesthe ChatGPT subscription token blob from the apikey-mode file codex
writes for itself.
chatgptbranch in_select_auth_methodon this strictercheck so an apikey-format file falls through to the
openai-api-key/codex-api-keyfallback instead of triggeringthe OAuth hang.
{}asauth.jsonto use thereal chatgpt-format payload, and add two new tests:
test_apikey_format_auth_file_falls_back_to_api_key(reproduces thebug from fix(acp): _select_auth_method picks chatgpt when $CODEX_HOME/auth.json is in apikey format, causing auth hang on resume #3627) and
test_malformed_auth_file_falls_back_to_api_key(defensive coverage for unreadable / non-JSON files).
Issue Number
Fixes #3627
How to Test
Result:
20 passed, including the two new tests(
test_apikey_format_auth_file_falls_back_to_api_key,test_malformed_auth_file_falls_back_to_api_key) that fail onmainand pass with this change.
End-to-end repro (matches the issue's reproduction sequence):
ACPAgentconversation withacp_isolate_data_dir=True,OPENAI_API_KEYin secrets (noCODEX_AUTH_JSON).openai-api-key. codex-acp writes$CODEX_HOME/auth.jsonwith{"auth_mode": "apikey", ...}._select_auth_methodreturns
"openai-api-key"(file does not contain"tokens")instead of
"chatgpt", and the conversation proceeds withouthanging on browser-based OAuth.
Video/Screenshots
N/A — auth-path selection change, exercised through the new unit tests.
Type
Notes
CODEX_AUTH_JSONmaterialises a proper chatgpt-format file with
"tokens").dicts containing"tokens"; malformed ornon-dict JSON also falls back to the API-key path, which is the safe
default.
This PR was created by an AI agent (OpenHands) on behalf of the user.
@simonrosenberg can click here to continue refining the PR
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.13-nodejs22-slimgolang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:4e17202-pythonRun
All tags pushed for this build
About Multi-Architecture Support
4e17202-python) is a multi-arch manifest supporting both amd64 and arm644e17202-python-amd64) are also available if needed