ci(codex): post review as inline diff comments by leehanchung · Pull Request #78 · leehanchung/SMILE-factory

leehanchung · 2026-05-06T04:34:42Z

Summary

Phase 2 of the codex-reviewer parity work. Stacked on #76.

Until now Codex was posting a single review with all findings concatenated into the body. Claude (via the subagent architecture) anchors each finding to the specific diff line in the Files Changed tab. This change brings Codex to format parity:

Prompt now asks Codex for a single JSON object with event, body, and an array of comments (path / line / side / body per finding). Schema is documented inline so the model has no ambiguity.
Submit step parses the JSON with jq, builds the Reviews API payload in jq (so review text never round-trips through shell quoting), and posts via gh api --input <file>. We deliberately do NOT use -f event=... flags because gh api --input routes those into the query string, not the body — codex itself flagged that exact mistake in PR ci: add Codex reviewer and migrate Claude review to subagent architecture #61's review of the slash-command migration.
Defensive cleanup: strip leading/trailing ```json fences before parsing (codex sometimes wraps JSON in a fence despite the explicit instruction); validate the event value and fail loudly if it's outside APPROVE / COMMENT / REQUEST_CHANGES.

Once this lands, Codex reviews show up inline in the Files Changed tab next to the relevant lines, exactly like Claude's.

Test plan

CI on this PR posts a review with at least one inline comment anchored to a specific line in .github/workflows/codex-code-review.yml (or summary-only if Codex thinks the change is clean)
No 422 from the Reviews API (would surface as a workflow failure)

Stacking note

Base branch is ci/codex-app-token (PR #76). Once #76 merges, this PR's base auto-rebases to main and CI re-runs.

🤖 Generated with Claude Code

codex-code-reviewer

The inline review flow is generally well structured, but the submit step still trusts model-produced comment coordinates without validating them before calling the Reviews API, so a single bad line/path can fail the workflow with a 422 despite the prompt warning the model to be precise.

codex-code-reviewer · 2026-05-06T04:36:01Z

+          # text never round-trips through shell quoting. `--input`
+          # below sends the file as the request body; we do NOT pass
+          # `-f event=...` flags because `gh api --input` puts those
+          # in the query string instead of the body (codex itself


This forwards .comments directly into the Reviews API without validating that each entry has a diff path and a valid LEFT/RIGHT line. Since model output is not deterministic, one hallucinated coordinate will still produce the 422 this change is trying to avoid; consider validating comment shape/coordinates or dropping invalid comments before posting.

claude

Review summary

Minor observation: one new code path logs raw model output to the runner log on JSON-parse failure, which creates an exfiltration surface for sensitive on-runner files if Codex is prompt-injected; everything else looks correct.

Deployment

No changes to apps/delulu_discord or apps/delulu_sandbox_modal — no rebuild or redeploy needed.

claude · 2026-05-06T04:40:17Z

+
+          if ! echo "$CLEAN" | jq empty 2>/dev/null; then
+            echo "::error::Codex output is not valid JSON. Raw output:"
+            printf '%s\n' "$RAW"


Security: printf '%s\n' "$RAW" logs the full raw Codex output to the runner log on any JSON-parse failure. This is a new code path with no equivalent in the old code. The runner has $CODEX_HOME/auth.json and ~/.ssh/codex_droplet on disk during the Codex step; a crafted PR body could instruct Codex to embed the contents of those files in its response and then append non-JSON text to force this error path, publishing the secrets to the runner log (readable by any repo contributor). Consider replacing this with a byte-count message pointing readers at the run_codex step output directly:

echo "::error::Codex output is not valid JSON ($(wc -c < "$CODEX_OUTPUT_FILE") bytes); inspect the run_codex step logs."

Two-line conceptual change to the codex review pipeline: 1. Codex prompt now emits a structured JSON object (`event` + `body` + `comments[]`) instead of free-form prose ending in a `VERDICT:` line. Schema spelled out in the prompt so the model can self-check before emitting. 2. Submit step parses that JSON and posts via the GitHub Reviews API with `--input <file>` so the body and comment text never round-trip through shell quoting. Each `comments[]` entry becomes an inline diff comment anchored to a specific path:line, matching how anthropics/claude-code-action surfaces. Empty `comments` array still posts a clean top-level review. Defensive cleanup: codex sometimes wraps the JSON in a ```json fence despite the prompt saying not to. Strip leading/trailing fences before `jq empty` validation so a known model quirk doesn't fail the workflow. Event-value sanity check before posting (`APPROVE` / `REQUEST_CHANGES` / `COMMENT`) so a freelanced value surfaces as a clear error instead of an opaque GitHub 422. This was previously PR #78 with conflicts against #80's auto-resolve work; the rebase combines both — the new submit step keeps `id: submit` so the Resolve step's gate (`steps.submit.conclusion == 'success'`) still fires after inline-comment reviews. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codex-code-reviewer

The inline review flow is close, but the submit step still has two reliability/security edges: it trusts model-supplied inline comment coordinates all the way into the Reviews API, and it logs raw model output on parse failure.

codex-code-reviewer · 2026-05-06T05:37:19Z

+          if ! echo "$CLEAN" | jq empty 2>/dev/null; then
+            echo "::error::Codex output is not valid JSON. Raw output:"
+            printf '%s\n' "$RAW"
+            exit 1


Logging the full raw Codex output on parse failure can expose anything the model was induced to read or include in its response. Prefer logging a size/error summary and avoid echoing model-controlled content into workflow logs.

codex-code-reviewer · 2026-05-06T05:37:19Z

+          echo "$CLEAN" \
+            | jq --arg sha "${{ github.event.pull_request.head.sha }}" \
+                '{commit_id: $sha, event: .event, body: .body, comments: (.comments // [])}' \
+            > "$PAYLOAD"


This forwards .comments without validating path, side, line, or body shape. A single hallucinated coordinate will still make the Reviews API return 422, so this needs schema validation or invalid-comment filtering before posting.

codex-code-reviewer Bot reviewed May 6, 2026

View reviewed changes

claude Bot reviewed May 6, 2026

View reviewed changes

Base automatically changed from ci/codex-app-token to main May 6, 2026 05:11

leehanchung force-pushed the ci/codex-inline-comments branch from 4b870e3 to 00acb18 Compare May 6, 2026 05:34

codex-code-reviewer Bot reviewed May 6, 2026

View reviewed changes

Merge branch 'main' into ci/codex-inline-comments

371bc14

leehanchung merged commit 7749712 into main May 6, 2026
2 of 3 checks passed

leehanchung deleted the ci/codex-inline-comments branch May 6, 2026 05:46

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ci(codex): post review as inline diff comments#78

ci(codex): post review as inline diff comments#78
leehanchung merged 2 commits into
mainfrom
ci/codex-inline-comments

leehanchung commented May 6, 2026

Uh oh!

codex-code-reviewer Bot left a comment

Uh oh!

codex-code-reviewer Bot May 6, 2026

Uh oh!

claude Bot left a comment

Uh oh!

claude Bot May 6, 2026

Uh oh!

codex-code-reviewer Bot left a comment

Uh oh!

codex-code-reviewer Bot May 6, 2026

Uh oh!

codex-code-reviewer Bot May 6, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

leehanchung commented May 6, 2026

Summary

Test plan

Stacking note

Uh oh!

codex-code-reviewer Bot left a comment

Choose a reason for hiding this comment

Uh oh!

codex-code-reviewer Bot May 6, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

Review summary

Deployment

Uh oh!

claude Bot May 6, 2026

Choose a reason for hiding this comment

Uh oh!

codex-code-reviewer Bot left a comment

Choose a reason for hiding this comment

Uh oh!

codex-code-reviewer Bot May 6, 2026

Choose a reason for hiding this comment

Uh oh!

codex-code-reviewer Bot May 6, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant