Skip to content

fix(code-review): resolve base helper from skill directory#783

Open
davidalee wants to merge 5 commits intoEveryInc:mainfrom
davidalee:fix/code-review-skill-dir-resolve-base
Open

fix(code-review): resolve base helper from skill directory#783
davidalee wants to merge 5 commits intoEveryInc:mainfrom
davidalee:fix/code-review-skill-dir-resolve-base

Conversation

@davidalee
Copy link
Copy Markdown
Contributor

Summary

ce-code-review is a skill, but its branch-mode and standalone-mode bash blocks invoked the merge-base helper as bash scripts/resolve-base.sh. The Bash tool runs from the user's project CWD — not the skill directory — so that path resolves against the reviewed repo, not the bundled skill. Two concrete failure modes:

  1. Correctness (loud-fail): in any reviewed repo without a scripts/resolve-base.sh, bash exits non-zero, the existing || branch prints ERROR: resolve-base.sh failed, and the skill aborts. Branch and standalone modes are effectively unusable on the common case — users have to fall back to base:<ref> or PR mode.
  2. Security (silent-corruption): if the reviewed repo does ship its own scripts/resolve-base.sh, the reviewer executes that repo-controlled script with no error and computes a plausible-looking but completely attacker-controlled merge-base. Low-likelihood, high-impact for a tool whose job is summarizing untrusted code.

This PR resolves the helper from ${CLAUDE_SKILL_DIR} instead, with an in-CWD-rejection check and an explicit fail-closed branch when the variable is unavailable or hostile.

Why this went unnoticed

  • PR mode dominates real usage. /ce-code-review <NNN> resolves the base from gh pr view and never touches resolve-base.sh, so most invocations don't hit the broken path.
  • The error read like a config problem. ERROR: resolve-base.sh failed looks like "set up your repo" rather than "the skill is misrouting." Users likely worked around it with base:<ref> and moved on.
  • The contract test only matched output strings, not that the path resolves through the harness rather than the CWD — so CI was happy. The updated test in this PR pins the probe shape and forbids the bare relative-path form.

What changed

  • Branch and standalone modes now resolve resolve-base.sh from ${CLAUDE_SKILL_DIR}/scripts/resolve-base.sh. AGENTS.md documents ${CLAUDE_SKILL_DIR} as the canonical primitive for skill-bundled scripts and asserts it resolves correctly across claude --plugin-dir and standard marketplace-cached installs.
  • The candidate is realpath'd and rejected if it resolves to the reviewed repo's CWD or any path inside it — so even a misconfigured or hostile harness that points ${CLAUDE_SKILL_DIR} at a directory inside the reviewed repo cannot be used to execute a repo-controlled script.
  • If the variable is unexposed or resolves inside the reviewed repo, the skill prints a precise error and stops, telling the user to re-run with base:<ref> or use a harness that exposes the skill directory.
  • The pre-existing fail-closed posture is preserved: under any failure (missing variable, missing script, script-emitted ERROR:), the reviewer refuses to fall back to git diff HEAD, which would only show uncommitted changes and silently miss every committed change on the branch.
  • No bare relative-path fallback. Adding scripts/resolve-base.sh as a last-resort would re-introduce the original CWD-resolution bug and the script-shadowing security path, so the contract test now asserts that fallback is absent (expect(content).not.toMatch(/bash\s+scripts\/resolve-base\.sh/)).

Cross-platform behavior

This skill is converted for non-Claude targets (Codex, Cursor, Gemini, etc.) and the ${CLAUDE_SKILL_DIR} variable remains literal in the converted output. On those platforms the probe finds no script and the skill fails closed with a clear instruction to pass base:<ref> — strictly safer than executing a CWD-relative script. PR mode (<NNN>) is unaffected and continues to resolve the base from PR metadata via gh.

Empirical evidence

Reproduced in a throwaway repo with a decoy scripts/resolve-base.sh that prints BASE:DECOYWASEXECUTED and exits 0 — i.e., a worst-case reviewed repo carrying its own helper at the path the old code probed. Same CWD, same decoy, both invocations:

== OLD (bash scripts/resolve-base.sh) ==
$ bash scripts/resolve-base.sh
BASE:DECOYWASEXECUTED
exit=0

== NEW (probe block, with CLAUDE_SKILL_DIR set) ==
$ # probe resolves RESOLVE_SCRIPT to the bundled helper, ignores the decoy
exit=0
BASE=a9bafc43416f078e9841994b6112509d5049266b   # real merge-base from the bundled helper

The OLD path silently executed the reviewed repo's script and returned its forged base. The NEW path resolved through ${CLAUDE_SKILL_DIR}, ignored the decoy, and returned the real git merge-base sha. The loud-fail case (no decoy present in the reviewed repo) produces bash: scripts/resolve-base.sh: No such file or directory (exit 127) under OLD and the documented fail-closed error under NEW with no harness variable set.

Verification

  • bun test tests/review-skill-contract.test.ts — 26 pass, including the updated fails closed when merge-base is unresolved contract (pins the new probe shape, asserts absence of the relative-path fallback) and a new resolve-base probe resolves CLAUDE_SKILL_DIR and rejects repo-controlled paths test that exercises the probe end-to-end against synthetic harnesses (unset, empty, happy-path with CWD decoy present, and hostile harness pointing into the reviewed-repo CWD).
  • bun test — full suite (1240 tests) passes.
  • bun run release:validate — clean.

Test plan

  • Branch mode on a feature branch with ${CLAUDE_SKILL_DIR} set: helper resolves, diff is correct against merge-base.
  • Standalone mode on the current branch: same as above.
  • Reviewed repo containing a decoy scripts/resolve-base.sh: decoy is not executed; bundled helper runs.
  • Hostile harness with ${CLAUDE_SKILL_DIR} pointing inside the reviewed repo: rejected, fail-closed; decoy not executed.
  • Harness with no ${CLAUDE_SKILL_DIR} exposed: skill prints the documented error and stops; no git diff HEAD fallback.
  • PR mode (<NNN>): unchanged — still resolves base from PR metadata.

davidalee added 4 commits May 5, 2026 19:06
Try ${CLAUDE_SKILL_DIR} first, then ${CLAUDE_PLUGIN_ROOT}/skills/ce-code-review,
before failing closed. Removes the implicit dependency on a single env var and
keeps the security stance: no bare relative-path fallback that could pick up a
scripts/resolve-base.sh from the reviewed repo's CWD.
Defense-in-depth for the harness-variable probe: even when CLAUDE_SKILL_DIR
or CLAUDE_PLUGIN_ROOT is set, refuse to use a base that canonicalizes to or
under the current working directory. Closes a residual channel where a
hostile repo with direnv-style auto-loading could export
CLAUDE_SKILL_DIR=$PWD/.evil and resurrect the same arbitrary-code-execution
path the previous fix closed.

Also fix a latent bug in the previous fail-closed message: the literal
backticks in `base:<ref>` would have been command-substituted by bash. Use
single-quoted printf instead.

Add an execution-level contract test that extracts the probe snippet from
SKILL.md and runs it across env-var combinations (both unset, each set
alone, both set, skill dir inside cwd, empty-string skill dir, skill dir
without script). The previous shape-matching test would have passed
silently if probe order were swapped, the empty-string guard removed, or
the bash invocation unquoted; this one would not.

Tighten the prose around the threat: it was arbitrary code execution at
review time, not just a corrupted merge-base.
…SKILL_DIR

AGENTS.md documents ${CLAUDE_SKILL_DIR} as the canonical primitive for
skill-bundled scripts and asserts it resolves correctly across both
`claude --plugin-dir` and standard marketplace-cached installs. The earlier
${CLAUDE_PLUGIN_ROOT} fallback was defensive belt-and-suspenders with no
concrete evidence of a real harness path where SKILL_DIR is missing but
PLUGIN_ROOT is exposed -- only hypothetical robustness.

Removing the second probe shrinks both the SKILL.md probe blocks (~13 lines
each -> ~9 lines) and the contract test surface, while preserving every
load-bearing security property: the in-CWD rejection still backstops a
hostile harness pointing SKILL_DIR inside the reviewed repo, and the
fail-closed posture still refuses any `git diff HEAD` fallback.

Contract test now exercises four scenarios end-to-end: unset, empty,
happy-path resolution despite a CWD decoy, and hostile harness pointing
inside CWD. All four assert no fall-through to the reviewed-repo script.
@davidalee davidalee changed the title Fix/code review skill dir resolve base fix(code-review): resolve base helper from skill directory May 6, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 147cbd084a

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +301 to +304
real_cwd=$(pwd -P)
case "$real_base" in
""|"$real_cwd"|"$real_cwd"/*) ;;
*) RESOLVE_SCRIPT="$real_base/scripts/resolve-base.sh" ;;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Reject helper paths under repo root, not just current directory

This guard only checks whether real_base is equal to or nested under pwd -P, which is the caller’s current directory, not the reviewed repository root. If the review is launched from a subdirectory (common in monorepos), a harness value like <repo>/.evil is still repo-controlled but bypasses the "$real_cwd"/* pattern and is accepted, so bash "$RESOLVE_SCRIPT" can still execute attacker-controlled code from the reviewed repo. Compare against the repo top-level path (for example via git rev-parse --show-toplevel) or another trusted boundary instead of CWD.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — valid. Fixed in 76d6f92: the guard now compares real_base against git rev-parse --show-toplevel (with pwd -P fallback when the CWD is not in a git repo), so a launch from a monorepo subdirectory no longer leaves repo-controlled siblings like <repo-root>/.evil accepted. Added a contract test that git inits a fake monorepo, points ${CLAUDE_SKILL_DIR} at <root>/.evil from a subdir CWD, and asserts the probe fails closed and the decoy never executes.

The CWD-rejection guard for `${CLAUDE_SKILL_DIR}/scripts/resolve-base.sh`
compared `real_base` against `pwd -P`. In a monorepo where the reviewer is
launched from a subdirectory, a harness value like `<repo-root>/.evil`
sits outside the CWD subdir but is still repo-controlled and would be
accepted, allowing arbitrary code execution from the reviewed repo.

Compare against the repo top-level (`git rev-parse --show-toplevel`)
instead, falling back to `pwd -P` only when not inside a git repo. Add a
contract test that exercises the monorepo-subdirectory attack and pins
the new probe shape.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant