fix(code-review): resolve base helper from skill directory by davidalee · Pull Request #783 · EveryInc/compound-engineering-plugin

davidalee · 2026-05-06T01:34:08Z

Summary

ce-code-review is a skill, but its branch-mode and standalone-mode bash blocks invoked the merge-base helper as bash scripts/resolve-base.sh. The Bash tool runs from the user's project CWD — not the skill directory — so that path resolves against the reviewed repo, not the bundled skill. Two concrete failure modes:

Correctness (loud-fail): in any reviewed repo without a scripts/resolve-base.sh, bash exits non-zero, the existing || branch prints ERROR: resolve-base.sh failed, and the skill aborts. Branch and standalone modes are effectively unusable on the common case — users have to fall back to base:<ref> or PR mode.
Security (silent-corruption): if the reviewed repo does ship its own scripts/resolve-base.sh, the reviewer executes that repo-controlled script with no error and computes a plausible-looking but completely attacker-controlled merge-base. Low-likelihood, high-impact for a tool whose job is summarizing untrusted code.

This PR resolves the helper from ${CLAUDE_SKILL_DIR} instead, with an in-CWD-rejection check and an explicit fail-closed branch when the variable is unavailable or hostile.

Why this went unnoticed

PR mode dominates real usage. /ce-code-review <NNN> resolves the base from gh pr view and never touches resolve-base.sh, so most invocations don't hit the broken path.
The error read like a config problem. ERROR: resolve-base.sh failed looks like "set up your repo" rather than "the skill is misrouting." Users likely worked around it with base:<ref> and moved on.
The contract test only matched output strings, not that the path resolves through the harness rather than the CWD — so CI was happy. The updated test in this PR pins the probe shape and forbids the bare relative-path form.

What changed

Branch and standalone modes now resolve resolve-base.sh from ${CLAUDE_SKILL_DIR}/scripts/resolve-base.sh. AGENTS.md documents ${CLAUDE_SKILL_DIR} as the canonical primitive for skill-bundled scripts and asserts it resolves correctly across claude --plugin-dir and standard marketplace-cached installs.
The candidate is realpath'd and rejected if it resolves to the reviewed repo's CWD or any path inside it — so even a misconfigured or hostile harness that points ${CLAUDE_SKILL_DIR} at a directory inside the reviewed repo cannot be used to execute a repo-controlled script.
If the variable is unexposed or resolves inside the reviewed repo, the skill prints a precise error and stops, telling the user to re-run with base:<ref> or use a harness that exposes the skill directory.
The pre-existing fail-closed posture is preserved: under any failure (missing variable, missing script, script-emitted ERROR:), the reviewer refuses to fall back to git diff HEAD, which would only show uncommitted changes and silently miss every committed change on the branch.
No bare relative-path fallback. Adding scripts/resolve-base.sh as a last-resort would re-introduce the original CWD-resolution bug and the script-shadowing security path, so the contract test now asserts that fallback is absent (expect(content).not.toMatch(/bash\s+scripts\/resolve-base\.sh/)).

Cross-platform behavior

This skill is converted for non-Claude targets (Codex, Cursor, Gemini, etc.) and the ${CLAUDE_SKILL_DIR} variable remains literal in the converted output. On those platforms the probe finds no script and the skill fails closed with a clear instruction to pass base:<ref> — strictly safer than executing a CWD-relative script. PR mode (<NNN>) is unaffected and continues to resolve the base from PR metadata via gh.

Empirical evidence

Reproduced in a throwaway repo with a decoy scripts/resolve-base.sh that prints BASE:DECOYWASEXECUTED and exits 0 — i.e., a worst-case reviewed repo carrying its own helper at the path the old code probed. Same CWD, same decoy, both invocations:

== OLD (bash scripts/resolve-base.sh) ==
$ bash scripts/resolve-base.sh
BASE:DECOYWASEXECUTED
exit=0

== NEW (probe block, with CLAUDE_SKILL_DIR set) ==
$ # probe resolves RESOLVE_SCRIPT to the bundled helper, ignores the decoy
exit=0
BASE=a9bafc43416f078e9841994b6112509d5049266b   # real merge-base from the bundled helper

The OLD path silently executed the reviewed repo's script and returned its forged base. The NEW path resolved through ${CLAUDE_SKILL_DIR}, ignored the decoy, and returned the real git merge-base sha. The loud-fail case (no decoy present in the reviewed repo) produces bash: scripts/resolve-base.sh: No such file or directory (exit 127) under OLD and the documented fail-closed error under NEW with no harness variable set.

Verification

bun test tests/review-skill-contract.test.ts — 26 pass, including the updated fails closed when merge-base is unresolved contract (pins the new probe shape, asserts absence of the relative-path fallback) and a new resolve-base probe resolves CLAUDE_SKILL_DIR and rejects repo-controlled paths test that exercises the probe end-to-end against synthetic harnesses (unset, empty, happy-path with CWD decoy present, and hostile harness pointing into the reviewed-repo CWD).
bun test — full suite (1240 tests) passes.
bun run release:validate — clean.

Test plan

Branch mode on a feature branch with ${CLAUDE_SKILL_DIR} set: helper resolves, diff is correct against merge-base.
Standalone mode on the current branch: same as above.
Reviewed repo containing a decoy scripts/resolve-base.sh: decoy is not executed; bundled helper runs.
Hostile harness with ${CLAUDE_SKILL_DIR} pointing inside the reviewed repo: rejected, fail-closed; decoy not executed.
Harness with no ${CLAUDE_SKILL_DIR} exposed: skill prints the documented error and stops; no git diff HEAD fallback.
PR mode (<NNN>): unchanged — still resolves base from PR metadata.

Try ${CLAUDE_SKILL_DIR} first, then ${CLAUDE_PLUGIN_ROOT}/skills/ce-code-review, before failing closed. Removes the implicit dependency on a single env var and keeps the security stance: no bare relative-path fallback that could pick up a scripts/resolve-base.sh from the reviewed repo's CWD.

Defense-in-depth for the harness-variable probe: even when CLAUDE_SKILL_DIR or CLAUDE_PLUGIN_ROOT is set, refuse to use a base that canonicalizes to or under the current working directory. Closes a residual channel where a hostile repo with direnv-style auto-loading could export CLAUDE_SKILL_DIR=$PWD/.evil and resurrect the same arbitrary-code-execution path the previous fix closed. Also fix a latent bug in the previous fail-closed message: the literal backticks in `base:<ref>` would have been command-substituted by bash. Use single-quoted printf instead. Add an execution-level contract test that extracts the probe snippet from SKILL.md and runs it across env-var combinations (both unset, each set alone, both set, skill dir inside cwd, empty-string skill dir, skill dir without script). The previous shape-matching test would have passed silently if probe order were swapped, the empty-string guard removed, or the bash invocation unquoted; this one would not. Tighten the prose around the threat: it was arbitrary code execution at review time, not just a corrupted merge-base.

…SKILL_DIR AGENTS.md documents ${CLAUDE_SKILL_DIR} as the canonical primitive for skill-bundled scripts and asserts it resolves correctly across both `claude --plugin-dir` and standard marketplace-cached installs. The earlier ${CLAUDE_PLUGIN_ROOT} fallback was defensive belt-and-suspenders with no concrete evidence of a real harness path where SKILL_DIR is missing but PLUGIN_ROOT is exposed -- only hypothetical robustness. Removing the second probe shrinks both the SKILL.md probe blocks (~13 lines each -> ~9 lines) and the contract test surface, while preserving every load-bearing security property: the in-CWD rejection still backstops a hostile harness pointing SKILL_DIR inside the reviewed repo, and the fail-closed posture still refuses any `git diff HEAD` fallback. Contract test now exercises four scenarios end-to-end: unset, empty, happy-path resolution despite a CWD decoy, and hostile harness pointing inside CWD. All four assert no fall-through to the reviewed-repo script.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 147cbd084a

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-05-06T01:37:52Z

+  real_cwd=$(pwd -P)
+  case "$real_base" in
+    ""|"$real_cwd"|"$real_cwd"/*) ;;
+    *) RESOLVE_SCRIPT="$real_base/scripts/resolve-base.sh" ;;


Reject helper paths under repo root, not just current directory

This guard only checks whether real_base is equal to or nested under pwd -P, which is the caller’s current directory, not the reviewed repository root. If the review is launched from a subdirectory (common in monorepos), a harness value like <repo>/.evil is still repo-controlled but bypasses the "$real_cwd"/* pattern and is accepted, so bash "$RESOLVE_SCRIPT" can still execute attacker-controlled code from the reviewed repo. Compare against the repo top-level path (for example via git rev-parse --show-toplevel) or another trusted boundary instead of CWD.

Useful? React with 👍 / 👎.

Good catch — valid. Fixed in 76d6f92: the guard now compares real_base against git rev-parse --show-toplevel (with pwd -P fallback when the CWD is not in a git repo), so a launch from a monorepo subdirectory no longer leaves repo-controlled siblings like <repo-root>/.evil accepted. Added a contract test that git inits a fake monorepo, points ${CLAUDE_SKILL_DIR} at <root>/.evil from a subdir CWD, and asserts the probe fails closed and the decoy never executes.

The CWD-rejection guard for `${CLAUDE_SKILL_DIR}/scripts/resolve-base.sh` compared `real_base` against `pwd -P`. In a monorepo where the reviewer is launched from a subdirectory, a harness value like `<repo-root>/.evil` sits outside the CWD subdir but is still repo-controlled and would be accepted, allowing arbitrary code execution from the reviewed repo. Compare against the repo top-level (`git rev-parse --show-toplevel`) instead, falling back to `pwd -P` only when not inside a git repo. Add a contract test that exercises the monorepo-subdirectory attack and pins the new probe shape.

davidalee added 4 commits May 5, 2026 19:06

fix(code-review): resolve base helper from skill directory

42132e0

davidalee changed the title ~~Fix/code review skill dir resolve base~~ fix(code-review): resolve base helper from skill directory May 6, 2026

chatgpt-codex-connector Bot reviewed May 6, 2026

View reviewed changes

davidalee mentioned this pull request May 6, 2026

fix(code-review): reject resolve-base.sh against repo top-level, not CWD davidalee/compound-engineering-plugin#12

Open

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(code-review): resolve base helper from skill directory#783

fix(code-review): resolve base helper from skill directory#783
davidalee wants to merge 5 commits intoEveryInc:mainfrom
davidalee:fix/code-review-skill-dir-resolve-base

davidalee commented May 6, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 6, 2026

Uh oh!

davidalee May 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

davidalee commented May 6, 2026

Summary

Why this went unnoticed

What changed

Cross-platform behavior

Empirical evidence

Verification

Test plan

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 6, 2026

Choose a reason for hiding this comment

Uh oh!

davidalee May 6, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant