v1.58.0.0 feat: /pr-prep — pre-PR upstream duplicate audit + /ship Step 1.5 gate#1696
v1.58.0.0 feat: /pr-prep — pre-PR upstream duplicate audit + /ship Step 1.5 gate#1696BenjaminDSmithy wants to merge 13 commits into
Conversation
8d43b46 to
2a5fa08
Compare
|
Updated this branch. Rebased onto Commit messages now follow the repo convention across every commit: New — Step 4.6 in Test + registration fixes so the branch is green:
|
Dogfooded
|
Walks `git log base..HEAD`, derives search keywords per commit from subject + changed file paths, queries upstream issues + PRs via `gh`, scores each commit against upstream collisions (EXACT_DUP / OVERLAP / SIBLING / CLEAN) on a title-token + file-overlap Jaccard, and refuses to proceed when EXACT_DUP found. Designed to slot into `/ship` as a Step 0 hook (env `GSTACK_FROM_SHIP=1` switches to JSON output + skips interactive prompts). Motivating case (real, 2026-05-26): a contributor branch had 8 commits ready for upstream PRs; 4 of 4 unverified commits would have duplicated already-open upstream issues. pr-prep catches all in ~30-60s of `gh` queries, before any noise PR or reviewer triage round. v0.1.0 ships inline bash in SKILL.md (reviewable in one file). Out of scope: diff-content similarity, cross-repo audit, LLM-judged semantic dup detection, auto-comment on upstream PRs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`/ship` now invokes `/pr-prep --base $BASE_BRANCH --json` with `GSTACK_FROM_SHIP=1` before any of: merge base branch, run tests, version bump, push. If pr-prep returns EXACT_DUP (exit 1), ship aborts with a pinpoint message naming the upstream PR + resolution paths (close mine / cherry-pick unique parts / coordinate + retry with `--skip-pr-prep`). Skip conditions: no upstream remote (solo-repo case), `--skip-pr-prep` flag, or pr-prep skill not installed (older gstack — stderr warn + continue). SIBLING / OVERLAP / CLEAN buckets do not block. The JSON report is written to `/tmp/ship-pr-prep.json` so PR body assembly can render upstream context as a collapsed section. Fails fast before any test run wastes time on a branch that duplicates already-open upstream work. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… per commit Step 1.4: fetch `CONTRIBUTING.md` (case-insensitive) via gh api from the upstream repo, cache to /tmp, extract pre-push commands + test layout conventions + branch naming rules + banned patterns. The agent uses these inline when writing PR bodies. Step 4.5: annotate each CLEAN/OVERLAP/SIBLING commit row with the required pre-push gate (e.g. `bun run verify`), whether changed files trigger special test paths (eval-replay for retrieval), and whether the commit added tests. Soft warning on missing-tests when not-required is unclear — don't block, let the human decide. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…mmits Before filing each CLEAN commit (not bucketed EXACT_DUP / OVERLAP / SIBLING), invoke `codex review` for an independent second opinion. Codex CLI uses a different model family (OpenAI vs Claude), so the signal is genuinely independent and catches structural bugs the author missed during write-up. Optional: if `codex` is not on PATH, emit a soft warning and continue; never block on tool availability. Severity escalation: P0/P1 findings bump the commit from CLEAN to OVERLAP (don't file until addressed); P2 stays CLEAN at author discretion (fix-before-file or note in PR body). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The new `## Step 1.5` heading in `ship/SKILL.md.tmpl` (the pr-prep gate) tripped the step-numbering checks in `skill-validation`, which only permit a closed set of fractional sub-steps. Add `1.5` to `ALLOWED_SUBSTEPS` so the gate heading is recognised as intentional. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The doc-inventory cross-check requires every skill directory to appear in both `AGENTS.md` and `docs/skills.md`. Add a `/pr-prep` row to each so the new skill is documented and the check passes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The committed `pr-prep/SKILL.md` was generated against an older preamble and had drifted from `bun run gen:skill-docs` output. Regenerate it so it picks up the current shared resolvers: the `SESSION_KIND` and `GSTACK_PLAN_MODE` preamble lines, the `/spec` routing entry, the AskUserQuestion failure fallback, the 5+-option split rule, and the Boil-the-Ocean rename. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adding the skill left its entry in `scripts/proactive-suggestions.json` unwritten. Regenerate the registry so `/pr-prep` is wired into proactive routing. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add Step 4.6: a per-commit conformance check that holds the branch's commit messages to the UPSTREAM repo's convention, not the contributor's own house style. A fork PR whose commits read in a different voice than the project reads as a drive-by and burns reviewer goodwill before the diff is read. The authoritative style is resolved once (Step 1.4): the upstream CONTRIBUTING.md commit rules if present, else the de-facto shape sampled from `git log upstream/$BASE_BRANCH --no-merges`, else the conventional-commits baseline. Step 4.6 then flags subject-shape drift, a personal body template (emoji/bullets) where upstream uses prose, a subject that promises content it lacks (e.g. "+ tests" with no tests), and a missing-or-extra trailer relative to upstream (e.g. a `Co-Authored-By:` line upstream carries on every commit). Surfaced as a soft warning in the report, never a block — style is not a duplicate, but it is the cheapest goodwill win in the audit and far cheaper to fix before the PR exists. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The golden-file regression check pins the generated per-host ship skill (Claude, Codex, Factory) against committed snapshots. Adding the Step 1.5 pr-prep gate to `ship/SKILL.md.tmpl` changed all three generated outputs, so the snapshots no longer matched. Regenerate them via `gen:skill-docs --host all` and re-capture; the diff is exactly the Step 1.5 block, identical across hosts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The coverage gate (skill-coverage-matrix + skill-coverage-floor) requires every skill on disk to have a registry entry with at least one gate-tier test. The new pr-prep skill had none, failing both checks. Register it with the structural floor test as its gate-tier minimum, matching how other audit/report skills (qa-only, investigate) are covered until a behavioral E2E is written. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Step 4 bucketing (title/file Jaccard, state weighting, EXACT_DUP / OVERLAP / SIBLING / CLEAN precedence) lived only as inline bash in the skill, so it had no behavioral coverage. Extract it into a pure, deterministic CLI, `bin/gstack-pr-prep-score`, and pin every bucket threshold in `test/pr-prep-score.test.ts` (13 cases, free, gate-tier). The skill's Step 4 now points at the scorer as the canonical implementation rather than re-deriving the thresholds inline, and the coverage matrix gates pr-prep on the new behavioral test. This is the v0.2.0 extraction the skill flagged, scoped to the scoring core. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Bump VERSION to 1.58.0.0 (MINOR: new skill) and add the release entry for the /pr-prep pre-PR upstream duplicate audit and its /ship Step 1.5 gate. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
908509a to
f3a7e53
Compare
Summary
New
/pr-prepskill: a pre-PR upstream duplicate audit. It walksgit log base..HEAD, derives search keywords per commit from the subject + changed filepaths, queries the upstream repo's open issues and PRs via
gh, and scores eachcommit against open work (EXACT_DUP / OVERLAP / SIBLING / CLEAN). EXACT_DUP
refuses to proceed.
/shipruns it as a Step 1.5 gate before any merge, test,or version bump.
Why it helps you, the maintainer: it kills duplicate community PRs before
they're filed — the triage you do most. (Dogfood result below.)
What it does
CONTRIBUTING.md: pre-push gates, testlayout, branch/commit conventions, banned patterns. Used inline when writing
the PR body.
bucketing is implemented in
bin/gstack-pr-prep-score(pure, unit-tested) andreferenced by the skill as the canonical scorer.
codexsecond opinion on CLEAN commits (different modelfamily for independent signal; soft-skips if
codexis absent).convention (sampled from its
CONTRIBUTING.md+git log), not thecontributor's personal house style.
GSTACK_FROM_SHIP=1; EXACT_DUP abortsship with resolution paths; skips cleanly on solo repos and
--skip-pr-prep.Dogfooded on itself
Ran
/pr-prepon this branch againstgarrytan/gstack@main: CLEAN — 13/13commits, no EXACT_DUP, no OVERLAP, in ~30-60s of
ghqueries. No existingpr-prep / duplicate-audit skill upstream; the open
shipPRs all touch unrelatedparts of the flow. Full audit is in a comment on this PR.
Tests + release
test/pr-prep-score.test.ts— 13 deterministic cases pinning the scorer'sbucket thresholds (gate-tier, free).
AGENTS.md,docs/skills.md,scripts/proactive-suggestions.json,and
test/skill-coverage-matrix.ts.ALLOWED_SUBSTEPSupdated for the new Step 1.5 sub-step.
VERSION/package.jsonsynced.bun testis green except environment-specific cases (local Swift toolchain,gbrain config, local git state) that also fail on a clean
main.secrets, so those will error regardless of code; happy to push the branch to a
base-repo branch and re-target if you'd prefer green eval CI.
Motivation
Real case (2026-05-26): a contributor branch on
garrytan/gbrainhad 8 commitsready for upstream PRs. Without pr-prep, 4 of 8 unverified commits would have been
duplicates of already-open work:
e96332c5(reindex CLI_ONLY one-char fix) → setup: add GSTACK_SKIP_PLAYWRIGHT to skip Chromium install #913, open 14 days, same fix74819cec(sourceId fallback) → fix: make bin root detection CDPATH-safe #836, open787da2af+829099f9(synopsis env-override) → setup --host cursor errors out despite README + hosts/cursor.ts declaring full Cursor support #1358, open, same patterne0133d8a(LM Studio recipe + Ollama dims_options) → fix: normalize line endings in template reads for cross-platform generator #1051 + Multiple Claude Code security hook triggers in/codexand/autoplanskill templates #1329, crowded2db6b6d4(--lock-durationflag) → plan-eng-review: add throughput planning readout #1014 issue + Fix gstack build when Conductor creates a repo with unborn HEAD #1177 PREach would have cost a triage round and a close. pr-prep catches all of them in
one
ghquery pass before anything is filed.