v1.58.0.0 feat: /pr-prep — pre-PR upstream duplicate audit + /ship Step 1.5 gate #1696

Open
BenjaminDSmithy wants to merge 13 commits into garrytan:main from BenjaminDSmithy:feat/pr-prep-skill

@BenjaminDSmithy BenjaminDSmithy commented May 25, 2026

Summary

New /pr-prep skill: a pre-PR upstream duplicate audit. It walks git log base..HEAD, derives search keywords per commit from the subject + changed file
paths, queries the upstream repo's open issues and PRs via gh, and scores each
commit against open work (EXACT_DUP / OVERLAP / SIBLING / CLEAN). EXACT_DUP
refuses to proceed. /ship runs it as a Step 1.5 gate before any merge, test,
or version bump.
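
To make the keyword-derivation step concrete, here is a minimal sketch of turning one commit's subject and changed paths into search terms. The function names, the stop-word list, and the three-character cutoff are illustrative assumptions, not values taken from the skill.

```typescript
// Hypothetical sketch: derive search keywords for one commit from its
// subject line and changed file paths. STOP_WORDS and the length cutoff
// are assumptions made for illustration.
const STOP_WORDS = new Set([
  "the", "and", "for", "with", "from", "into",
  "add", "fix", "feat", "chore", "docs", "test",
]);

// Lowercase, split on anything that is not a letter or digit,
// then drop short tokens and stop words.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length >= 3 && !STOP_WORDS.has(token));
}

// Merge subject tokens and path tokens, keeping first-seen order
// and removing duplicates.
function deriveKeywords(subject: string, changedPaths: string[]): string[] {
  const tokens: string[] = tokenize(subject);
  for (const path of changedPaths) {
    tokens.push(...tokenize(path));
  }
  return Array.from(new Set(tokens));
}

const keywords = deriveKeywords("feat(ship): add pr-prep gate", ["ship/SKILL.md.tmpl"]);
console.log(keywords.join(" "));
```

The resulting terms would then feed the gh searches; how the skill actually phrases those queries is not shown here.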

Why it helps you, the maintainer: it kills duplicate community PRs before
they're filed — the triage you do most. (Dogfood result below.)

What it does

  • Step 1.4 — reads the upstream CONTRIBUTING.md: pre-push gates, test
    layout, branch/commit conventions, banned patterns. Used inline when writing
    the PR body.
  • Step 4 — title-token + file-overlap Jaccard scoring, state-weighted. The
    bucketing is implemented in bin/gstack-pr-prep-score (pure, unit-tested) and
    referenced by the skill as the canonical scorer.
  • Step 4.4 — optional codex second opinion on CLEAN commits (different model
    family for independent signal; soft-skips if codex is absent).
  • Step 4.5 — per-commit pre-push gate + trigger-path + tests-added annotation.
  • Step 4.6 — commit-message style conformance against the upstream repo's
    convention (sampled from its CONTRIBUTING.md + git log), not the
    contributor's personal house style.
  • /ship Step 1.5 — invokes pr-prep with GSTACK_FROM_SHIP=1; EXACT_DUP aborts
    ship with resolution paths; skips cleanly on solo repos and --skip-pr-prep.
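
As a rough illustration of the Step 4 scoring, the sketch below computes a Jaccard similarity and maps a state-weighted score to a bucket. The equal title/file weighting, the open/closed state weights, and every threshold are assumptions chosen for the example; the pinned values live in bin/gstack-pr-prep-score and its test file.

```typescript
type Bucket = "EXACT_DUP" | "OVERLAP" | "SIBLING" | "CLEAN";
type UpstreamState = "open" | "closed"; // assumed set of states

// Jaccard similarity: |A ∩ B| / |A ∪ B|, defined here as 0 when both are empty.
function jaccard(a: string[], b: string[]): number {
  const setA = new Set(a);
  const setB = new Set(b);
  let shared = 0;
  setA.forEach((item) => {
    if (setB.has(item)) shared += 1;
  });
  const union = setA.size + setB.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Hypothetical bucketing: weights and thresholds are placeholders,
// not the scorer's real numbers.
function scoreCommit(titleSim: number, fileSim: number, state: UpstreamState): Bucket {
  const stateWeight = state === "open" ? 1.0 : 0.5; // assumed state weighting
  const combined = stateWeight * (0.5 * titleSim + 0.5 * fileSim);
  if (combined >= 0.8) return "EXACT_DUP";
  if (combined >= 0.5) return "OVERLAP";
  if (combined >= 0.2) return "SIBLING";
  return "CLEAN";
}

const titleSim = jaccard(["ship", "prep", "gate"], ["ship", "prep", "gate"]);
const fileSim = jaccard(["ship/SKILL.md.tmpl"], ["ship/SKILL.md.tmpl"]);
console.log(scoreCommit(titleSim, fileSim, "open"));
```

Checking thresholds from highest to lowest gives the EXACT_DUP > OVERLAP > SIBLING > CLEAN precedence without extra branching.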

Dogfooded on itself

Ran /pr-prep on this branch against garrytan/gstack@main: CLEAN, 13/13
commits, no EXACT_DUP, no OVERLAP, in ~30-60s of gh queries. No existing
pr-prep / duplicate-audit skill upstream; the open ship PRs all touch unrelated
parts of the flow. Full audit is in a comment on this PR.

Tests + release

  • test/pr-prep-score.test.ts — 13 deterministic cases pinning the scorer's
    bucket thresholds (gate-tier, free).
  • Registered in AGENTS.md, docs/skills.md, scripts/proactive-suggestions.json,
    and test/skill-coverage-matrix.ts.
  • Ship golden baselines refreshed (Claude / Codex / Factory); ALLOWED_SUBSTEPS
    updated for the new Step 1.5 sub-step.
  • v1.58.0.0 (MINOR) + CHANGELOG entry; VERSION / package.json synced.
  • bun test is green except environment-specific cases (local Swift toolchain,
    gbrain config, local git state) that also fail on a clean main.
  • Note: as a cross-repo fork PR, the eval/E2E CI jobs can't receive base-repo
    secrets, so those will error regardless of code; happy to push the branch to a
    base-repo branch and re-target if you'd prefer green eval CI.

Motivation

Real case (2026-05-26): a contributor branch on garrytan/gbrain had 8 commits
ready for upstream PRs. Without pr-prep, 4 of 4 unverified commits would have been
duplicates of already-open work.

Each would have cost a triage round and a close. pr-prep catches all of them in
one gh query pass before anything is filed.

@BenjaminDSmithy BenjaminDSmithy force-pushed the feat/pr-prep-skill branch 3 times, most recently from 8d43b46 to 2a5fa08 on June 10, 2026 11:57
@BenjaminDSmithy

Author

Updated this branch.

Rebased onto main (v1.57.8.0) — clean, no conflicts.

Commit messages now follow the repo convention across every commit: type(scope): subject, plain-prose body, and the Co-Authored-By trailer this repo uses. Also dropped a "+ tests" subject that didn't match its diff.

New — Step 4.6 in /pr-prep: commit-message style conformance. The audit now checks each commit against the upstream repo's own convention (CONTRIBUTING.md commit rules → de-facto shape sampled from git log upstream/<base> → conventional-commit fallback) and flags personal-template bodies, missing or extra trailers, and subjects that promise content they don't contain. Soft warning, never blocks — style isn't a duplicate, but it's the cheapest reviewer-goodwill win in the audit.

Test + registration fixes so the branch is green:

  • Added 1.5 to ALLOWED_SUBSTEPS for the new Step 1.5 ship gate (skill-validation step-numbering).
  • Registered /pr-prep in AGENTS.md and docs/skills.md (doc-inventory cross-check).
  • Added /pr-prep to test/skill-coverage-matrix.ts with a gate-tier test.
  • Regenerated pr-prep/SKILL.md and scripts/proactive-suggestions.json against the current resolvers.
  • Refreshed the three ship golden baselines (test/fixtures/golden/{claude,codex,factory}-ship-SKILL.md) for the Step 1.5 block.

bun test passes on all PR-relevant checks. The only remaining failures are environment-specific (local Swift toolchain, gbrain config, local git state) and reproduce on a clean main.

@BenjaminDSmithy

Author

Dogfooded /pr-prep on this PR

Ran the skill's own audit against garrytan/gstack@main (the thing it's built to do — so it seemed only fair to point it at itself).

Result: CLEAN — 11/11 commits, no EXACT_DUP, no OVERLAP.

  • No existing /pr-prep or pre-PR duplicate-audit skill upstream (closest hit, #1343 /plan-status, is an unrelated plan-progress checker).
  • The /ship Step 1.5 gate doesn't collide with the open ship PRs (#1716 SHIP-RECEIPT, #632 ship log, #684 resume-safe reruns, #1862 package-lock sync, #842 stack-aware tests, #338 review-satisfies-ship, #1944 REST updates) — all touch different parts of the ship flow; none add a pre-PR audit step.

So this contribution isn't duplicating open work, and the skill demonstrably runs end-to-end on a real branch.

@github-actions github-actions Bot changed the title "feat: /pr-prep skill — pre-PR upstream duplicate audit + /ship Step 1.5 gate + codex review" → "v1.58.0.0 feat: /pr-prep skill — pre-PR upstream duplicate audit + /ship Step 1.5 gate + codex review" on Jun 10, 2026
@BenjaminDSmithy BenjaminDSmithy changed the title "v1.58.0.0 feat: /pr-prep skill — pre-PR upstream duplicate audit + /ship Step 1.5 gate + codex review" → "v1.58.0.0 feat: /pr-prep — pre-PR upstream duplicate audit + /ship Step 1.5 gate" on Jun 10, 2026
BenjaminDSmithy and others added 13 commits on June 11, 2026 00:44

Walks `git log base..HEAD`, derives search keywords per commit from
subject + changed file paths, queries upstream issues + PRs via `gh`,
scores each commit against upstream collisions (EXACT_DUP / OVERLAP /
SIBLING / CLEAN) on a title-token + file-overlap Jaccard, and refuses
to proceed when EXACT_DUP found. Designed to slot into `/ship` as a
Step 0 hook (env `GSTACK_FROM_SHIP=1` switches to JSON output + skips
interactive prompts).

Motivating case (real, 2026-05-26): a contributor branch had 8 commits
ready for upstream PRs; 4 of 4 unverified commits would have duplicated
already-open upstream issues. pr-prep catches all in ~30-60s of `gh`
queries, before any noise PR or reviewer triage round.

v0.1.0 ships inline bash in SKILL.md (reviewable in one file). Out of
scope: diff-content similarity, cross-repo audit, LLM-judged semantic
dup detection, auto-comment on upstream PRs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

`/ship` now invokes `/pr-prep --base $BASE_BRANCH --json` with
`GSTACK_FROM_SHIP=1` before any of: merge base branch, run tests,
version bump, push. If pr-prep returns EXACT_DUP (exit 1), ship
aborts with a pinpoint message naming the upstream PR + resolution
paths (close mine / cherry-pick unique parts / coordinate + retry
with `--skip-pr-prep`).

Skip conditions: no upstream remote (solo-repo case), `--skip-pr-prep`
flag, or pr-prep skill not installed (older gstack — stderr warn +
continue). SIBLING / OVERLAP / CLEAN buckets do not block. The JSON
report is written to `/tmp/ship-pr-prep.json` so PR body assembly can
render upstream context as a collapsed section.

Fails fast before any test run wastes time on a branch that duplicates
already-open upstream work.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

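
The gate logic in the commit above can be read as a small decision function. This is a sketch only: the `GateInput` shape and the function name are hypothetical, and treating any exit code other than 1 as non-blocking is an assumption (the commit message only specifies exit 1 for EXACT_DUP).

```typescript
type GateAction = "skip" | "abort" | "continue";

// Hypothetical input shape; field names are illustrative.
interface GateInput {
  hasUpstreamRemote: boolean; // false in the solo-repo case
  skipFlag: boolean;          // true when --skip-pr-prep was passed
  skillInstalled: boolean;    // false on older installs without pr-prep
  exitCode: number | null;    // pr-prep exit code, null if it never ran
}

function shipGate(input: GateInput): { action: GateAction; reason: string } {
  // Skip conditions are checked first, in the order the commit lists them.
  if (!input.hasUpstreamRemote) {
    return { action: "skip", reason: "no upstream remote (solo repo)" };
  }
  if (input.skipFlag) {
    return { action: "skip", reason: "--skip-pr-prep passed" };
  }
  if (!input.skillInstalled) {
    return { action: "skip", reason: "pr-prep not installed; warn on stderr and continue" };
  }
  // Exit 1 means EXACT_DUP: abort before merge, tests, version bump, or push.
  if (input.exitCode === 1) {
    return { action: "abort", reason: "EXACT_DUP reported by pr-prep" };
  }
  // Assumption: SIBLING / OVERLAP / CLEAN all surface as a non-blocking exit.
  return { action: "continue", reason: "no blocking bucket" };
}

console.log(shipGate({ hasUpstreamRemote: true, skipFlag: false, skillInstalled: true, exitCode: 1 }).action);
```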
… per commit

Step 1.4: fetch `CONTRIBUTING.md` (case-insensitive) via gh api from
the upstream repo, cache to /tmp, extract pre-push commands + test
layout conventions + branch naming rules + banned patterns. The agent
uses these inline when writing PR bodies.

Step 4.5: annotate each CLEAN/OVERLAP/SIBLING commit row with the
required pre-push gate (e.g. `bun run verify`), whether changed files
trigger special test paths (eval-replay for retrieval), and whether
the commit added tests. Soft warning on missing tests when it is
unclear whether tests are required: don't block, let the human decide.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…mmits

Before filing each CLEAN commit (not bucketed EXACT_DUP / OVERLAP /
SIBLING), invoke `codex review` for an independent second opinion.
Codex CLI uses a different model family (OpenAI vs Claude), so the
signal is genuinely independent and catches structural bugs the author
missed during write-up.

Optional: if `codex` is not on PATH, emit a soft warning and continue;
never block on tool availability. Severity escalation: P0/P1 findings
bump the commit from CLEAN to OVERLAP (don't file until addressed); P2
stays CLEAN at author discretion (fix-before-file or note in PR body).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

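
The severity escalation described above reduces to a single rule. The sketch below assumes the review findings have already been parsed into severity labels; the function name and the input shape are hypothetical, and nothing here models how `codex review` output is actually parsed.

```typescript
type Bucket = "EXACT_DUP" | "OVERLAP" | "SIBLING" | "CLEAN";
type Severity = "P0" | "P1" | "P2";

// Hypothetical helper: only CLEAN commits are reviewed, so any other
// bucket passes through unchanged.
function applyReviewFindings(bucket: Bucket, findings: Severity[]): Bucket {
  if (bucket !== "CLEAN") return bucket;
  const blocking = findings.some((s) => s === "P0" || s === "P1");
  // P0/P1 bump CLEAN to OVERLAP (don't file until addressed);
  // P2-only or no findings leave the commit CLEAN.
  return blocking ? "OVERLAP" : "CLEAN";
}

console.log(applyReviewFindings("CLEAN", ["P2", "P1"]));
```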
The new `## Step 1.5` heading in `ship/SKILL.md.tmpl` (the pr-prep
gate) tripped the step-numbering checks in `skill-validation`, which
only permit a closed set of fractional sub-steps. Add `1.5` to
`ALLOWED_SUBSTEPS` so the gate heading is recognised as intentional.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

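
For readers unfamiliar with the step-numbering check, a minimal sketch of the idea follows. The heading regex, the function name, and the contents of the allowed set (only `1.5` is confirmed by this PR) are assumptions; the real `skill-validation` logic may differ.

```typescript
// Hypothetical stand-in for ALLOWED_SUBSTEPS; only "1.5" comes from this PR.
const allowedSubstepsSketch = new Set<string>(["1.5"]);

// Return every fractional "## Step N.M" heading that is not in the allowed set.
function disallowedSubsteps(markdown: string, allowed: Set<string>): string[] {
  const found: string[] = [];
  const heading = /^## Step (\d+\.\d+)\b/gm;
  let match: RegExpExecArray | null;
  while ((match = heading.exec(markdown)) !== null) {
    if (!allowed.has(match[1])) found.push(match[1]);
  }
  return found;
}

const template = ["## Step 1", "## Step 1.5", "## Step 2.7", ""].join("\n");
console.log(disallowedSubsteps(template, allowedSubstepsSketch));
```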
The doc-inventory cross-check requires every skill directory to appear
in both `AGENTS.md` and `docs/skills.md`. Add a `/pr-prep` row to each
so the new skill is documented and the check passes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The committed `pr-prep/SKILL.md` was generated against an older
preamble and had drifted from `bun run gen:skill-docs` output.
Regenerate it so it picks up the current shared resolvers: the
`SESSION_KIND` and `GSTACK_PLAN_MODE` preamble lines, the `/spec`
routing entry, the AskUserQuestion failure fallback, the 5+-option
split rule, and the Boil-the-Ocean rename.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Adding the skill left its entry in `scripts/proactive-suggestions.json`
unwritten. Regenerate the registry so `/pr-prep` is wired into
proactive routing.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Add Step 4.6: a per-commit conformance check that holds the branch's
commit messages to the UPSTREAM repo's convention, not the
contributor's own house style. A fork PR whose commits read in a
different voice than the project reads as a drive-by and burns reviewer
goodwill before the diff is read.

The authoritative style is resolved once (Step 1.4): the upstream
CONTRIBUTING.md commit rules if present, else the de-facto shape sampled
from `git log upstream/$BASE_BRANCH --no-merges`, else the
conventional-commits baseline. Step 4.6 then flags subject-shape drift,
a personal body template (emoji/bullets) where upstream uses prose, a
subject that promises content it lacks (e.g. "+ tests" with no tests),
and a missing-or-extra trailer relative to upstream (e.g. a
`Co-Authored-By:` line upstream carries on every commit). Surfaced as a
soft warning in the report, never a block — style is not a duplicate,
but it is the cheapest goodwill win in the audit and far cheaper to fix
before the PR exists.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

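
A sketch of the Step 4.6 fallback chain and three of its flags, under stated assumptions: the type names, the warning strings, the test-path heuristic, and the conventional-commit regex are all illustrative, and the check against a sampled de-facto shape is reduced here to the conventional-commits baseline.

```typescript
type StyleSource = "contributing" | "git-log-sample" | "conventional-commits";

// Resolve the authoritative style once: CONTRIBUTING.md rules, else the
// shape sampled from upstream git log, else the conventional-commits baseline.
function resolveStyleSource(contributingRules: string | null, sampledSubjects: string[]): StyleSource {
  if (contributingRules !== null && contributingRules.trim() !== "") return "contributing";
  if (sampledSubjects.length > 0) return "git-log-sample";
  return "conventional-commits";
}

// Baseline subject shape: type(scope): subject, with the scope optional.
const CONVENTIONAL_SUBJECT = /^[a-z]+(\([^)]+\))?!?: \S.*$/;

interface CommitInfo {
  subject: string;
  body: string;
  changedPaths: string[];
}

// Soft warnings only; nothing here blocks.
function styleWarnings(commit: CommitInfo, requiredTrailer: string | null): string[] {
  const warnings: string[] = [];
  if (!CONVENTIONAL_SUBJECT.test(commit.subject)) {
    warnings.push("subject-shape drift");
  }
  const promisesTests = /\+\s*tests?\b/i.test(commit.subject);
  const touchesTests = commit.changedPaths.some(
    (p) => /(^|\/)test\//.test(p) || /\.test\.[a-z]+$/.test(p),
  );
  if (promisesTests && !touchesTests) {
    warnings.push("subject promises tests the diff lacks");
  }
  if (requiredTrailer !== null && !commit.body.includes(requiredTrailer)) {
    warnings.push("missing trailer: " + requiredTrailer);
  }
  return warnings;
}

console.log(resolveStyleSource(null, ["fix(ship): handle empty base"]));
```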
The golden-file regression check pins the generated per-host ship skill
(Claude, Codex, Factory) against committed snapshots. Adding the Step
1.5 pr-prep gate to `ship/SKILL.md.tmpl` changed all three generated
outputs, so the snapshots no longer matched. Regenerate them via
`gen:skill-docs --host all` and re-capture; the diff is exactly the
Step 1.5 block, identical across hosts.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The coverage gate (skill-coverage-matrix + skill-coverage-floor)
requires every skill on disk to have a registry entry with at least one
gate-tier test. The new pr-prep skill had none, failing both checks.
Register it with the structural floor test as its gate-tier minimum,
matching how other audit/report skills (qa-only, investigate) are
covered until a behavioral E2E is written.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

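
The coverage rule above (every skill on disk needs a registry entry with at least one gate-tier test) can be sketched as a filter. The `CoverageEntry` shape, the `"other"` tier placeholder, and the function name are assumptions; the real matrix in test/skill-coverage-matrix.ts may be structured differently.

```typescript
type Tier = "gate" | "other"; // "other" stands in for any non-gate tier (assumed)

// Hypothetical registry entry shape.
interface CoverageEntry {
  skill: string;
  tests: { file: string; tier: Tier }[];
}

// Skills that are on disk but have no entry, or no gate-tier test in their entry.
function skillsMissingGateTest(skillsOnDisk: string[], registry: CoverageEntry[]): string[] {
  const bySkill = new Map<string, CoverageEntry>();
  for (const entry of registry) bySkill.set(entry.skill, entry);
  return skillsOnDisk.filter((skill) => {
    const entry = bySkill.get(skill);
    return entry === undefined || !entry.tests.some((t) => t.tier === "gate");
  });
}

// Before registration, pr-prep is on disk but absent from the registry.
const registryBefore: CoverageEntry[] = [
  { skill: "ship", tests: [{ file: "test/example-ship.test.ts", tier: "gate" }] }, // file name is illustrative
];
console.log(skillsMissingGateTest(["ship", "pr-prep"], registryBefore));
```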
The Step 4 bucketing (title/file Jaccard, state weighting, EXACT_DUP /
OVERLAP / SIBLING / CLEAN precedence) lived only as inline bash in the
skill, so it had no behavioral coverage. Extract it into a pure,
deterministic CLI, `bin/gstack-pr-prep-score`, and pin every bucket
threshold in `test/pr-prep-score.test.ts` (13 cases, free, gate-tier).
The skill's Step 4 now points at the scorer as the canonical
implementation rather than re-deriving the thresholds inline, and the
coverage matrix gates pr-prep on the new behavioral test. This is the
v0.2.0 extraction the skill flagged, scoped to the scoring core.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Bump VERSION to 1.58.0.0 (MINOR: new skill) and add the release entry
for the /pr-prep pre-PR upstream duplicate audit and its /ship Step 1.5
gate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>