
feat(self-review): supplement/multi-file hygiene detector + promised-stat gate (detectors 30→31)#187

Merged
Yoojin-nam merged 3 commits into main from feat/supplement-hygiene-detector
Jun 23, 2026


@Yoojin-nam

Contributor

What

Promotes the review-harvest cluster "supplement leaks are technical-check-fatal but the gates only lint manuscript.md." Adds one detector, extends another, and documents the cross-artifact re-run.

New detector — check_supplement_hygiene.py (#9 / #12 / #6 / #8)

Lints a list of rendered reader-facing artifacts, passed via --supplement (repeatable): the supplement, separately-built tables, and captions. All verdicts are Major:

| Verdict | Catches |
| --- | --- |
| SUPP_INTERNAL_LABEL | § / §L internal section / SAP labels |
| SUPP_PLACEHOLDER | Table SX, S-N, [Authors], figure-path glob figS_*.{png,pdf}, build-dir path 1_Search/ |
| SUPP_BUILD_MARKER | [VERIFY…], TODO, FIXME, Remove this line if… |
| SUPP_RESPONSE_FRAMING | Per Section Editor #2, Response to Reviewers, Reviewer 2 Comment 4 |
| SUPP_PLANNING_RESIDUE | Designed by:, Expected … Numbers, to be executed |
| SUPP_XREF_UNRESOLVED | (with --manuscript) a body Supplementary Table/Figure N callout with no matching supplement section (renumber drift or a silently-skipped section) |
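
For orientation, the pattern-based verdicts listed above can be pictured as a line-by-line regex scan. This is a minimal sketch, not the code in check_supplement_hygiene.py: the `PATTERNS` table and the `lint_text` helper are hypothetical, and the regexes are only inferred from a few of the example strings in this description.

```python
import re

# Hypothetical patterns inferred from the examples in this PR description.
# The real detector's regexes are not shown here and may differ.
PATTERNS = {
    "SUPP_PLACEHOLDER": re.compile(r"\bTable SX\b|\[Authors\]"),
    "SUPP_BUILD_MARKER": re.compile(
        r"\[VERIFY[^\]]*\]|\bTODO\b|\bFIXME\b|Remove this line if"
    ),
    "SUPP_RESPONSE_FRAMING": re.compile(
        r"Response to Reviewers|Reviewer \d+ Comment \d+|Per Section Editor #\d+"
    ),
    "SUPP_PLANNING_RESIDUE": re.compile(r"Designed by:|\bto be executed\b"),
}


def lint_text(text):
    """Return (verdict, line_number, line) for every pattern hit in text."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for verdict, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((verdict, lineno, line.strip()))
    return findings


if __name__ == "__main__":
    sample = "Supplementary Table SX. Baseline characteristics\nTODO: confirm n\n"
    for verdict, lineno, line in lint_text(sample):
        print(f"Major {verdict} line {lineno}: {line}")
```

Running the same scan over each file passed with --supplement would give one finding list per artifact, which matches the "list of artifacts" framing, though how the real script aggregates findings is not stated here.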

Extend — check_artifact_coverage.py (#11)

New PROMISED_STAT_NO_VALUE + a --supplement corpus arg. Conservatively fires when a bound/ceiling/de-confounded statistic (AUC, c-statistic, sensitivity…) is promised with a reporting verb but never given a numeric value anywhere in the manuscript or supplement — the "described but never quantified" reviewer catch. A reported value suppresses it (tested).
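
The promised-stat rule can be sketched as two passes over the combined manuscript and supplement text: one to find a statistic mentioned with a reporting verb, one to find a numeric value attached to that statistic. The sketch below is an assumption-laden illustration, not the logic in check_artifact_coverage.py: the term list, verb list, value pattern, and the `promised_stats_without_value` function are all hypothetical.

```python
import re

# Hypothetical vocabularies; the real detector's lists are not shown in this PR.
STAT_TERMS = ["AUC", "c-statistic", "sensitivity"]
REPORTING_VERB = re.compile(
    r"\b(?:report(?:ed|s)?|present(?:ed|s)?|provide(?:d|s)?|show(?:n|s)?)\b",
    re.IGNORECASE,
)
NUMBER = r"\d+(?:\.\d+)?"


def promised_stats_without_value(manuscript, supplements=()):
    """Return stat terms that are promised somewhere but never given a number."""
    corpus = "\n".join([manuscript, *supplements])
    # Split on sentence-ending punctuation followed by whitespace, so a
    # decimal such as 0.81 is not split in the middle.
    sentences = re.split(r"(?<=[.!?])\s+", corpus)
    unquantified = []
    for term in STAT_TERMS:
        term_re = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
        # A value counts only when it directly follows the term, optionally
        # through a short connector ("AUC of 0.81", "AUC = 0.81").
        value_re = re.compile(
            rf"\b{re.escape(term)}\b\s*(?:of|was|is|=|:)?\s*{NUMBER}",
            re.IGNORECASE,
        )
        promised = any(term_re.search(s) and REPORTING_VERB.search(s) for s in sentences)
        valued = any(value_re.search(s) for s in sentences)
        if promised and not valued:
            unquantified.append(term)
    return unquantified


if __name__ == "__main__":
    body = "The de-confounded AUC is reported in the supplement."
    print(promised_stats_without_value(body, ["Supplementary Table 1. Cohort."]))
    print(promised_stats_without_value(body, ["The de-confounded AUC was 0.81."]))
```

Requiring the number to sit right next to the term is one way to stay conservative: a table number in "reported in Supplementary Table 3" does not count as the promised value, while a value anywhere in the manuscript or a supplement suppresses the verdict.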

Cross-artifact drift (#25)

Already covered by sync-submission/check_cross_artifact_stale.py (main↔supplement↔figure-source labeled-value drift). The gap was wiring — documented in Phase 2.5f to re-run it after any audit/reframe, not only once.

Catalog (30 → 31)

check_supplement_hygiene registered under reporting_compliance; detectors_catalog.json regenerated (reporting 7→8), catalog_counts.json + MEDSCI_AUDIT.md bumped. validate_catalog_consistency.py green. No skill-count change → skills/marketplace catalogs untouched. Additive → no release.

Tests / CI

  • test_supplement_hygiene.sh: dirty → all 6 verdicts; clean → 0 (no FP); xref resolves Table 2, flags Figure 9; usage error on no --supplement.
  • test_artifact_coverage.sh extended: promised-stat positive fires, value-present negative suppresses.
  • Both wired into validate.yml. validate_skills.sh, gen_detectors_catalog_json.py --check, validate_catalog_consistency.py, check_locale_inventory.py, skills/marketplace/detectors catalog tests all pass.
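
The xref case in the first bullet (Table 2 resolves, Figure 9 is flagged) can be illustrated with a small set difference between callouts in the body and labels defined in the supplement. This is a sketch under stated assumptions, not the detector's code: `unresolved_callouts` is a hypothetical helper, and it assumes a supplement section is any line that begins (optionally after markdown `#` markers) with "Supplementary Table N" or "Supplementary Figure N".

```python
import re

# Hypothetical patterns; the real detector's matching rules are not shown here.
CALLOUT = re.compile(r"Supplementary (Table|Figure) (\d+)")
SECTION = re.compile(r"\s*#*\s*Supplementary (Table|Figure) (\d+)")


def unresolved_callouts(manuscript, supplement):
    """Return sorted (kind, number) callouts with no matching supplement section."""
    defined = set()
    for line in supplement.splitlines():
        match = SECTION.match(line)  # anchored at the start of the line
        if match:
            defined.add((match.group(1), match.group(2)))
    called = {(m.group(1), m.group(2)) for m in CALLOUT.finditer(manuscript)}
    return sorted(called - defined)


if __name__ == "__main__":
    body = "Details are in Supplementary Table 2 and Supplementary Figure 9."
    supp = "## Supplementary Table 2. Model coefficients\nCoefficient values follow."
    for kind, number in unresolved_callouts(body, supp):
        print(f"Major SUPP_XREF_UNRESOLVED: Supplementary {kind} {number}")
```

Matching section labels only at the start of a line keeps an in-text mention inside the supplement from counting as a definition, which is the situation renumber drift tends to create.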

🤖 Generated with Claude Code

Yoojin-nam and others added 3 commits June 24, 2026 08:17
…stat gate (detectors 30→31)

Promotes the review-harvest "supplement leaks are technical-check-fatal but never
linted" cluster. Existing gates lint manuscript.md only; the rendered supplement,
a separately-built tables file, and caption files reach reviewers unlinted.

New detector — skills/self-review/scripts/check_supplement_hygiene.py:
lints a LIST of rendered reader-facing artifacts (--supplement, repeatable) for
  SUPP_INTERNAL_LABEL    § / §L internal section or SAP labels
  SUPP_PLACEHOLDER       Table SX / S-N / [Authors] / figure-glob / build-dir path
  SUPP_BUILD_MARKER      [VERIFY] / TODO / FIXME / "Remove this line if"
  SUPP_RESPONSE_FRAMING  "Per Section Editor #2" / "Response to Reviewers"
  SUPP_PLANNING_RESIDUE  "Designed by:" / "Expected … Numbers" / "to be executed"
and, with --manuscript, SUPP_XREF_UNRESOLVED (a body "Supplementary Table/Figure N"
callout with no matching supplement section — renumber drift / silently-skipped
section). All Major.

Extend — check_artifact_coverage.py: new PROMISED_STAT_NO_VALUE verdict + a
--supplement corpus arg. Fires (conservatively) when a bound/ceiling/de-confounded
statistic (AUC, c-statistic, sensitivity…) is promised with a reporting verb but
never given a numeric value anywhere in the manuscript or supplement.

Cross-artifact (#25): the main↔supplement↔figure-source drift class is already
covered by sync-submission's check_cross_artifact_stale.py; documented in Phase
2.5f to re-run it after any audit/reframe (the gap was wiring, not a missing tool).

Catalog: register check_supplement_hygiene under reporting_compliance, regenerate
metadata/detectors_catalog.json (30→31, reporting 7→8), bump
metadata/catalog_counts.json + MEDSCI_AUDIT.md; validate_catalog_consistency passes.

Tests: new test_supplement_hygiene.sh (dirty→all 6 verdicts, clean→0, xref resolve,
usage error); test_artifact_coverage.sh extended with promised-stat positive/negative
(value-present suppresses). Both wired into validate.yml. No skill count change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…detector

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@Yoojin-nam Yoojin-nam force-pushed the feat/supplement-hygiene-detector branch from 05e15b7 to 08cecc1 June 23, 2026 23:30
@Yoojin-nam Yoojin-nam merged commit f02d54c into main Jun 23, 2026
3 checks passed
@Yoojin-nam Yoojin-nam deleted the feat/supplement-hygiene-detector branch June 23, 2026 23:49