Commit 3a0f42f

Yoojin-nam and claude authored
feat(self-review): add reviewer-team consistency category K (#32)
Adds scripts/check_reviewer_team_consistency.py and a new category K ("Reviewer-team consistency", SR/MA-only, fabrication-grade) to self-review.

The check detects the conjunction of:
- DUAL claim (Methods or PROSPERO: "two reviewers independently", "dual extractors", etc.), AND
- SINGLE confession (Discussion/Limitations: "single primary reviewer", "20% sample", "deferred to before submission", "due to resource constraints").

Either side alone is fine. Both together is read as fabrication-grade and ends the round. Provides resolution guidance (honest Methods rewrite OR Limitations rewrite).

Synthetic fixture tests cover clean / Methods+limits / PROSPERO+limits / single-only. All 4 pass.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
1 parent a492c69 commit 3a0f42f

3 files changed

Lines changed: 422 additions & 0 deletions

skills/self-review/SKILL.md

Lines changed: 21 additions & 0 deletions
@@ -163,6 +163,26 @@ before submission for item-level assessment."
| Training vs fine-tuning | If pre-trained: was the model fine-tuned on study data? If vendor-provided: any access to training data composition? |
| Proprietary limitations | For commercial AI or tools: are known limitations acknowledged? Can results be independently reproduced? |

#### K. Reviewer-team consistency (SR/MA-only; fabrication-grade)

| Check | What to look for |
|-------|-----------------|
| DUAL vs SINGLE conjunction **[CRITICAL]** | Methods or PROSPERO claims dual independent reviewers AND Discussion/Limitations admits single primary reviewer + 20% sample (or "deferred to before submission")? Mark as **MAJOR**, fabrication-grade. |

Run the deterministic check at Phase 2 entry:

```bash
python "${CLAUDE_SKILL_DIR}/scripts/check_reviewer_team_consistency.py" \
  --manuscript manuscript.md \
  --prospero prospero/record.md \
  --out _audit_self/reviewer_team_consistency.md
```

Exit 1 = MAJOR red flag. Either claim alone is fine; the conjunction is
read by reviewers as fabrication. Resolution path:
1. Honest Methods/PROSPERO update (single-reviewer execution disclosed), OR
2. Limitations confession rewritten if dual review was actually completed.
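
As a minimal sketch, assuming the check is driven from Python rather than the shell, the exit codes listed in the script's docstring (0, 1, 2) could be interpreted as below. The names `describe_exit` and `run_check` are hypothetical helpers for illustration and are not part of this commit.

```python
from __future__ import annotations

import subprocess
import sys

# Exit-code meanings as documented in the checker's docstring.
EXIT_MEANINGS = {
    0: "no conflict",
    1: "MAJOR red flag (DUAL claim and SINGLE confession both present)",
    2: "invocation error",
}


def describe_exit(code: int) -> str:
    """Map the checker's exit code to a human-readable status."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_check(script: str, manuscript: str, prospero: str | None = None) -> int:
    """Hypothetical wrapper: run the checker and return its exit code."""
    cmd = [sys.executable, script, "--manuscript", manuscript]
    if prospero is not None:
        # --prospero is optional in the script; pass it only when available.
        cmd += ["--prospero", prospero]
    # check=False: a non-zero exit is an expected outcome, not an exception.
    return subprocess.run(cmd, check=False).returncode
```

A caller would typically stop the round when `run_check(...)` returns 1 and treat 2 as a setup problem (for example, a missing manuscript path) rather than a finding.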

### Research-Type Adaptation

Not all categories apply equally to every study type. Use this routing table:
@@ -179,6 +199,7 @@ Not all categories apply equally to every study type. Use this routing table:
| H. Circularity | Full | Partial | N/A | N/A | N/A | Partial |
| I. Protocol Heterogeneity | Full | Full | N/A | Per-study | N/A | Full |
| J. Method Transparency | Full | Partial | Partial | N/A | N/A | Partial |
| K. Reviewer-team consistency | N/A | N/A | N/A | Full | N/A | N/A |

*Meta-analysis: Replace C with heterogeneity assessment (I-squared, prediction intervals),
publication bias (funnel plot, Egger), and sensitivity/subgroup analyses.
Lines changed: 304 additions & 0 deletions
@@ -0,0 +1,304 @@
#!/usr/bin/env python3
"""
check_reviewer_team_consistency.py — fabrication-grade self-review check.

Detects manuscripts that simultaneously claim dual independent reviewers
(in Methods or PROSPERO) and confess to single-reviewer execution (in
Discussion or Limitations). Either claim alone is fine; the conjunction is
a fabrication-grade red flag.

Why this check exists
=====================
Cross-project precedent (anonymized): a reporting-quality SR-of-AI-tools
manuscript had:
  - Methods: "Two reviewers independently screened titles and abstracts ..."
  - PROSPERO record: "Two independent reviewers will perform full-text
    screening and data extraction."
  - Discussion §Limitations: "Single primary reviewer; a 20% sample by an
    additional reviewer is deferred to before submission."

The conjunction admits in Limitations what Methods denies in narrative
form. Reviewers and editors who notice this read it as fabrication-grade
(the manuscript misrepresents what was actually done) and reject.

Detection strategy
==================
Section-aware grep for two regex families:
  - DUAL claim: "(independently|dual|two reviewers|both reviewers|two
    independent)" within Methods OR a PROSPERO record file.
  - SINGLE confession: "(single primary reviewer|one additional reviewer|
    20% sample|sample of records|deferred to before submission|due to
    resource constraints|by the first reviewer alone)" within Limitations
    OR Discussion.

Both present → MAJOR self-review red flag.

Usage
=====

    python check_reviewer_team_consistency.py \\
        --manuscript manuscript.md \\
        --prospero prospero/record.md \\
        --out _audit_self/reviewer_team_consistency.md

Output
======
Markdown report at `--out` summarizing matches. Also a JSON sidecar at
`--out` + ".json".

Exit codes
==========
  0  no conflict
  1  MAJOR red flag detected (both DUAL and SINGLE patterns present)
  2  invocation error
"""

from __future__ import annotations

import argparse
import json
import re
import sys
from dataclasses import dataclass, field
from pathlib import Path


DUAL_PATTERNS = [
    (
        re.compile(r"\btwo\s+(?:independent\s+)?reviewers?\b", re.IGNORECASE),
        "two reviewers",
    ),
    (re.compile(r"\bdual\s+(?:independent\s+)?(?:reviewers?|extractors?)\b", re.IGNORECASE),
     "dual reviewers"),
    (re.compile(r"\bindependent(?:ly)?\s+screened\b", re.IGNORECASE), "independently screened"),
    (
        re.compile(r"\bboth\s+reviewers?\s+(?:independently|extracted|screened)\b", re.IGNORECASE),
        "both reviewers",
    ),
    (
        re.compile(r"\bindependent(?:ly)?\s+(?:extracted|coded|assessed)\b", re.IGNORECASE),
        "independent extraction",
    ),
    (
        re.compile(r"\bindependent\s+reviewers?\b", re.IGNORECASE),
        "independent reviewers",
    ),
]

SINGLE_PATTERNS = [
    (re.compile(r"\bsingle\s+primary\s+reviewer\b", re.IGNORECASE), "single primary reviewer"),
    (
        re.compile(r"\bone\s+additional\s+reviewer\b", re.IGNORECASE),
        "one additional reviewer",
    ),
    (
        re.compile(r"\b20\s*%?\s+sample\b", re.IGNORECASE),
        "20% sample",
    ),
    (
        re.compile(r"\bsample\s+of\s+records\b", re.IGNORECASE),
        "sample of records",
    ),
    (
        re.compile(r"\bdeferred\s+to\s+before\s+submission\b", re.IGNORECASE),
        "deferred to before submission",
    ),
    (
        re.compile(r"\bdue\s+to\s+resource\s+constraints\b", re.IGNORECASE),
        "due to resource constraints",
    ),
    (
        re.compile(r"\b(?:by|with)\s+(?:the\s+)?first\s+reviewer\s+(?:alone|only)\b", re.IGNORECASE),
        "first reviewer alone",
    ),
]


SECTION_HEADERS = {
    "Methods": re.compile(
        r"^#{1,3}\s*\*{0,2}(?:METHODS?|Method[s]?|Materials and Methods)\*{0,2}\s*$",
        re.IGNORECASE | re.MULTILINE,
    ),
    "Discussion": re.compile(
        r"^#{1,3}\s*\*{0,2}(?:DISCUSSION|Discussion)\*{0,2}\s*$",
        re.IGNORECASE | re.MULTILINE,
    ),
    "Limitations": re.compile(
        r"^#{1,3}\s*\*{0,2}(?:LIMITATIONS?|Limitations?|Study Limitations)\*{0,2}\s*$",
        re.IGNORECASE | re.MULTILINE,
    ),
}


@dataclass
class Hit:
    pattern_label: str
    line: int
    context: str


@dataclass
class Report:
    submission_safe: bool
    dual_hits: list[dict] = field(default_factory=list)
    single_hits: list[dict] = field(default_factory=list)


def split_sections(text: str) -> dict[str, str]:
    # Each recognized section runs from its header to the next recognized
    # header (Methods / Discussion / Limitations) or to the end of the text.
    # Repeated headers of the same name are concatenated.
    out: dict[str, str] = {name: "" for name in SECTION_HEADERS}
    headers: list[tuple[str, int, int]] = []
    for name, pat in SECTION_HEADERS.items():
        for m in pat.finditer(text):
            headers.append((name, m.start(), m.end()))
    headers.sort(key=lambda t: t[1])
    for i, (name, _, hdr_end) in enumerate(headers):
        end = headers[i + 1][1] if i + 1 < len(headers) else len(text)
        out[name] = (out[name] + "\n" + text[hdr_end:end]).strip()
    return out


def scan_text(text: str, patterns: list[tuple[re.Pattern[str], str]]) -> list[Hit]:
    # Line numbers are 1-based and relative to `text` as passed in. For a
    # manuscript section that means the section body, not the whole file.
    hits: list[Hit] = []
    lines = text.splitlines()
    for lineno, line in enumerate(lines, start=1):
        for pat, label in patterns:
            if pat.search(line):
                ctx = line.strip()
                if len(ctx) > 200:
                    ctx = ctx[:200] + "..."
                hits.append(Hit(pattern_label=label, line=lineno, context=ctx))
    return hits


def hit_to_dict(h: Hit, source: str) -> dict:
    return {
        "source": source,
        "pattern": h.pattern_label,
        "line": h.line,
        "context": h.context,
    }


def build_report(manuscript: str, prospero: str | None) -> Report:
    sections = split_sections(manuscript)

    dual_hits: list[dict] = []
    single_hits: list[dict] = []

    # Methods → DUAL evidence.
    for h in scan_text(sections.get("Methods", ""), DUAL_PATTERNS):
        dual_hits.append(hit_to_dict(h, "manuscript:Methods"))

    # PROSPERO → DUAL evidence.
    if prospero is not None:
        for h in scan_text(prospero, DUAL_PATTERNS):
            dual_hits.append(hit_to_dict(h, "prospero"))

    # Discussion & Limitations → SINGLE evidence.
    for region in ("Limitations", "Discussion"):
        for h in scan_text(sections.get(region, ""), SINGLE_PATTERNS):
            single_hits.append(hit_to_dict(h, f"manuscript:{region}"))

    submission_safe = not (dual_hits and single_hits)
    return Report(
        submission_safe=submission_safe,
        dual_hits=dual_hits,
        single_hits=single_hits,
    )


def render_markdown(report: Report) -> str:
    lines = ["# Reviewer-team consistency audit", ""]
    if report.submission_safe:
        lines.append(
            "Status: **PASS** — no conjunction of DUAL claim + SINGLE confession."
        )
        lines.append("")
        if report.dual_hits:
            lines.append(f"DUAL claims found ({len(report.dual_hits)}, OK alone):")
            for h in report.dual_hits:
                lines.append(f"- `{h['source']}` line {h['line']}: `{h['pattern']}` — {h['context']}")
        if report.single_hits:
            lines.append("")
            lines.append(f"SINGLE confessions found ({len(report.single_hits)}, OK alone):")
            for h in report.single_hits:
                lines.append(f"- `{h['source']}` line {h['line']}: `{h['pattern']}` — {h['context']}")
    else:
        lines.append("Status: **MAJOR red flag** — DUAL claim and SINGLE confession both present.")
        lines.append("")
        lines.append("Reviewers will read this as fabrication-grade. Fix one of:")
        lines.append("1. Methods / PROSPERO honestly states single-reviewer execution.")
        lines.append("2. The Limitations admission is rewritten if dual review was actually done.")
        lines.append("")
        lines.append("## DUAL claims (Methods / PROSPERO)")
        for h in report.dual_hits:
            lines.append(f"- `{h['source']}` line {h['line']}: `{h['pattern']}`")
            lines.append(f" > {h['context']}")
        lines.append("")
        lines.append("## SINGLE confessions (Discussion / Limitations)")
        for h in report.single_hits:
            lines.append(f"- `{h['source']}` line {h['line']}: `{h['pattern']}`")
            lines.append(f" > {h['context']}")
    return "\n".join(lines) + "\n"


def main(argv: list[str] | None = None) -> int:
    parser = argparse.ArgumentParser(
        description="Reviewer-team consistency check (fabrication-grade self-review)."
    )
    parser.add_argument("--manuscript", type=Path, required=True)
    parser.add_argument("--prospero", type=Path, default=None)
    parser.add_argument(
        "--out",
        type=Path,
        default=Path("_audit_self/reviewer_team_consistency.md"),
    )
    parser.add_argument("--quiet", action="store_true")
    args = parser.parse_args(argv)

    if not args.manuscript.is_file():
        print(f"ERROR: manuscript not found: {args.manuscript}", file=sys.stderr)
        return 2
    prospero_text: str | None = None
    if args.prospero is not None:
        if not args.prospero.is_file():
            print(f"ERROR: prospero not found: {args.prospero}", file=sys.stderr)
            return 2
        prospero_text = args.prospero.read_text(encoding="utf-8")

    text = args.manuscript.read_text(encoding="utf-8")
    report = build_report(text, prospero_text)

    args.out.parent.mkdir(parents=True, exist_ok=True)
    args.out.write_text(render_markdown(report), encoding="utf-8")
    json_out = args.out.with_suffix(args.out.suffix + ".json")
    json_out.write_text(
        json.dumps(
            {
                "submission_safe": report.submission_safe,
                "dual_hits": report.dual_hits,
                "single_hits": report.single_hits,
            },
            indent=2,
        ),
        encoding="utf-8",
    )

    if not args.quiet:
        if report.submission_safe:
            print(
                f"PASS: no conjunction. DUAL={len(report.dual_hits)} "
                f"SINGLE={len(report.single_hits)}"
            )
        else:
            print(
                f"FAIL: MAJOR red flag. DUAL={len(report.dual_hits)} "
                f"SINGLE={len(report.single_hits)}"
            )
        print(f"See {args.out}")

    return 0 if report.submission_safe else 1


if __name__ == "__main__":
    sys.exit(main())
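
The conjunction rule implemented by the script can be sketched in isolation. This is not the fixture suite mentioned in the commit message: the pattern lists are trimmed subsets of `DUAL_PATTERNS` and `SINGLE_PATTERNS`, the `has_conflict` helper and the fixture strings are hypothetical, and section splitting is skipped for brevity (the caller is assumed to pass Methods/PROSPERO text and Limitations/Discussion text separately).

```python
import re

# Trimmed, illustrative subsets of the script's two pattern families.
DUAL = [re.compile(r"\btwo\s+(?:independent\s+)?reviewers?\b", re.IGNORECASE)]
SINGLE = [
    re.compile(r"\bsingle\s+primary\s+reviewer\b", re.IGNORECASE),
    re.compile(r"\b20\s*%?\s+sample\b", re.IGNORECASE),
]


def has_conflict(claim_text: str, limitations_text: str) -> bool:
    """True only when a DUAL claim and a SINGLE confession co-occur."""
    dual = any(p.search(claim_text) for p in DUAL)
    single = any(p.search(limitations_text) for p in SINGLE)
    return dual and single


# Hypothetical fixture strings, loosely modeled on the docstring's precedent.
METHODS_DUAL = "Two reviewers independently screened titles and abstracts."
METHODS_SINGLE = "One reviewer screened all records."
LIMITS_SINGLE = "Single primary reviewer; a 20% sample is deferred to before submission."
LIMITS_CLEAN = "The search was restricted to English-language studies."

dual_only = has_conflict(METHODS_DUAL, LIMITS_CLEAN)      # DUAL claim alone
single_only = has_conflict(METHODS_SINGLE, LIMITS_SINGLE)  # SINGLE confession alone
both = has_conflict(METHODS_DUAL, LIMITS_SINGLE)           # the flagged conjunction
```

Only the last case is flagged, which mirrors `submission_safe = not (dual_hits and single_hits)` in `build_report`: either side alone passes.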
