Commit 0819efe

Yoojin-nam and claude authored
feat(meta-analysis): add Phase 3f.5 FINAL_POOL_LOCK.yaml screening freeze (#30)
Adds Phase 3f.5 ("Pool composition lock") and ships a template at templates/FINAL_POOL_LOCK.yaml.template.

Once round-3 adjudication freezes, the lock becomes the single source of truth for include_count, exclude_count, mixed_count, and the canonical UID list. SHA-256 hash provides tamper-evidence. Downstream artifacts (extraction TSV, manuscript prose, PRISMA flow caption, supplementary INDEX, cover letter) reference the lock instead of re-deriving counts.

Companions:
- PR T1-1 sync-submission Phase 5 --pool-lock
- PR T1-6 meta-analysis Phase 4 entry gate

Motivation: cross-project precedent of 5-document INCLUDE/EXCLUDE drift caused by a late adjudication that propagated unevenly.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
1 parent b855b48 commit 0819efe

2 files changed

Lines changed: 115 additions & 0 deletions

skills/meta-analysis/SKILL.md

Lines changed: 45 additions & 0 deletions
@@ -216,6 +216,51 @@ ID sets. The Markdown consensus document remains the human explanation.

**Precedent incident (a PRISMA-DTA meta-analysis revision):** a late-revision manuscript shipped with k_qualitative = 32 / k_narrative-only = 10 / k_FT-excluded = 46. ID-set reconciliation (performed only after an adversarial audit at post-Stage 4 QC) revealed true counts 24/2/54. An early-draft prose total ("30 → 32 after FLAG consensus") had been carried forward without ever being reconciled against the screening TSV intersected with the consensus spreadsheet; four downstream artifacts echoed the same wrong total. This gate would have caught the drift at the Phase 5 hand-off.

#### 3f.5 Pool composition lock (MANDATORY at adjudication freeze)

After Phase 3f reconciliation passes, freeze the pool composition into a
single source-of-truth YAML so every downstream artifact (extraction TSV,
manuscript prose counts, PRISMA flow caption, supplementary INDEX, cover
letter free-text) can be checked against it.

Why this lock exists
^^^^^^^^^^^^^^^^^^^^

Cross-project precedent (anonymized): an LLM reporting-quality SR carried
five documents that disagreed on INCLUDE (63 vs 64) and EXCLUDE
(108/109/111). Three EXCLUDE rows existed in the extraction sheet without
matching INCLUDE. The drift traced to a late round-3 adjudication whose
result was applied to some artifacts and not others; there was no single
canonical post-freeze count to reference.

How to lock
^^^^^^^^^^^

1. Copy the template:
   ```bash
   cp "${CLAUDE_SKILL_DIR}/templates/FINAL_POOL_LOCK.yaml.template" \
     2_Data/FINAL_POOL_LOCK.yaml
   ```
2. Fill in counts and UID lists from the reconciliation in Phase 3f.
3. Compute the SHA-256 integrity hash from the sorted UID list.
4. Commit the lock to git BEFORE starting Phase 4 extraction.

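Step 3 can be scripted rather than done by hand. The following is a minimal sketch, assuming the hash definition given in the template comments (the UIDs from all three lists are sorted together, joined with newlines, and UTF-8 encoded). The function name `pool_lock_sha256` and the example UIDs are hypothetical and not part of the skill.

```python
import hashlib


def pool_lock_sha256(include_uids, exclude_uids, mixed_uids):
    """Return the SHA-256 hex digest of the combined, sorted UID list.

    Assumption: mirrors the one-liner in the template comments, i.e. all
    UIDs are strings, sorted together and joined with newlines.
    """
    ids = sorted([*include_uids, *exclude_uids, *mixed_uids])
    return hashlib.sha256("\n".join(ids).encode()).hexdigest()


if __name__ == "__main__":
    # Illustrative UIDs only; real values come from the Phase 3f reconciliation.
    digest = pool_lock_sha256(["PMID:1002", "PMID:1001"], ["PMID:2001"], [])
    print(f'sha256: "{digest}"')
```

Because the lists are sorted before hashing, the order in which UIDs were entered does not affect the digest; only adding, removing, or editing a UID does.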
Downstream gates
^^^^^^^^^^^^^^^^

- `/meta-analysis` Phase 4 entry: extraction TSV's UID set MUST equal
  `include_uids` ∪ `mixed_uids` from the lock. See Phase 4 entry gate.
- `/sync-submission` Phase 5
  (`scripts/cross_document_n_check.py --pool-lock`): every numeric claim
  in manuscript / abstract / supplementary that maps to a locked
  category must match the locked value.
- Manuscript prose: NEVER re-derive `k included` from extraction TSV at
  manuscript build time. Always reference `final_pool_n` from the lock.

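The Phase 4 entry condition in the first bullet is a plain set-equality check. The sketch below illustrates it under stated assumptions: the extraction TSV has a header row with a UID column named `uid` (a hypothetical name), and the function `check_extraction_against_lock` is illustrative only. It is not the skill's actual gate implementation.

```python
import csv
import io


def check_extraction_against_lock(tsv_text, include_uids, mixed_uids, uid_column="uid"):
    """Compare the extraction TSV's UID set with include_uids ∪ mixed_uids.

    Returns (missing, unexpected): UIDs locked but absent from the TSV, and
    UIDs present in the TSV but not locked. Both empty means the gate passes.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    extracted = {row[uid_column] for row in reader}
    locked = set(include_uids) | set(mixed_uids)
    return sorted(locked - extracted), sorted(extracted - locked)


if __name__ == "__main__":
    # Toy data: S03 is locked but not extracted, S09 is extracted but not locked.
    tsv = "uid\ttp\tfp\nS01\t10\t2\nS02\t8\t1\nS09\t5\t0\n"
    missing, unexpected = check_extraction_against_lock(tsv, ["S01", "S02"], ["S03"])
    print("locked but missing from TSV:", missing)
    print("in TSV but not in lock:", unexpected)
```

Reporting both directions of the difference, rather than a single pass/fail flag, makes it easier to see which artifact a late adjudication failed to reach.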
If a late post-freeze decision changes the pool, treat it as a formal
PROSPERO amendment: file the amendment, re-freeze the lock as a new
file (`FINAL_POOL_LOCK_v2.yaml`), and propagate to every artifact.

### Phase 4: Data Extraction

**Goal**: Create standardized extraction forms and extract 2x2 or effect size data.
Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
# FINAL_POOL_LOCK.yaml: frozen pool composition for an SR/MA
#
# Created at Phase 3f.5 (round-3 adjudication freeze) by /meta-analysis.
# All downstream artifacts (extraction TSV, manuscript prose counts,
# PRISMA flow caption, supplementary INDEX, cover letter free-text)
# must agree with these values exactly. /sync-submission Phase 5
# `scripts/cross_document_n_check.py --pool-lock` enforces this.
#
# Why a lock file
# ---------------
# Cross-project precedent (anonymized): an LLM reporting-quality SR carried
# five documents that disagreed on INCLUDE (63 vs 64) and EXCLUDE
# (108/109/111). Three EXCLUDE rows existed in the extraction sheet
# without matching INCLUDE. The drift traced to a late round-3 adjudication
# whose result was applied to some artifacts and not others.
#
# The lock file is the single source of truth. Once the freeze line is
# crossed, NEVER re-derive the counts from raw artifacts in a downstream
# script; always reference the lock.

# ---------------------------------------------------------------------------
# Metadata
# ---------------------------------------------------------------------------

# ISO-8601 date when the pool was frozen.
freeze_date: "YYYY-MM-DD"

# Round at which freeze occurred, typically "round_3_adjudication".
freeze_stage: "round_3_adjudication"

# Freeform note describing which screening sheet anchored this lock.
provenance:
  screening_artifact: "2_Screening/round3_adjudication.tsv"
  adjudicator: "first_reviewer"
  ai_assisted_round: false  # set true if AI pre-screening was used per SKILL.md Phase 3c

# ---------------------------------------------------------------------------
# Counts (canonical numbers; NEVER edit without re-freezing)
# ---------------------------------------------------------------------------

# Studies in the final pool (Phase 4 extraction candidate set).
final_pool_n: 0

# Total INCLUDE decisions across rounds (post-adjudication).
include_count: 0

# Total EXCLUDE decisions (full-text excluded).
exclude_count: 0

# Mixed (eligible for some outcomes, excluded for others).
mixed_count: 0

# ---------------------------------------------------------------------------
# Identifier sets
# ---------------------------------------------------------------------------

# UID lists. Use stable IDs (PMID, DOI, or screening-sheet record ID).
include_uids: []
exclude_uids: []
mixed_uids: []

# ---------------------------------------------------------------------------
# Integrity hash
# ---------------------------------------------------------------------------

# SHA-256 of the sorted include_uids + exclude_uids + mixed_uids list,
# joined with newlines. Provides tamper-evidence: any single UID edit
# changes the hash. Recompute with:
#   python -c 'import hashlib; ids = sorted(open("..._uids.txt").read().splitlines()); print(hashlib.sha256("\n".join(ids).encode()).hexdigest())'
sha256: ""
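
A filled-in lock can be checked for internal consistency before it is committed. The sketch below is one possible check, not part of the shipped template: it assumes the lock has already been parsed into a dict (for example with PyYAML's `yaml.safe_load`), that each `*_count` field should equal the length of its `*_uids` list, and that `sha256` follows the definition in the comment above. The function name `lock_problems` is hypothetical.

```python
import hashlib


def lock_problems(lock):
    """Return a list of human-readable inconsistencies found in a lock dict."""
    problems = []
    categories = ("include", "exclude", "mixed")

    # Assumption: each count field equals the length of its UID list.
    for name in categories:
        declared = lock[f"{name}_count"]
        listed = len(lock[f"{name}_uids"])
        if declared != listed:
            problems.append(f"{name}_count={declared} but {listed} UIDs listed")

    # Recompute the hash as described in the template comment:
    # all UIDs sorted together, joined with newlines, UTF-8 encoded.
    all_ids = sorted(uid for name in categories for uid in lock[f"{name}_uids"])
    digest = hashlib.sha256("\n".join(all_ids).encode()).hexdigest()
    if digest != lock["sha256"]:
        problems.append("sha256 does not match the UID lists")
    return problems


if __name__ == "__main__":
    # Toy lock with a stale hash, as would happen after an unrecorded UID edit.
    lock = {
        "include_count": 1, "exclude_count": 0, "mixed_count": 0,
        "include_uids": ["S01"], "exclude_uids": [], "mixed_uids": [],
        "sha256": "",
    }
    for problem in lock_problems(lock):
        print("LOCK ERROR:", problem)
```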
