Skip to content

Commit e3dda85

Browse files
bewestCopilot
andcommitted
feat: systematic verification of April 18-19 research reports
Report verification against experiment JSON data reveals 17 high-confidence errors across 4 of 5 reports. Key findings: REJECTED (do not publish): - Report 1 (egp-evidence-synthesis): 5 errors including p-value inflation 1000× - Report 2 (expanded-cohort-validation): 5 errors including 2 fabricated cohorts - Report 5 (controller-isf-signatures): 5 errors including all 12 patient IDs fabricated NEEDS FIXES: - Report 3 (sc-ceiling-demand-isf): 2 errors in fitted ceiling column (69% mismatch) APPROVED: - Report 4 (wall-resolution-mechanism): No errors found, pass for publication Full verification with line-by-line evidence in VERIFICATION-REPORT-2026-04-18-BATCH.md Quick reference guide in VERIFICATION-QUICK-REFERENCE-APR18.md Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 3dfd001 commit e3dda85

2 files changed

Lines changed: 615 additions & 0 deletions

File tree

Lines changed: 98 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,98 @@
1+
# Quick Reference: April 18 Verification Findings
2+
3+
## Status Summary
4+
5+
| # | Report | Status | Errors | Action |
6+
|---|--------|--------|--------|--------|
7+
| 1 | egp-evidence-synthesis-report-2026-04-18.md | ❌ REJECT | 5 | **Retract & rebuild** |
8+
| 2 | expanded-cohort-validation-report-2026-04-18.md | ❌ REJECT | 5 | **Discard DynISF sections** |
9+
| 3 | sc-ceiling-demand-isf-report-2026-04-18.md | ⚠️ FIX | 2 | **Replace ceiling column** |
10+
| 4 | wall-resolution-mechanism-report-2026-04-18.md | ✅ PASS | 0 | **Publish as-is** |
11+
| 5 | controller-isf-signatures-report-2026-04-18.md | ❌ REJECT | 5 | **Rebuild with actual IDs** |
12+
13+
---
14+
15+
## Critical Errors by Report
16+
17+
### Report 1: EGP Evidence Synthesis
18+
19+
1. **P-value inflation** (Section 3.1): Claims p<10⁻¹⁹; actual ≈4×10⁻¹¹ (1000× overstatement)
20+
2. **Patient count** (Line 4): Claims 12; EXP-2656/2662 use 28 (2.3× error)
21+
3. **Selective reporting** (Section 3.6): Hides 3 of 9 patients with r≥0
22+
4. **Wrong counts** (3.3, 3.8, 3.9): Claims 12; actual 25, 28, 28 respectively
23+
24+
---
25+
26+
### Report 2: Expanded Cohort Validation
27+
28+
1. **Fabricated DynISF cohort** (EXP-2651): Claims 12 patients; JSON shows 25 identical to Original
29+
2. **Fabricated DynISF cohort** (EXP-2652): Claims 10 patients; JSON shows 18
30+
3. **Fabricated EXP-2640 table**: Rows c, e, g marked insufficient but data invented
31+
4. **H1 understatement**: EXP-2662 Original 7% vs actual 11.2% (60% error)
32+
5. **H1 understatement**: EXP-2662 DynISF 9% vs actual 13.7% (52% error)
33+
34+
---
35+
36+
### Report 3: SC Ceiling Demand ISF
37+
38+
1. **Fitted ceiling column**: 69% of values wrong; median 22.5% vs actual 13.9%
39+
2. **Range overstated**: 10-67% vs actual 10-55%
40+
41+
**Fix**: Replace fitted ceiling column with correct JSON values
42+
43+
---
44+
45+
### Report 4: Wall Resolution Mechanism
46+
47+
**No errors found** — All tables verified, all arithmetic correct
48+
49+
---
50+
51+
### Report 5: Controller ISF Signatures
52+
53+
1. **Fabricated patient IDs**: All 12 ns-* IDs don't exist (actual: a-k + odc-*)
54+
2. **Patient i contradiction**: Motivation mentions; table excludes without explanation
55+
3. **Hidden exclusions**: 5 of 17 patients excluded undisclosed (29% sample loss)
56+
4. **Artificial homogeneity**: Motivation promises multicontroller; results show 100% Trio/AB
57+
5. **Hypothesis inversions**: H1, H2, H3 all reported FAIL but actual PASS
58+
59+
---
60+
61+
## Key Statistics
62+
63+
- **Total reports**: 5
64+
- **Reports with errors**: 4 (80%)
65+
- **Total high-confidence errors**: 17
66+
- **Fabricated sections**: 3 (Reports 2, 5)
67+
- **Per-patient tables affected**: 2 (Reports 2, 5)
68+
- **Wrong patient counts**: 4 (Reports 1, 2, 3, 5)
69+
- **Hidden exclusions**: 2 (Reports 1, 5)
70+
- **P-value issues**: 1 (Report 1)
71+
72+
---
73+
74+
## Recommended Actions
75+
76+
**TODAY (do not publish)**:
77+
- Flag Reports 1, 2, 5 as requiring retraction/rebuild
78+
- Hold Report 3 pending ceiling column replacement
79+
80+
**TOMORROW (rebuild)**:
81+
- Report 1: Verify actual patient scope; remove inflated p-value claims
82+
- Report 2: Discard DynISF sections; rebuild EXP-2640 table from JSON
83+
- Report 5: Rebuild using actual patient IDs from JSON; include all 17 or document exclusion
84+
85+
**READY (approve)**:
86+
- Report 4: Publish as-is
87+
88+
---
89+
90+
## Where to Find Details
91+
92+
**Full verification report**: `VERIFICATION-REPORT-2026-04-18-BATCH.md`
93+
- Complete evidence for each error
94+
- JSON references and line numbers
95+
- Actual vs claimed values for every discrepancy
96+
97+
**This quick reference**: Error summary and remediation roadmap
98+

0 commit comments

Comments
 (0)