Commit 72d96ed

Rework README to lead with usage context
1 parent 16378f5 commit 72d96ed

1 file changed

Lines changed: 59 additions & 55 deletions

--- a/README.md
+++ b/README.md
@@ -11,52 +11,24 @@
 
 ---
 
-## First Look
+## Where pcr Fits
 
-```bash
-git clone https://github.com/nufegia/pre-check-research.git
-cd pre-check-research
-python3 -m pip install -e ".[dev]"
-mkdir -p build
-
-# Inspect which checks apply to an example summary-stat table
-pcr-audit route examples/summary_stat_sample.csv --json build/route.json
-
-# Run the applicable checks and write human + machine-readable reports
-pcr-audit run examples/summary_stat_sample.csv --out build/audit.md --json build/audit.json
-
-# Read the human report
-cat build/audit.md
-```
-
-This example is a smoke test, not a complete review. It shows how `pcr` records route decisions, runs applicable checks, and emits reports. A serious pre-submission or editorial audit usually requires assembling source tables, manuscript text, figure originals, analysis scripts, references, and provenance context, then interpreting each finding against the study design and source files.
-
-For a full package, use a project folder:
-
-```bash
-pcr-audit project path/to/project_folder --out build/project.md --json build/project.json
-```
+Use `pcr` when a research package needs a documented pre-submission or pre-decision screen before it moves to a journal, supervisor, editor, institution, client, or public record.
 
-For R-backed statistical checks (GRIM, GRIMMER, DEBIT, SPRITE, statcheck):
+It is designed for review workflows where the reviewer has source materials available: manuscript text, source tables, raw or summary data, analysis scripts, figures, references, and provenance context. The goal is to surface reproducible review leads early enough that a human expert can verify, resolve, or explain them.
 
-```bash
-export PATH="$PWD/tools/r/pcr_statcheck:$PWD/tools/r/pcr_scrutiny:$PWD/tools/r/pcr_sprite:$PATH"
-```
+Common fits:
 
----
+- **Before submission**: researchers and PIs check tables, data, figures, references, and scripts before sending a manuscript out.
+- **Before acceptance or sign-off**: labs, editorial offices, and institutions run a consistent screening checklist across packages.
+- **During triage**: reviewers separate concrete, reproducible signals from speculation, missing material, dependency gaps, and extraction artifacts.
+- **With AI agents**: agent workflows consume route decisions and schema-bound findings instead of guessing which checks apply.
 
 ## The Problem
 
 Research integrity screening is time-consuming, error-prone, and often happens too late — after submission, during peer review, or post-publication. Journals, labs, and institutions need reliable, reproducible pre-submission checks, but existing tools are scattered across languages (R, Python), lack a unified output format, and require expert judgment to route the right tool to the right material.
 
-## Why Teams Use It
-
-- **Before submission**: find statistical, data, figure, reference, and code-review leads while there is still time to fix or explain them.
-- **Before acceptance or sign-off**: run a consistent checklist across manuscript packages instead of relying on ad hoc manual inspection.
-- **During triage**: separate reproducible signals from speculation, missing material, dependency gaps, and extraction artifacts.
-- **With AI agents**: give agents route decisions and schema-bound findings so they can summarize evidence without inventing tool applicability.
-
-## What pcr Does
+## What pcr Solves
 
 `pcr` solves the routing, execution, and reporting problem:
 
@@ -69,6 +41,31 @@ Research integrity screening is time-consuming, error-prone, and often happens t
 | **Multi-material projects** | Audit entire research packages: data + manuscript + code + figures + references |
 | **Graceful degradation** | Missing R packages or optional dependencies become `info` records, not failures |
 
+## What You Get
+
+A run produces two complementary outputs:
+
+- **Markdown report** for humans: findings grouped by tool, evidence, possible normal explanations, review steps, confidence, and limitations.
+- **JSON report** for systems: schema-bound findings that can be merged, archived, diffed, or handed to an AI-assisted review workflow.
+
+The report is designed to support a defensible review process. It does not replace source-material verification, subject-matter judgment, or statistical consultation.
+
+## Review Boundaries
+
+- Findings are **risk signals**, not misconduct conclusions.
+- Missing tools, missing dependencies, skipped checks, and unsupported material are recorded as `level: info`.
+- PDF/DOCX extraction can introduce recognition errors; important findings should be verified against source CSV/XLSX, original figures, and manuscript tables.
+- Python/R scripts can be rerun in a temporary copy; Stata/SPSS/SAS scripts are scanned read-only and flagged for controlled manual rerun.
+
+## Who Is This For
+
+- **Researchers & PIs**: screen your own manuscript package before submission and fix issues while there is still time.
+- **Journal editorial offices**: standardize pre-acceptance screening across submissions with reproducible, documented checks.
+- **Research integrity officers**: triage cases with explainable leads rather than starting from scratch.
+- **Peer reviewers**: supplement review with automated consistency checks.
+- **Methods educators**: teach statistical error detection with concrete, runnable examples.
+- **AI agent pipelines**: use structured JSON route decisions and findings instead of prompt-only tool selection.
+
 ## Supported Checks
 
 | Category | Tools | What it catches |
@@ -89,24 +86,6 @@ Research integrity screening is time-consuming, error-prone, and often happens t
 
 [`benchmark/BENCHMARK.md`](benchmark/BENCHMARK.md) · [`benchmark/BENCHMARK_REPORT.md`](benchmark/BENCHMARK_REPORT.md)
 
-## What You Get
-
-A run produces two complementary outputs:
-
-- **Markdown report** for humans: findings grouped by tool, evidence, possible normal explanations, review steps, confidence, and limitations.
-- **JSON report** for systems: schema-bound findings that can be merged, archived, diffed, or handed to an AI-assisted review workflow.
-
-The report is designed to support a defensible review process. It does not replace source-material verification, subject-matter judgment, or statistical consultation.
-
-## Who Is This For
-
-- **Researchers & PIs**: Screen your own manuscript package before submission — catch issues early, avoid desk rejection or post-publication correction
-- **Journal editorial offices**: Standardize pre-acceptance screening across submissions with reproducible, documented checks
-- **Research integrity officers**: Triage cases with explainable leads rather than starting from scratch
-- **Peer reviewers**: Supplement your review with automated consistency checks
-- **Methods educators**: Teach statistical error detection with concrete, runnable examples
-- **AI agent pipelines**: `pcr` emits structured JSON designed for downstream agentic workflows; agents read route decisions and findings, not guess tool applicability
-
 ## Architecture
 
 ```
@@ -132,6 +111,8 @@ Unified output (models.py / reporting.py)
 ## Install
 
 ```bash
+git clone https://github.com/nufegia/pre-check-research.git
+cd pre-check-research
 python3 -m pip install -e ".[dev]"
 export PATH="$PWD/tools/r/pcr_statcheck:$PWD/tools/r/pcr_scrutiny:$PWD/tools/r/pcr_sprite:$PATH"
 ```
@@ -148,6 +129,29 @@ Optional image forensics:
 python3 -m pip install -e ".[image]"
 ```
 
+## Try It
+
+```bash
+mkdir -p build
+
+# Inspect which checks apply to an example summary-stat table
+pcr-audit route examples/summary_stat_sample.csv --json build/route.json
+
+# Run the applicable checks and write human + machine-readable reports
+pcr-audit run examples/summary_stat_sample.csv --out build/audit.md --json build/audit.json
+
+# Read the human report
+cat build/audit.md
+```
+
+This example is a smoke test, not a complete review. It shows how `pcr` records route decisions, runs applicable checks, and emits reports. A serious pre-submission or editorial audit usually requires assembling source tables, manuscript text, figure originals, analysis scripts, references, and provenance context, then interpreting each finding against the study design and source files.
+
+For a full package, use a project folder:
+
+```bash
+pcr-audit project path/to/project_folder --out build/project.md --json build/project.json
+```
+
 ## Usage
 
 ### Single-file audit
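
Note on the new "Review Boundaries" text: the README says skipped checks and missing dependencies are recorded as `level: info`. A downstream consumer might separate those records from actual review leads roughly as sketched below. The diff does not show the JSON schema, so the report shape here (a top-level `findings` list with `tool`, `level`, and `message` keys, and the `warning` level) is a hypothetical stand-in; only the `info` level is taken from the README.

```python
import json
from collections import Counter

# Hypothetical report shape: an object with a "findings" list whose items
# carry "tool", "level", and "message". Only `level: info` comes from the
# README; every other key and value here is an illustrative assumption.
SAMPLE_REPORT = """
{
  "findings": [
    {"tool": "grim", "level": "warning", "message": "mean inconsistent with n"},
    {"tool": "statcheck", "level": "info", "message": "R package not installed"},
    {"tool": "sprite", "level": "info", "message": "check skipped"}
  ]
}
"""


def split_findings(report):
    """Separate `info` records (skipped or unavailable checks) from review leads."""
    findings = report.get("findings", [])
    info = [f for f in findings if f.get("level") == "info"]
    leads = [f for f in findings if f.get("level") != "info"]
    return leads, info


report = json.loads(SAMPLE_REPORT)
leads, info = split_findings(report)

# Count info records per tool so dependency gaps are visible at a glance.
info_by_tool = Counter(f["tool"] for f in info)

print(f"{len(leads)} review lead(s), {len(info)} info record(s)")
for tool, count in sorted(info_by_tool.items()):
    print(f"  info: {tool} x{count}")
```

Keeping `info` records in a separate bucket, rather than dropping them, matches the README's framing: a skipped check is a gap in coverage that a reviewer should see, not a clean result.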
