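Rework README to lead with usage context

- Replace the "First Look" quick-start at the top with a "Where pcr Fits" section, folding in the former "Why Teams Use It" bullets.
- Move the smoke-test commands into a new "Try It" section after Install, and add the clone steps to the Install block.
- Drop the R PATH export from the top section; it remains under Install.
- Rename "What pcr Does" to "What pcr Solves".
- Move "What You Get" and "Who Is This For" ahead of "Supported Checks", with "Review Boundaries" inserted between them.
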
nufegia · nufegia · commit 72d96edf597f · 2026-05-24T00:45:32.000+08:00
diff --git a/README.md b/README.md
@@ -11,52 +11,24 @@
 
 ---
 
-## First Look
+## Where pcr Fits
 
-```bash
-git clone https://github.com/nufegia/pre-check-research.git
-cd pre-check-research
-python3 -m pip install -e ".[dev]"
-mkdir -p build
-
-# Inspect which checks apply to an example summary-stat table
-pcr-audit route examples/summary_stat_sample.csv --json build/route.json
-
-# Run the applicable checks and write human + machine-readable reports
-pcr-audit run examples/summary_stat_sample.csv --out build/audit.md --json build/audit.json
-
-# Read the human report
-cat build/audit.md
-```
-
-This example is a smoke test, not a complete review. It shows how `pcr` records route decisions, runs applicable checks, and emits reports. A serious pre-submission or editorial audit usually requires assembling source tables, manuscript text, figure originals, analysis scripts, references, and provenance context, then interpreting each finding against the study design and source files.
-
-For a full package, use a project folder:
-
-```bash
-pcr-audit project path/to/project_folder --out build/project.md --json build/project.json
-```
+Use `pcr` when a research package needs a documented pre-submission or pre-decision screen before it moves to a journal, supervisor, editor, institution, client, or public record.
 
-For R-backed statistical checks (GRIM, GRIMMER, DEBIT, SPRITE, statcheck):
+It is designed for review workflows where the reviewer has source materials available: manuscript text, source tables, raw or summary data, analysis scripts, figures, references, and provenance context. The goal is to surface reproducible review leads early enough that a human expert can verify, resolve, or explain them.
 
-```bash
-export PATH="$PWD/tools/r/pcr_statcheck:$PWD/tools/r/pcr_scrutiny:$PWD/tools/r/pcr_sprite:$PATH"
-```
+Common fits:
 
----
+- **Before submission**: researchers and PIs check tables, data, figures, references, and scripts before sending a manuscript out.
+- **Before acceptance or sign-off**: labs, editorial offices, and institutions run a consistent screening checklist across packages.
+- **During triage**: reviewers separate concrete, reproducible signals from speculation, missing material, dependency gaps, and extraction artifacts.
+- **With AI agents**: agent workflows consume route decisions and schema-bound findings instead of guessing which checks apply.
 
 ## The Problem
 
 Research integrity screening is time-consuming, error-prone, and often happens too late — after submission, during peer review, or post-publication. Journals, labs, and institutions need reliable, reproducible pre-submission checks, but existing tools are scattered across languages (R, Python), lack a unified output format, and require expert judgment to route the right tool to the right material.
 
-## Why Teams Use It
-
-- **Before submission**: find statistical, data, figure, reference, and code-review leads while there is still time to fix or explain them.
-- **Before acceptance or sign-off**: run a consistent checklist across manuscript packages instead of relying on ad hoc manual inspection.
-- **During triage**: separate reproducible signals from speculation, missing material, dependency gaps, and extraction artifacts.
-- **With AI agents**: give agents route decisions and schema-bound findings so they can summarize evidence without inventing tool applicability.
-
-## What pcr Does
+## What pcr Solves
 
 `pcr` solves the routing, execution, and reporting problem:
 
@@ -69,6 +41,31 @@ Research integrity screening is time-consuming, error-prone, and often happens t
 | **Multi-material projects** | Audit entire research packages: data + manuscript + code + figures + references |
 | **Graceful degradation** | Missing R packages or optional dependencies become `info` records, not failures |
 
+## What You Get
+
+A run produces two complementary outputs:
+
+- **Markdown report** for humans: findings grouped by tool, evidence, possible normal explanations, review steps, confidence, and limitations.
+- **JSON report** for systems: schema-bound findings that can be merged, archived, diffed, or handed to an AI-assisted review workflow.
+
+The report is designed to support a defensible review process. It does not replace source-material verification, subject-matter judgment, or statistical consultation.
+
+## Review Boundaries
+
+- Findings are **risk signals**, not misconduct conclusions.
+- Missing tools, missing dependencies, skipped checks, and unsupported material are recorded as `level: info`.
+- PDF/DOCX extraction can introduce recognition errors; important findings should be verified against source CSV/XLSX, original figures, and manuscript tables.
+- Python/R scripts can be rerun in a temporary copy; Stata/SPSS/SAS scripts are scanned read-only and flagged for controlled manual rerun.
+
+## Who Is This For
+
+- **Researchers & PIs**: screen your own manuscript package before submission and fix issues while there is still time.
+- **Journal editorial offices**: standardize pre-acceptance screening across submissions with reproducible, documented checks.
+- **Research integrity officers**: triage cases with explainable leads rather than starting from scratch.
+- **Peer reviewers**: supplement review with automated consistency checks.
+- **Methods educators**: teach statistical error detection with concrete, runnable examples.
+- **AI agent pipelines**: use structured JSON route decisions and findings instead of prompt-only tool selection.
+
 ## Supported Checks
 
 | Category | Tools | What it catches |
@@ -89,24 +86,6 @@ Research integrity screening is time-consuming, error-prone, and often happens t
 
 [`benchmark/BENCHMARK.md`](benchmark/BENCHMARK.md) · [`benchmark/BENCHMARK_REPORT.md`](benchmark/BENCHMARK_REPORT.md)
 
-## What You Get
-
-A run produces two complementary outputs:
-
-- **Markdown report** for humans: findings grouped by tool, evidence, possible normal explanations, review steps, confidence, and limitations.
-- **JSON report** for systems: schema-bound findings that can be merged, archived, diffed, or handed to an AI-assisted review workflow.
-
-The report is designed to support a defensible review process. It does not replace source-material verification, subject-matter judgment, or statistical consultation.
-
-## Who Is This For
-
-- **Researchers & PIs**: Screen your own manuscript package before submission — catch issues early, avoid desk rejection or post-publication correction
-- **Journal editorial offices**: Standardize pre-acceptance screening across submissions with reproducible, documented checks
-- **Research integrity officers**: Triage cases with explainable leads rather than starting from scratch
-- **Peer reviewers**: Supplement your review with automated consistency checks
-- **Methods educators**: Teach statistical error detection with concrete, runnable examples
-- **AI agent pipelines**: `pcr` emits structured JSON designed for downstream agentic workflows; agents read route decisions and findings, not guess tool applicability
-
 ## Architecture
 
 ```
@@ -132,6 +111,8 @@ Unified output (models.py / reporting.py)
 ## Install
 
 ```bash
+git clone https://github.com/nufegia/pre-check-research.git
+cd pre-check-research
 python3 -m pip install -e ".[dev]"
 export PATH="$PWD/tools/r/pcr_statcheck:$PWD/tools/r/pcr_scrutiny:$PWD/tools/r/pcr_sprite:$PATH"
 ```
@@ -148,6 +129,29 @@ Optional image forensics:
 python3 -m pip install -e ".[image]"
 ```
 
+## Try It
+
+```bash
+mkdir -p build
+
+# Inspect which checks apply to an example summary-stat table
+pcr-audit route examples/summary_stat_sample.csv --json build/route.json
+
+# Run the applicable checks and write human + machine-readable reports
+pcr-audit run examples/summary_stat_sample.csv --out build/audit.md --json build/audit.json
+
+# Read the human report
+cat build/audit.md
+```
+
+This example is a smoke test, not a complete review. It shows how `pcr` records route decisions, runs applicable checks, and emits reports. A serious pre-submission or editorial audit usually requires assembling source tables, manuscript text, figure originals, analysis scripts, references, and provenance context, then interpreting each finding against the study design and source files.
+
+For a full package, use a project folder:
+
+```bash
+pcr-audit project path/to/project_folder --out build/project.md --json build/project.json
+```
+
 ## Usage
 
 ### Single-file audit