nufegia
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 7 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 2 additions & 2 deletions
diff --git a/benchmark/BENCHMARK.md b/benchmark/BENCHMARK.md
Lines changed: 2 additions & 2 deletions
diff --git a/benchmark/BENCHMARK_REPORT.md b/benchmark/BENCHMARK_REPORT.md
Lines changed: 14 additions & 14 deletions
diff --git a/benchmark/benchmark_manifest.json b/benchmark/benchmark_manifest.json
Lines changed: 2 additions & 2 deletions
diff --git a/benchmark/generate_synthetic_benchmark.py b/benchmark/generate_synthetic_benchmark.py
Lines changed: 1 addition & 2 deletions
diff --git a/benchmark/ground_truth.json b/benchmark/ground_truth.json
Lines changed: 2 additions & 2 deletions
diff --git a/benchmark/reports/pcr.benchmark_summary.json b/benchmark/reports/pcr.benchmark_summary.json
Lines changed: 12 additions & 12 deletions
diff --git a/benchmark/reports/pcr.benchmark_summary.md b/benchmark/reports/pcr.benchmark_summary.md
Lines changed: 14 additions & 14 deletions
@@ -1,5 +1,12 @@
 # Changelog
 
+## v1.2.0
+
+- Added public report export support with schema metadata and CLI coverage.
+- Expanded project audit checks for references, image signals, routing, raw data, and summary-stat crosschecks.
+- Improved benchmark fixtures, benchmark reports, and documentation for the updated audit behavior.
+- Added development tooling configuration for pytest, ruff, mypy, and type stubs.
+
 ## v1.1.0
 
 - Renamed the project to `pre-check-research` and standardized the `pcr` CLI/output prefix.
 
@@ -75,7 +75,7 @@ The report is designed to support a defensible review process. It does not repla
 | **P-values** | `p_value_collection` | Domain validity (p outside [0,1]), just-significant clustering |
 | **Statistical text** | `statcheck` | APA/NHST in-text statistic vs reported p-value consistency |
 | **Images** | `image_audit` | Internal duplicates (aHash/dHash/pHash), rotated/flipped copies, copy-move triage, western blot/gel review |
-| **References** | `reference_audit` | DOI/PMID parsing, Crossref/OpenAlex/NCBI metadata queries, citation claim extraction |
+| **References** | `reference_audit` | DOI/PMID parsing, Crossref/OpenAlex/PubPeer/NCBI metadata queries, citation claim extraction |
 | **Code** | `code_audit`, sandbox | Pattern scanning (hardcoded paths, exclusion clues), Python/R script rerun with output capture |
 | **Corpus** | `corpus_signals` | Cross-manuscript text similarity (simhash, Jaccard), reference overlap, papermill phrase signals |
 | **Provenance** | `provenance` | SHA-256 file hashing, append-only JSONL ledger, verify/diff change detection |
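The Provenance row above names three pieces: SHA-256 file hashing, an append-only JSONL ledger, and verify/diff change detection. As a rough illustration of how those pieces can fit together, here is a minimal sketch. The function names (`sha256_of`, `record`, `verify`, `demo`) and the ledger fields (`path`, `sha256`) are assumptions made for this example and are not taken from the project's `provenance` module.

```python
from __future__ import annotations

import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large inputs are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record(ledger: Path, files: list[Path]) -> None:
    """Append one JSON line per file; earlier ledger lines are never rewritten."""
    with ledger.open("a", encoding="utf-8") as out:
        for path in files:
            entry = {"path": str(path), "sha256": sha256_of(path)}
            out.write(json.dumps(entry) + "\n")


def verify(ledger: Path) -> list[str]:
    """Return paths whose current hash differs from their latest ledger entry."""
    latest: dict[str, str] = {}
    for line in ledger.read_text(encoding="utf-8").splitlines():
        entry = json.loads(line)
        latest[entry["path"]] = entry["sha256"]  # later lines win
    changed = []
    for name, digest in latest.items():
        path = Path(name)
        if not path.exists() or sha256_of(path) != digest:
            changed.append(name)
    return changed


def demo() -> list[str]:
    """Record a file, modify it, and report which file names changed."""
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "data.csv"
        ledger = Path(tmp) / "ledger.jsonl"
        data.write_text("a,b\n1,2\n", encoding="utf-8")
        record(ledger, [data])
        assert verify(ledger) == []  # nothing changed yet
        data.write_text("a,b\n1,3\n", encoding="utf-8")
        return [Path(name).name for name in verify(ledger)]
```

Only hashes and paths are written to the ledger in this sketch, which matches the README's statement that the ledger never transmits file contents.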
@@ -283,7 +283,7 @@ Built-in example projects for testing and demonstration:
 ## Privacy and Security
 
 - **All computation is local.** No data is uploaded to external services.
-- External lookups (Crossref, OpenAlex, NCBI) only query public identifiers (DOI, PMID) and can be disabled with `--no-external-lookups`.
+- External lookups (Crossref, OpenAlex, PubPeer, NCBI) only query public identifiers (DOI, PMID) and can be disabled with `--no-external-lookups`.
 - Code reruns execute in temporary project copies with timeouts and minimal environment variables. This is not a strong security sandbox — treat unknown code accordingly.
 - SHA-256 provenance ledgers are append-only and never transmit file contents.
 
 
@@ -20,7 +20,7 @@ From the repo root:
 python3 benchmark/run_benchmark.py
 ```
 
-To skip external Crossref/OpenAlex/NCBI calls:
+To skip external Crossref/OpenAlex/PubPeer/NCBI calls:
 
 ```bash
 python3 benchmark/run_benchmark.py --no-network
@@ -36,7 +36,7 @@ python3 benchmark/run_benchmark.py --regenerate
 
 The suite covers raw data rules, including digit distribution, high-similarity rows/columns, column relationships, rare categories, and ordinal concentration; summary-stat crosscheck; R scrutiny; R statcheck; R rsprite2; p-value collection checks; reference parsing; external metadata lookup; citation claim extraction; papermill light/network signals; image duplicate/copy-move/metadata review; code scan/rerun; unsupported code recording; data trace crosscheck; provenance record/verify; and local corpus screening.
 
-Network coverage uses `inputs/project_external` and expects evidence from Crossref, OpenAlex, and NCBI. Network failures should be interpreted separately from detector regressions because external APIs can be unavailable, rate-limited, or return changed metadata.
+Network coverage uses `inputs/project_external` and expects evidence from Crossref, OpenAlex, PubPeer, and NCBI. Network failures should be interpreted separately from detector regressions because external APIs can be unavailable, rate-limited, require credentials, or return changed metadata.
 
 ## Interpretation
 
 
@@ -16,7 +16,7 @@ Conclusion: The core detection pipeline is stably covered by automated benchmark
 - Raw data: Covers duplicate/highly similar rows and columns, fixed steps, high-frequency values, missing-concentrated-by-group, terminal digit distribution, inter-column relationships, and non-continuous variable anomalies; clean controls maintain 0 risk signals.
 - Summary statistics: Covers SE/SD/N, CI, percent/count, p/t/df, p-value domain, and R scrutiny/SPRITE feasibility checks.
 - In-text statistics: Covers R statcheck p-value consistency checks on APA/NHST expressions.
-- Literature & network: Covers DOI/PMID parsing, Crossref/OpenAlex/NCBI metadata queries, and citation claim extraction.
+- Literature & network: Covers DOI/PMID parsing, Crossref/OpenAlex/PubPeer/NCBI metadata queries, and citation claim extraction.
 - Images: Covers image discovery, internal duplicates, local copy-move, metadata quality, and Western blot/gel review checklist.
 - Code & project: Covers Python/R script reruns, Stata/SPSS/SAS read-only prompts, cross-material data reconciliation, project manifest, provenance version chain, and local corpus screening.
 
@@ -36,18 +36,18 @@ Not executed (--no-network used).
 
 | Case | Type | Pass | Seconds | Risk Signals | Info | Missing Tools | Missing Checks |
 |---|---:|---:|---:|---:|---|---|
-| raw_suspicious | single_run | Yes | 1.16 | 16 | 0 |  |  |
-| raw_clean_control | single_run | Yes | 1.046 | 0 | 0 |  |  |
-| summary_suspicious | single_run | Yes | 2.083 | 17 | 2 |  |  |
-| p_values_suspicious | single_run | Yes | 1.012 | 2 | 0 |  |  |
-| apa_stats_suspicious | single_run | Yes | 2.193 | 2 | 0 |  |  |
-| paper_refs_and_claims_offline | single_run | Yes | 1.011 | 0 | 4 |  |  |
-| analysis_suspicious | single_run | Yes | 1.378 | 1 | 1 |  |  |
-| analysis_manual_unsupported | single_run | Yes | 1.012 | 0 | 3 |  |  |
-| figures_project | project | Yes | 1.205 | 11 | 13 |  |  |
-| project_full | project | Yes | 2.555 | 12 | 19 |  |  |
-| corpus_screen | corpus | Yes | 2.355 | 4 | 0 |  |  |
-| provenance_change | provenance_change | Yes | 2.067 | 1 | 5 |  |  |
+| raw_suspicious | single_run | Yes | 1.284 | 16 | 0 |  |  |
+| raw_clean_control | single_run | Yes | 1.147 | 0 | 0 |  |  |
+| summary_suspicious | single_run | Yes | 2.279 | 17 | 2 |  |  |
+| p_values_suspicious | single_run | Yes | 1.039 | 2 | 0 |  |  |
+| apa_stats_suspicious | single_run | Yes | 2.156 | 2 | 0 |  |  |
+| paper_refs_and_claims_offline | single_run | Yes | 1.072 | 0 | 4 |  |  |
+| analysis_suspicious | single_run | Yes | 1.42 | 1 | 1 |  |  |
+| analysis_manual_unsupported | single_run | Yes | 1.054 | 0 | 3 |  |  |
+| figures_project | project | Yes | 1.201 | 11 | 13 |  |  |
+| project_full | project | Yes | 2.471 | 12 | 19 |  |  |
+| corpus_screen | corpus | Yes | 2.104 | 4 | 0 |  |  |
+| provenance_change | provenance_change | Yes | 2.072 | 1 | 5 |  |  |
 | external_refs_online | project_network | Yes | 0.0 | 0 | 0 |  |  |
 
 ## Tool Coverage
@@ -93,5 +93,5 @@ Not executed (--no-network used).
 ## Interpretation Boundaries
 
 The high/medium/low levels in this report are benchmark risk signals, not conclusions of academic misconduct, fabrication, or fraud. `info` records are run statuses, dependency states, skip reasons, or coverage notes; they do not count toward risk conclusions.
-Network test cases depend on real-time availability, certificate chains, and rate limiting of Crossref, OpenAlex, and NCBI. If network cases fail, first check HTTP/SSL/rate-limit information in evidence before concluding it is a detector regression.
+Network test cases depend on real-time availability, certificate chains, credentials, and rate limiting of Crossref, OpenAlex, PubPeer, and NCBI. If network cases fail, first check HTTP/SSL/rate-limit information in evidence before concluding it is a detector regression.
 All weak-signal tools are only for surfacing human review directions. Final review should return to original data, scripts, image source files, literature metadata, and audit logs.
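The interpretation boundary above asks reviewers to check HTTP/SSL/rate-limit evidence before treating a failed network case as a detector regression. That triage order could be sketched as below. The category labels, the `triage` function, and its inputs (an HTTP status and an error string) are hypothetical; the actual evidence format emitted by the benchmark is not shown in this diff.

```python
from __future__ import annotations

# Hypothetical labels for this sketch; not the project's own vocabulary.
NETWORK = "network_or_service"
CREDENTIALS = "credentials"
DETECTOR = "possible_detector_regression"


def triage(status: int | None, error: str = "") -> str:
    """Classify one failed external lookup from its HTTP status and error text."""
    text = error.lower()
    # No response, or a transport-level failure: blame the network first.
    if status is None or "ssl" in text or "certificate" in text or "timed out" in text:
        return NETWORK
    # Unauthorized or forbidden: the service probably wants credentials.
    if status in (401, 403):
        return CREDENTIALS
    # Rate limiting or a server-side error is still an availability problem.
    if status == 429 or status >= 500:
        return NETWORK
    # The service answered normally, so the mismatch deserves a closer look.
    return DETECTOR
```

Only the last branch suggests looking at the detector itself; every earlier branch points at the external service or its access conditions.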
@@ -110,8 +110,8 @@
       "kind": "project_network",
       "input": "inputs/project_external",
       "expected_tools": ["reference_audit", "citation_claim_check", "provenance_hash"],
-      "expected_checks": ["DOI title mismatch", "PMID title mismatch", "DOI external metadata unverifiable", "PMID metadata verification"],
-      "expected_external_services": ["crossref", "openalex", "ncbi"],
+      "expected_checks": ["DOI title mismatch", "PMID title mismatch", "DOI external metadata absent", "PMID metadata verification"],
+      "expected_external_services": ["crossref", "openalex", "pubpeer", "ncbi"],
       "min_risk_findings": 3
     }
   ]
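To make the effect of the manifest change concrete, the sketch below compares a run's observations with the `expected_checks`, `expected_external_services`, and `min_risk_findings` fields shown in the hunk above. The helper `missing_expectations` and the shape of its arguments are assumptions for illustration; the benchmark runner's real comparison logic is not part of this diff.

```python
from __future__ import annotations

# Values copied from the updated manifest entry.
CASE = {
    "expected_checks": [
        "DOI title mismatch",
        "PMID title mismatch",
        "DOI external metadata absent",
        "PMID metadata verification",
    ],
    "expected_external_services": ["crossref", "openalex", "pubpeer", "ncbi"],
    "min_risk_findings": 3,
}


def missing_expectations(
    case: dict,
    observed_checks: list[str],
    observed_services: list[str],
    risk_findings: int,
) -> dict:
    """Report which manifest expectations a run did not meet."""
    seen_checks = set(observed_checks)
    seen_services = {name.lower() for name in observed_services}
    return {
        "missing_checks": [c for c in case["expected_checks"] if c not in seen_checks],
        "missing_services": [
            s for s in case["expected_external_services"] if s not in seen_services
        ],
        "risk_findings_ok": risk_findings >= case["min_risk_findings"],
    }


# A run that produced every check but never received PubPeer evidence.
report = missing_expectations(
    CASE,
    observed_checks=CASE["expected_checks"],
    observed_services=["Crossref", "OpenAlex", "NCBI"],
    risk_findings=3,
)
```

Under these assumptions, a run that passed against the old manifest would now report `pubpeer` as a missing service, and a run still emitting the old "DOI external metadata unverifiable" check name would report "DOI external metadata absent" as missing.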
 
@@ -1,7 +1,6 @@
 from __future__ import annotations
 
 import json
-import math
 import shutil
 from pathlib import Path
 
@@ -283,7 +282,7 @@ def write_ground_truth() -> None:
                 "known_limitations": [
                     "R tool output depends on statcheck/scrutiny/rsprite2 parsing of column names and text formats.",
                     "Image copy-move is a weak signal; low-texture or regularly repeating graphics may produce false positives/negatives.",
-                    "Project-level external reference queries are disabled in this benchmark; Crossref/OpenAlex/NCBI network reliability is not tested.",
+                    "Project-level external reference queries are disabled in this benchmark; Crossref/OpenAlex/PubPeer/NCBI network reliability is not tested.",
                 ],
             },
             ensure_ascii=False,
 
@@ -23,6 +23,6 @@
   "known_limitations": [
     "R tool output depends on statcheck/scrutiny/rsprite2 parsing of column names and text formats.",
     "Image copy-move is a weak signal; low-texture or regularly repeating graphics may produce false positives/negatives.",
-    "Project-level external reference queries are disabled in this benchmark; Crossref/OpenAlex/NCBI network reliability is not tested."
+    "Project-level external reference queries are disabled in this benchmark; Crossref/OpenAlex/PubPeer/NCBI network reliability is not tested."
   ]
-}
+}
@@ -7,7 +7,7 @@
       "kind": "single_run",
       "ok": true,
       "returncode": 0,
-      "seconds": 1.16,
+      "seconds": 1.284,
       "json_path": "benchmark/reports/pcr.raw_suspicious.json",
       "markdown_path": "benchmark/reports/pcr.raw_suspicious.md",
       "risk_findings": 16,
@@ -38,7 +38,7 @@
       "kind": "single_run",
       "ok": true,
       "returncode": 0,
-      "seconds": 1.046,
+      "seconds": 1.147,
       "json_path": "benchmark/reports/pcr.raw_clean_control.json",
       "markdown_path": "benchmark/reports/pcr.raw_clean_control.md",
       "risk_findings": 0,
@@ -56,7 +56,7 @@
       "kind": "single_run",
       "ok": true,
       "returncode": 0,
-      "seconds": 2.083,
+      "seconds": 2.279,
       "json_path": "benchmark/reports/pcr.summary_suspicious.json",
       "markdown_path": "benchmark/reports/pcr.summary_suspicious.md",
       "risk_findings": 17,
@@ -90,7 +90,7 @@
       "kind": "single_run",
       "ok": true,
       "returncode": 0,
-      "seconds": 1.012,
+      "seconds": 1.039,
       "json_path": "benchmark/reports/pcr.p_values_suspicious.json",
       "markdown_path": "benchmark/reports/pcr.p_values_suspicious.md",
       "risk_findings": 2,
@@ -113,7 +113,7 @@
       "kind": "single_run",
       "ok": true,
       "returncode": 0,
-      "seconds": 2.193,
+      "seconds": 2.156,
       "json_path": "benchmark/reports/pcr.apa_stats_suspicious.json",
       "markdown_path": "benchmark/reports/pcr.apa_stats_suspicious.md",
       "risk_findings": 2,
@@ -135,7 +135,7 @@
       "kind": "single_run",
       "ok": true,
       "returncode": 0,
-      "seconds": 1.011,
+      "seconds": 1.072,
       "json_path": "benchmark/reports/pcr.paper_refs_and_claims_offline.json",
       "markdown_path": "benchmark/reports/pcr.paper_refs_and_claims_offline.md",
       "risk_findings": 0,
@@ -162,7 +162,7 @@
       "kind": "single_run",
       "ok": true,
       "returncode": 0,
-      "seconds": 1.378,
+      "seconds": 1.42,
       "json_path": "benchmark/reports/pcr.analysis_suspicious.json",
       "markdown_path": "benchmark/reports/pcr.analysis_suspicious.md",
       "risk_findings": 1,
@@ -186,7 +186,7 @@
       "kind": "single_run",
       "ok": true,
       "returncode": 0,
-      "seconds": 1.012,
+      "seconds": 1.054,
       "json_path": "benchmark/reports/pcr.analysis_manual_unsupported.json",
       "markdown_path": "benchmark/reports/pcr.analysis_manual_unsupported.md",
       "risk_findings": 0,
@@ -211,7 +211,7 @@
       "kind": "project",
       "ok": true,
       "returncode": 0,
-      "seconds": 1.205,
+      "seconds": 1.201,
       "json_path": "benchmark/reports/pcr.figures_project.json",
       "markdown_path": "benchmark/reports/pcr.figures_project.md",
       "risk_findings": 11,
@@ -255,7 +255,7 @@
       "kind": "project",
       "ok": true,
       "returncode": 0,
-      "seconds": 2.555,
+      "seconds": 2.471,
       "json_path": "benchmark/reports/pcr.project_full.json",
       "markdown_path": "benchmark/reports/pcr.project_full.md",
       "risk_findings": 12,
@@ -308,7 +308,7 @@
       "kind": "corpus",
       "ok": true,
       "returncode": 0,
-      "seconds": 2.355,
+      "seconds": 2.104,
       "json_path": "benchmark/reports/pcr.corpus_screen.json",
       "markdown_path": "benchmark/reports/pcr.corpus_screen.md",
       "risk_findings": 4,
@@ -332,7 +332,7 @@
       "kind": "provenance_change",
       "ok": true,
       "returncode": 0,
-      "seconds": 2.067,
+      "seconds": 2.072,
       "json_path": "benchmark/reports/pcr.provenance_change.json",
       "markdown_path": ".",
       "risk_findings": 1,
 
@@ -16,7 +16,7 @@ Conclusion: The core detection pipeline is stably covered by automated benchmark
 - Raw data: Covers duplicate/highly similar rows and columns, fixed steps, high-frequency values, missing-concentrated-by-group, terminal digit distribution, inter-column relationships, and non-continuous variable anomalies; clean controls maintain 0 risk signals.
 - Summary statistics: Covers SE/SD/N, CI, percent/count, p/t/df, p-value domain, and R scrutiny/SPRITE feasibility checks.
 - In-text statistics: Covers R statcheck p-value consistency checks on APA/NHST expressions.
-- Literature & network: Covers DOI/PMID parsing, Crossref/OpenAlex/NCBI metadata queries, and citation claim extraction.
+- Literature & network: Covers DOI/PMID parsing, Crossref/OpenAlex/PubPeer/NCBI metadata queries, and citation claim extraction.
 - Images: Covers image discovery, internal duplicates, local copy-move, metadata quality, and Western blot/gel review checklist.
 - Code & project: Covers Python/R script reruns, Stata/SPSS/SAS read-only prompts, cross-material data reconciliation, project manifest, provenance version chain, and local corpus screening.
 
@@ -36,18 +36,18 @@ Not executed (--no-network used).
 
 | Case | Type | Pass | Seconds | Risk Signals | Info | Missing Tools | Missing Checks |
 |---|---:|---:|---:|---:|---|---|
-| raw_suspicious | single_run | Yes | 1.16 | 16 | 0 |  |  |
-| raw_clean_control | single_run | Yes | 1.046 | 0 | 0 |  |  |
-| summary_suspicious | single_run | Yes | 2.083 | 17 | 2 |  |  |
-| p_values_suspicious | single_run | Yes | 1.012 | 2 | 0 |  |  |
-| apa_stats_suspicious | single_run | Yes | 2.193 | 2 | 0 |  |  |
-| paper_refs_and_claims_offline | single_run | Yes | 1.011 | 0 | 4 |  |  |
-| analysis_suspicious | single_run | Yes | 1.378 | 1 | 1 |  |  |
-| analysis_manual_unsupported | single_run | Yes | 1.012 | 0 | 3 |  |  |
-| figures_project | project | Yes | 1.205 | 11 | 13 |  |  |
-| project_full | project | Yes | 2.555 | 12 | 19 |  |  |
-| corpus_screen | corpus | Yes | 2.355 | 4 | 0 |  |  |
-| provenance_change | provenance_change | Yes | 2.067 | 1 | 5 |  |  |
+| raw_suspicious | single_run | Yes | 1.284 | 16 | 0 |  |  |
+| raw_clean_control | single_run | Yes | 1.147 | 0 | 0 |  |  |
+| summary_suspicious | single_run | Yes | 2.279 | 17 | 2 |  |  |
+| p_values_suspicious | single_run | Yes | 1.039 | 2 | 0 |  |  |
+| apa_stats_suspicious | single_run | Yes | 2.156 | 2 | 0 |  |  |
+| paper_refs_and_claims_offline | single_run | Yes | 1.072 | 0 | 4 |  |  |
+| analysis_suspicious | single_run | Yes | 1.42 | 1 | 1 |  |  |
+| analysis_manual_unsupported | single_run | Yes | 1.054 | 0 | 3 |  |  |
+| figures_project | project | Yes | 1.201 | 11 | 13 |  |  |
+| project_full | project | Yes | 2.471 | 12 | 19 |  |  |
+| corpus_screen | corpus | Yes | 2.104 | 4 | 0 |  |  |
+| provenance_change | provenance_change | Yes | 2.072 | 1 | 5 |  |  |
 | external_refs_online | project_network | Yes | 0.0 | 0 | 0 |  |  |
 
 ## Tool Coverage
@@ -93,5 +93,5 @@ Not executed (--no-network used).
 ## Interpretation Boundaries
 
 The high/medium/low levels in this report are benchmark risk signals, not conclusions of academic misconduct, fabrication, or fraud. `info` records are run statuses, dependency states, skip reasons, or coverage notes; they do not count toward risk conclusions.
-Network test cases depend on real-time availability, certificate chains, and rate limiting of Crossref, OpenAlex, and NCBI. If network cases fail, first check HTTP/SSL/rate-limit information in evidence before concluding it is a detector regression.
+Network test cases depend on real-time availability, certificate chains, credentials, and rate limiting of Crossref, OpenAlex, PubPeer, and NCBI. If network cases fail, first check HTTP/SSL/rate-limit information in evidence before concluding it is a detector regression.
 All weak-signal tools are only for surfacing human review directions. Final review should return to original data, scripts, image source files, literature metadata, and audit logs.