Anti-hallucination plugin for academic research in Claude Code. Create literature reviews and papers guaranteed without fabricated citations.
Citation hallucinations in academic writing — fabricated quotes, inverted attributions, misidentified authors — are systemic when the author writes from memory and looks up sources afterward. Retracted papers from major venues have shown the failure mode is structural, not anecdotal.
paper-trail makes the error mechanically impossible by enforcing
a research-first workflow with strict state transitions, automated PDF
acquisition with anti-homonymy validation, and per-citation audit
against the actual source text.
- Strict state machine (8 states) — no reference can be cited without traversing acquisition → page 1 validation → claim verification
- Anti-homonymy guard — page 1 of each downloaded PDF is matched against expected author, title, and domain keywords before acceptance
- 10-source acquisition cascade — Crossref OA, arXiv, OpenAlex,
Unpaywall, HAL, CORE, archive.org, WebSearch queue (+ Sci-Hub and
Anna's Archive available as strict opt-in, see
DISCLAIMER.md) - 19 mechanical invariants — automated registry health check with safe auto-fix for cosmetic drift
- Inverted writing workflow — researches and reads first, writes last; refuses to draft when source corpus is too thin
- Per-citation audit — generates
RECEIPTS.mdclassifying each citation as VALID / ADJUST / INVALID / UNVERIFIABLE - Vault-agnostic — works with Obsidian, flat markdown, or Zotero layouts (adapter pattern)
- Mechanical coverage guard — refuses to ship a new version without explicit test evidence for each component
paper-trail's acquisition cascade and search layer cover the major academic indexing and full-text platforms.
| Source | Coverage | Used for |
|---|---|---|
| Crossref | Cross-domain DOI registry, 150M+ records | DOI lookup, metadata, open-access URL |
| arXiv | Preprints (CS, math, physics, q-bio, q-fin, stats, EE) | Full-text PDFs of preprints |
| OpenAlex | 200M+ scholarly works, cross-domain aggregator | Metadata, abstracts, citation graph |
| Unpaywall | 30M+ free full-text articles | OA PDF discovery |
| HAL | French academic repository | Full-text, OA |
| CORE | UK-based aggregator, 200M+ OA records | Full-text fallback |
| archive.org | Digitized books and articles, Internet Archive | Books, older publications, scans |
| Semantic Scholar | AI-curated academic graph, 200M+ papers | Cross-reference, related papers |
| PubMed / PMC / bioRxiv / medRxiv | Biomedical, preprints | Biomedical full-text |
| Zenodo / SSRN / DBLP / DOAJ / BASE / IACR / EuropePMC | Cross-domain | Additional metadata and full-text |
| Source | Coverage | Activation |
|---|---|---|
| Sci-Hub | Paywalled scholarly literature, ~88M papers | RESEARCH_ENABLE_SHADOW_LIBS=1 |
| Anna's Archive | Books and articles, aggregates Library Genesis, Sci-Hub, Z-Library | RESEARCH_ENABLE_SHADOW_LIBS=1 |
Shadow-library activation is explicit and per-session. A
disclaimer prints to stderr on first use. The user is responsible
for legal compliance in their jurisdiction. See
DISCLAIMER.md.
| MCP | Coverage | Status |
|---|---|---|
| paper-search MCP | Unified API over 22 platforms above | Required for SOTA writing — install from git (see INSTALL.md) |
| NotebookLM MCP | Books corpus (Q&A with citations) | Optional — RESEARCH_ENABLE_NOTEBOOKLM=1 |
| RTFM MCP | Local indexed corpus (code, docs, research) | Optional — failure correlation only |
⚠️ Do not installpaper-search-mcpfrom PyPI — the published version (0.1.3) is severely outdated (13 / 63 tools, no OpenAlex, no Crossref, no Semantic Scholar). Use the git HEAD recipe inINSTALL.md.
In a Claude Code session:
/plugin install file:///path/to/paper-trail
Or via marketplace (when published):
/plugin marketplace add roomi-fields
/plugin install paper-trail
RESEARCH_VAULT_PATH is required — the plugin refuses to start
without it. Set it (and any reusable secret like S2_API_KEY) once
in ~/.config/paper-trail/env so every project picks them up
automatically :
mkdir -p ~/.config/paper-trail
cat > ~/.config/paper-trail/env <<'EOF'
RESEARCH_VAULT_PATH=~/Documents/MyResearch
RESEARCH_CONTACT_EMAIL=you@example.org
S2_API_KEY=s2k-...
EOF
chmod 600 ~/.config/paper-trail/envPer-project shell overrides take priority. For complete configuration,
see INSTALL.md.
python3 -m pipeline preflightReports vault, Python deps, paper-search MCP registration, and any
missing optional secret — with a copy-pastable fix command for each
problem.
/paper-trail:status # overview of the registry
/paper-trail:new-sota "your research topic"
/paper-trail:audit-article path/to/your/paper.tex
Three primary workflows, all enforced by the plugin's state machine and pre-write hooks:
/paper-trail:new-sota "topic"
- Multi-source search across 22 academic platforms via the
paper-searchMCP - You select candidate references from proposed matches
- Automated PDF acquisition with mandatory page 1 anti-homonymy validation
- Structured notes extracted into each reference's markdown body
- Final SOTA cites only references that reached
page1_validatedstate; rejected candidates are listed with reasons
/paper-trail:audit-sota path/to/SOTA.md [--purge]
/paper-trail:audit-article path/to/paper.tex [--warn]
Classifies each citation, optionally purges hallucinations from the
SOTA (with .bak backup), or inserts inline warnings adjacent to
problematic citations in the paper.
/paper-trail:cascade <slug> # acquire a specific reference
/paper-trail:doctor [--fix] # check and repair registry consistency
/paper-trail:reactivate-ocr # resume OCR-waiting refs
User → 9 slash commands → 6 skills → 4 sub-agents → Worker engine
↓
YAML registry
- Skills orchestrate (Claude Code markdown)
- Sub-agents isolate heavy work (PDF reading, search) from main context
- Worker engine (Python) enforces the FSM, runs the cascade, checks invariants — deterministic, testable
- YAML registry is the source of truth (one file per reference, Obsidian-compatible frontmatter)
See docs/ARCHITECTURE.md for the full
diagram and pipeline/ARCHITECTURE.md
for the worker engine internals.
docs/USAGE.md— daily workflowsdocs/ARCHITECTURE.md— system overviewdocs/LEGAL.md— licensing and attributionDISCLAIMER.md— shadow libraries opt-in policyNOTICE.md— third-party attributionsCHANGELOG.md— version history
| Variable | Default | Purpose |
|---|---|---|
RESEARCH_VAULT_PATH |
— (required) | Vault root — plugin refuses to start without it |
RESEARCH_SOURCES_PATH |
$VAULT/10_SOURCES |
PDF directory |
RESEARCH_REGISTRY_PATH |
$SOURCES/_registry |
YAML registry |
RESEARCH_VAULT_LAYOUT |
obsidian |
Adapter (obsidian / flat / zotero) |
RESEARCH_RTFM_DB |
unset | Optional RTFM SQLite DB for failure correlation |
RESEARCH_CONTACT_EMAIL |
anonymous@example.org |
Email sent to Crossref, Semantic Scholar (politeness) |
S2_API_KEY |
unset | Optional Semantic Scholar API key (stricter rate limit without) |
RESEARCH_ENABLE_SHADOW_LIBS |
unset | Enable Anna's Archive & Sci-Hub (opt-in, see DISCLAIMER) |
RESEARCH_ENABLE_NOTEBOOKLM |
unset | Enable NotebookLM in sota-writer phase A |
RESEARCH_SKIP_END_DOCTOR |
unset | Skip the SessionEnd consistency check |
The PROJECT_AUTHORS whitelist used by the Semantic Scholar resolver
(group names treated as anonymous in queries — e.g. "Anonymous", "Group")
is loaded from $XDG_CONFIG_HOME/paper-trail/project_authors.txt
(default ~/.config/paper-trail/project_authors.txt), one entry per line.
The file is optional; if absent the whitelist is empty.
MIT — see LICENSE.
This plugin builds on patterns and components from several open-source
projects in the Claude Code ecosystem. See NOTICE.md
for detailed attributions.
Issues and pull requests welcome at github.com/roomi-fields/paper-trail/issues. The plugin is in active development; structural changes may happen between minor versions before v1.0.