feat(verify-refs): v1.2.0 — full-author cross-check + PubMed efetch authoritative #18

Closed
Yoojin-nam wants to merge 1 commit into main from feat/verify-refs-full-author

@Yoojin-nam
Contributor

Summary

Splits commit 0679c3e out of PR #10 (feat/academic-aio-phase-lift-dedupe), where it was riding under an unrelated academic-aio schema change. This is the verify-refs v1.2.0 work tracked by issue #17.

  • Full-author family-name cross-check (not just first-author) — catches the liu2026benchmarking failure mode where 7/10 given names were hallucinated but family names + first-author matched.
  • PubMed efetch.fcgi (XML) prioritized over CrossRef API for author retrieval — CrossRef has documented given-name drift (Aydin 2024 Vlachos: CrossRef "Vasileios" vs PubMed "Victoria", PubMed correct).
  • verify_refs.py +338/−68; touches lit-sync, search-lit, verify-refs SKILL.md + render_pandoc.sh + skill.yml.
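
The full-author check in the first bullet can be sketched as a position-by-position comparison of family names. This is a minimal illustration, not the code in verify_refs.py: `AuthorCheck` and `cross_check_families` are hypothetical names, and the matching rule (case-insensitive equality) is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorCheck:
    status: str                                   # "OK" or "MISMATCH"
    notes: list[str] = field(default_factory=list)


def cross_check_families(cited: list[str], actual: list[str]) -> AuthorCheck:
    """Compare every cited family name with the authoritative list, in order."""
    notes = []
    # A differing author count is reported even if all overlapping names match.
    if len(cited) != len(actual):
        notes.append(f"count mismatch: cited {len(cited)}, actual {len(actual)}")
    # zip() stops at the shorter list, so only overlapping positions are compared.
    for i, (c, a) in enumerate(zip(cited, actual)):
        if c.casefold() != a.casefold():
            kind = "first-author" if i == 0 else "non-first-author"
            notes.append(f"{kind} mismatch at position {i + 1}: cited {c!r}, actual {a!r}")
    return AuthorCheck("MISMATCH" if notes else "OK", notes)
```

A first-author-only check would pass any entry whose first name is right; here a wrong name at position 2 or later is enough to flag the record.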

Why draft

Issue #17 requires test coverage that this commit does not yet include:

Marking draft until the above lands. The implementation itself is complete and was cherry-picked verbatim (0679c3e → e322236, clean, with no conflicts against current main).

Provenance

The original session that authored 0679c3e is closed. This PR was reconstructed by cherry-picking the commit onto a clean branch from main (post-#13). PR #10 will be rebased to drop 0679c3e so it carries only the academic-aio schema change.

Closes #17 once the test checklist above is complete.

🤖 Generated with Claude Code

…uthoritative

verify-refs v1.1.x checked only the first author's family name, so
hallucinated family names at positions #2..#N passed the audit. Paper 1
npj DM 2026-05-11 incident: liu2026benchmarking was registered with 10 names,
but 7/10 first names were AI-fabricated
(Yishu/Zifeng Ingram/Xue/Linhao/Samiran/Tobias/Zhijian) and would have
shipped to reviewers.

verify-refs/scripts/verify_refs.py:
- RefRecord extended with cited_authors[], actual_authors[], cited_author_count,
  actual_author_count, audit_truncated. Backward-compatible (first_author_guess
  populated from cited_authors[0] when available).
- parse_bib_authors() new — splits BibTeX `author` field by ' and ', strips
  LaTeX accents and braces. parse_bib() now uses balanced-brace parsing so
  entries like Park2025korean (author contains `Monta\~{n}\`{a}` LaTeX
  escape) extract all 4 names instead of stopping at the first inner `}`.
- verify_pubmed_efetch() new — authoritative XML full-record source. Reads
  <LastName> + <ForeName> per <Author ValidYN="Y"> element. Used in
  preference to CrossRef when PMID present (Paper 1 Aydin 2024 Vlachos
  incident: CrossRef returned given name `Vasileios`, PubMed/Zotero
  authoritative `Victoria` — PubMed efetch wins).
- verify_crossref(), verify_pubmed_pmid() now return full family-name list
  (not first only).
- verify_record() rewritten — PMID→efetch→CrossRef fallback, every cited
  author compared against actual_authors[i], total counts compared, MISMATCH
  status raised on any mismatch; note classification distinguishes
  first-author hallucination vs non-first-author/count mismatch.
- _normalize_surname() rewritten with unicodedata.NFKD decomposition +
  multi-char fallback table (ł đ ı ø æ œ ß). Eliminates Paper 1
  gunes2025textual `Çolakoğlu` vs PubMed `Colakoglu` false-positive.
- `_audit_truncated = true` BibTeX field — opt-in marker for intentional CSL
  truncation (e.g., Nature first-1+et-al rendering with only 6 of 57 authors
  in bib). Suppresses count-mismatch MISMATCH while still reporting the gap
  as a NOTE in evidence.
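
The surname normalization described above can be sketched as follows. This is an assumption-laden illustration rather than the actual `_normalize_surname()`: the function name, the order of the steps, and the exact fallback mappings are guesses based on the characters listed in this message.

```python
import unicodedata

# Letters that NFKD does not split into "base letter + combining mark",
# so they need an explicit mapping (assumed mappings, for illustration).
_FALLBACK = {"ł": "l", "đ": "d", "ı": "i", "ø": "o", "æ": "ae", "œ": "oe", "ß": "ss"}


def normalize_surname(name: str) -> str:
    """Lowercase a surname and reduce it to unaccented base letters."""
    lowered = name.strip().lower()
    # Apply the multi-char fallback table first.
    replaced = "".join(_FALLBACK.get(ch, ch) for ch in lowered)
    # NFKD turns e.g. "ç" into "c" + COMBINING CEDILLA; drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", replaced)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

With this sketch, `normalize_surname("Çolakoğlu")` and `normalize_surname("Colakoglu")` both reduce to the same string, so the two spellings compare equal instead of raising a false MISMATCH.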

verify-refs/SKILL.md, skill.yml: version 1.2.0 + changelog section.

lit-sync/SKILL.md: post-BBT-export regression audit hook — run verify-refs
v1.2.0 on the refreshed .bib and block on MISMATCH so AI/BBT-introduced
hallucinations are caught before downstream consumers read the file.

search-lit/SKILL.md: anti-hallucination protocol updated — compare full
author list, not first+last only. Authoritative source priority documented:
PubMed efetch > esummary > CrossRef (CrossRef given names not authoritative).
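
As a rough sketch of the top-priority source, the author list can be read from a PubMed efetch XML record with the standard library. The function name `parse_efetch_authors` and the hand-written sample record (with placeholder names) are assumptions for illustration; a real record would come from `efetch.fcgi` with `db=pubmed`, a PMID in `id`, and `retmode=xml`.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the shape of a PubMed efetch record (placeholder names).
SAMPLE_XML = """
<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <Article>
        <AuthorList>
          <Author ValidYN="Y"><LastName>Doe</LastName><ForeName>Jane</ForeName></Author>
          <Author ValidYN="N"><LastName>Old</LastName><ForeName>Entry</ForeName></Author>
          <Author ValidYN="Y"><CollectiveName>Example Study Group</CollectiveName></Author>
          <Author ValidYN="Y"><LastName>Roe</LastName><ForeName>Richard</ForeName></Author>
        </AuthorList>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>
"""


def parse_efetch_authors(xml_text: str) -> list[tuple[str, str]]:
    """Return (family name, given name) pairs for valid personal authors."""
    root = ET.fromstring(xml_text)
    authors = []
    for author in root.iter("Author"):
        if author.get("ValidYN", "Y") != "Y":
            continue                      # skip entries marked invalid
        last = author.findtext("LastName")
        if last is None:
            continue                      # e.g. a <CollectiveName> group author
        authors.append((last, author.findtext("ForeName") or ""))
    return authors
```

Because both `<LastName>` and `<ForeName>` are returned, the same record can back a family-name comparison and a given-name comparison against CrossRef.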

manage-refs/scripts/render_pandoc.sh: pre-render verify-refs v1.2.0 audit
gate (-S flag to bypass). Blocks pandoc render on MISMATCH so a missed
hallucination cannot ship into a journal-formatted docx.

Tests: tests/test_phase1c_hooks.sh 12/12 pass post-change. Paper 1
paper1.bib 69-entry audit: 58 OK + 11 UNVERIFIED (arXiv, known FP) + 0
MISMATCH; submission_safe=true.
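
The audit summary above could be derived from per-entry statuses roughly as follows. This is a sketch under one stated assumption: that only MISMATCH makes a bibliography unsafe and UNVERIFIED does not. The function name `summarize_audit` is hypothetical.

```python
from collections import Counter


def summarize_audit(statuses: list[str]) -> dict:
    """Tally per-entry statuses and derive a single go/no-go flag."""
    counts = Counter(statuses)
    return {
        "total": len(statuses),
        "counts": dict(counts),
        # Assumption: UNVERIFIED entries are reported but do not block submission.
        "submission_safe": counts["MISMATCH"] == 0,
    }


# The 69-entry audit reported above: 58 OK + 11 UNVERIFIED + 0 MISMATCH.
summary = summarize_audit(["OK"] * 58 + ["UNVERIFIED"] * 11)
```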

Refs:
- ~/.claude/rules/citation-safety.md v1.1.4
- Memory: paper1_npj_dm_r1_v2_circulation_dispatched_2026-05-11
- Memory: feedback_paper1_bib_stale_post_bbt_2026-05-11
- Memory: feedback_codex_review_design_hallucination_2026-05-11

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Yoojin-nam
Contributor Author

Superseded by #23: the v1.2.0 full-author cross-check with PubMed efetch as the authoritative source was merged via PR #23 (2026-05-19). Closing this branch to remove the duplicate v1.2.0 lineage.

Yoojin-nam closed this May 23, 2026
Yoojin-nam deleted the feat/verify-refs-full-author branch May 23, 2026 06:01
Development

Successfully merging this pull request may close these issues.

verify_refs.py: implement full-author PubMed efetch (citation-safety.md v1.2.0)
