Releases: Travor278/DocFailBench
DocFailBench v0.1 Combined Public RC
DocFailBench is a failure-oriented benchmark for PDF-to-Markdown, OCR, and VLM document parsers.
Instead of asking whether a parsed page looks roughly similar, this release checks small, auditable facts: table cells, formulas, reading order, captions, page furniture, and optional bbox grounding.
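
To make "small, auditable facts" concrete, here is a minimal sketch of what such checks could look like when run against a parser's Markdown output. The helper names (`appears_before`, `table_cell_equals`) and the sample page are hypothetical illustrations; they are not DocFailBench's actual assertion schema or API.

```python
# Hypothetical illustration only: not DocFailBench's real assertion format.

def parse_table_rows(markdown: str) -> list[list[str]]:
    """Return the cells of every Markdown table row, skipping separator rows."""
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        if all(c and set(c) <= set("-:") for c in cells):
            continue  # separator row such as |---|---|
        rows.append(cells)
    return rows


def table_cell_equals(markdown: str, row: int, col: int, expected: str) -> bool:
    """Check one table cell (0-indexed, header row is row 0)."""
    rows = parse_table_rows(markdown)
    return row < len(rows) and col < len(rows[row]) and rows[row][col] == expected


def appears_before(markdown: str, first: str, second: str) -> bool:
    """Reading-order check: `first` must occur before `second`."""
    i, j = markdown.find(first), markdown.find(second)
    return i != -1 and j != -1 and i < j


# Made-up parser output for a single page.
parsed = (
    "# Results\n\n"
    "| Model | Acc |\n"
    "|---|---|\n"
    "| A | 0.91 |\n\n"
    "Figure 1: Accuracy by model.\n"
)

results = {
    "table cell (1, 1) is 0.91": table_cell_equals(parsed, 1, 1, "0.91"),
    "heading precedes caption": appears_before(parsed, "Results", "Figure 1"),
}
passed = sum(results.values())
print(f"{passed}/{len(results)} assertions passed")
```

Each check yields a single pass or fail that can be traced to one fact on the page, which is what makes per-assertion counts like those in the baseline table possible.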
Frozen target
- Release: DocFailBench-v0.1-combined-public-rc
- Cases: 116
- Assertions: 877
- Cached parser baselines: 7
- Recommended cases file: data/releases/docfailbench_v0_1_combined_public_rc_cases.json
Baseline snapshot
| Parser | Passed | Failed | Score |
|---|---|---|---|
| Marker | 621 | 256 | 0.7081 |
| PyMuPDF bbox | 612 | 265 | 0.6978 |
| Docling | 599 | 278 | 0.6830 |
| PyMuPDF plain | 589 | 288 | 0.6716 |
| Qwen-VL API | 559 | 318 | 0.6374 |
| MinerU | 496 | 381 | 0.5656 |
| PaddleOCR | 334 | 543 | 0.3808 |
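
The Score column appears to be the fraction of assertions passed, that is, passed / (passed + failed). The release does not state this formula, so treat it as an inference from the numbers. The sketch below recomputes each score from the counts in the table and checks that every row sums to the 877 frozen assertions.

```python
# Counts and published scores copied from the baseline table above.
# The formula passed / (passed + failed) is an assumption inferred from the data.
TOTAL_ASSERTIONS = 877

BASELINES = {
    "Marker": (621, 256, 0.7081),
    "PyMuPDF bbox": (612, 265, 0.6978),
    "Docling": (599, 278, 0.6830),
    "PyMuPDF plain": (589, 288, 0.6716),
    "Qwen-VL API": (559, 318, 0.6374),
    "MinerU": (496, 381, 0.5656),
    "PaddleOCR": (334, 543, 0.3808),
}


def score(passed: int, failed: int) -> float:
    """Fraction of assertions that passed."""
    return passed / (passed + failed)


mismatches = []
for name, (passed, failed, published) in BASELINES.items():
    recomputed = round(score(passed, failed), 4)
    total_ok = passed + failed == TOTAL_ASSERTIONS
    score_ok = abs(recomputed - published) < 5e-5
    if not (total_ok and score_ok):
        mismatches.append(name)
    print(f"{name:14s} recomputed={recomputed:.4f} published={published:.4f}")

print("all rows consistent" if not mismatches else f"mismatches: {mismatches}")
```

This only checks the arithmetic of the published table; reproducing the counts themselves requires running the comparison script against the cached baselines.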
Verify cached scores
powershell -ExecutionPolicy Bypass -File scripts\run_combined_public_compare.ps1
Submit a parser
Open an issue or PR with parser version, exact command, prediction JSON, result JSON, and runtime metadata. See docs/submitting-parser-results.md.
Source PDFs are not bundled in git; use the source manifests and fetch/document URLs for reproducibility.