# CLAUDE.md — ocr-bench

OCR model evaluation toolkit. Answers: **"Which OCR model works best for MY documents?"**

Rankings change by document type — the best model for manuscript cards is different from the best for printed books or historical texts. This tool creates per-collection leaderboards using pairwise VLM-as-judge comparisons, so users can find what works for their specific documents.

**Pipeline**: `run` (launch OCR models via HF Jobs) → `judge` (pairwise VLM comparison → Bradley-Terry Elo ratings) → `view` (leaderboard + human validation). Everything lives on the Hugging Face Hub — no local GPU needed.
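
The `judge` step turns pairwise win/loss outcomes into per-model strengths by fitting a Bradley-Terry model. A minimal sketch of that fit using the standard MM (minorization-maximization) update — the repo's actual implementation, function names, and rating scale are not specified here and may differ:

```python
from collections import defaultdict

def bradley_terry(comparisons, iters=200):
    """Fit Bradley-Terry strengths from (winner, loser) pairs via MM updates.

    Returns a dict mapping model name -> strength p, where the modeled
    probability that i beats j is p[i] / (p[i] + p[j]).
    """
    wins = defaultdict(int)    # total wins per model
    games = defaultdict(int)   # unordered pair -> number of comparisons
    models = set()
    for winner, loser in comparisons:
        wins[winner] += 1
        games[frozenset((winner, loser))] += 1
        models.update((winner, loser))

    p = {m: 1.0 for m in models}  # uniform starting strengths
    for _ in range(iters):
        new = {}
        for i in models:
            # MM update: wins_i / sum_j n_ij / (p_i + p_j)
            denom = sum(
                games[frozenset((i, j))] / (p[i] + p[j])
                for j in models
                if j != i and games[frozenset((i, j))]
            )
            new[i] = wins[i] / denom if denom else p[i]
        # Normalize so strengths average to 1 (the model is scale-invariant).
        total = sum(new.values())
        p = {m: v * len(new) / total for m, v in new.items()}
    return p
```

A fitted strength converts to an Elo-like score via `400 * log10(p[m]) + base`; for example, a model that wins 3 of 4 head-to-head comparisons ends up with a strength three times its opponent's, matching the observed 75% win rate.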