davanstrien
diff --git a/.gitignore b/.gitignore
Lines changed: 1 addition & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 4 additions & 0 deletions
diff --git a/assets/elo-scatter.png b/assets/elo-scatter.png
86.9 KB
diff --git a/assets/leaderboard.png b/assets/leaderboard.png
189 KB
--- a/.gitignore
+++ b/.gitignore
@@ -6,6 +6,7 @@ dist/
 
 # Screenshots and temp files from dev/testing
 *.png
+!assets/*.png
 *.jpeg
 *.json
 !pyproject.toml
 
--- a/README.md
+++ b/README.md
@@ -22,6 +22,8 @@ ocr-bench lets you run the same set of OCR models on a sample of _your_ collecti
 
 Rankings can flip completely between collections.
 
+![ELO vs Parameter Count — smaller models can win on the right documents](assets/elo-scatter.png)
+
 ## Hub-native by design
 
 The entire evaluation loop lives on the Hugging Face Hub:
@@ -79,6 +81,8 @@ ocr-bench run <dataset> <output> --models glm-ocr lighton-ocr-2
 
 ## Example results
 
+![Leaderboard viewer with ELO ratings, confidence intervals, and human validation](assets/leaderboard.png)
+
 Browse these on the Hub:
 
 - [davanstrien/ocr-bench-britannica-results-qwen35](https://huggingface.co/datasets/davanstrien/ocr-bench-britannica-results-qwen35) — Encyclopaedia Britannica 1771, 5 models, 50 samples