Commit 2604ea3

Update benchmark dataset links in eval-results.md (#2118)
1 parent a5b09bf commit 2604ea3

File tree

1 file changed: +2 −2 lines changed


docs/hub/eval-results.md

Lines changed: 2 additions & 2 deletions
@@ -7,7 +7,7 @@ The Hub provides a decentralized system for tracking model evaluation results. B
 
 ## Benchmark Datasets
 
-Dataset repos can be defined as **Benchmarks** (e.g., [AIME](https://huggingface.co/datasets/aime-ai/aime), [HLE](https://huggingface.co/datasets/cais/hle), [GPQA](https://huggingface.co/datasets/Idavidrein/gpqa)). These display a "Benchmark" tag and automatically aggregate evaluation results from model repos across the Hub and display a leaderboard of top models.
+Dataset repos can be defined as **Benchmarks** (e.g., [AIME](https://huggingface.co/datasets/OpenEvals/aime_24), [HLE](https://huggingface.co/datasets/cais/hle), [GPQA](https://huggingface.co/datasets/Idavidrein/gpqa)). These display a "Benchmark" tag and automatically aggregate evaluation results from model repos across the Hub and display a leaderboard of top models.
 
 ![Benchmark Dataset](https://huggingface.co/huggingface/documentation-images/resolve/main/evaluation-results/benchmark-preview.png)
 
@@ -82,4 +82,4 @@ Anyone can submit evaluation results to any model via Pull Request:
 3. Add a `.eval_results/*.yaml` file with your results.
 4. The PR will show as "community-provided" on the model page while open.
 
-For help evaluating a model, see the [Evaluating models with Inspect](https://huggingface.co/docs/inference-providers/guides/evaluation-inspect-ai) guide.
+For help evaluating a model, see the [Evaluating models with Inspect](https://huggingface.co/docs/inference-providers/guides/evaluation-inspect-ai) guide.

0 commit comments
