
Commit 5e329a0

MilkClouds and claude committed
leaderboard: docs cleanup — stale CLAUDE.md path, LFS contributor note
Two small fixes surfaced while auditing the leaderboard workflow docs:

- leaderboard/CLAUDE.md referenced data/results.json, which has not existed since PR #35 split the data files (the curated entries live in data/leaderboard.json now).
- leaderboard/CONTRIBUTING.md had no mention of git-lfs, even though data/{leaderboard,extractions,scan_results}.json are LFS-tracked per data/.gitattributes. A first-time contributor without `git lfs install` reads pointer files and trips validate.py / build scripts. CI uses lfs: true on checkout (added in #46), but local setup was undocumented. Added a "Local Setup" section with the one-time install/pull commands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
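The failure mode described above is easy to spot by hand: an un-smudged LFS file is a short text stub beginning with a fixed spec line, not real JSON. A minimal sketch of a check (the `/tmp` paths and the `is_lfs_pointer` helper are illustrative, not part of this commit):

```shell
# Detect whether a file is an un-smudged Git LFS pointer stub.
# Per the LFS pointer spec, such files start with:
#   version https://git-lfs.github.com/spec/v1
is_lfs_pointer() {
  head -c 40 "$1" | grep -q '^version https://git-lfs'
}

# Demo on synthetic files (hypothetical paths, for illustration only):
printf 'version https://git-lfs.github.com/spec/v1\noid sha256:abc\nsize 9\n' > /tmp/pointer.json
printf '{"entries": []}\n' > /tmp/real.json

is_lfs_pointer /tmp/pointer.json && echo "pointer stub - run: git lfs pull"
is_lfs_pointer /tmp/real.json    || echo "real JSON - already smudged"
```

`git lfs ls-files` also lists which tracked files are currently smudged, which is the quicker check inside a real checkout.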
1 parent 2b51704 commit 5e329a0

2 files changed (12 additions & 1 deletion)

leaderboard/CLAUDE.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -4,7 +4,7 @@ This directory is a **standalone static leaderboard site** — it is NOT vla-eva
 ## Key distinction
 
-- `leaderboard/data/results.json` contains scores **reported in published papers**. These are manually curated, not produced by running `vla-eval`.
+- `leaderboard/data/leaderboard.json` contains scores **reported in published papers**. These are manually curated or produced by the `extract.py` / `refine.py` pipeline — not produced by running `vla-eval`.
 - `vla-eval` is the evaluation harness (the parent repo). Its runtime outputs go to `results/` or user-specified paths, never here.
 
 Do not conflate leaderboard data with vla-eval reproduced results. They are independent.
```

leaderboard/CONTRIBUTING.md

Lines changed: 11 additions & 0 deletions
````diff
@@ -2,6 +2,17 @@
 
 > **Note on evaluation protocols:** Benchmark evaluation protocols are not fully standardized across the VLA community. Different papers may use the same benchmark name but differ in training regimes, task subsets, or evaluation conditions — making scores not always directly comparable. This leaderboard records all available results transparently and documents known protocol differences, but gaps remain. We actively welcome contributions: score corrections, missing results, protocol clarifications, and proposals for standardization.
 
+## Local Setup
+
+`leaderboard/data/{leaderboard,extractions,scan_results}.json` are stored in **Git LFS** (see `leaderboard/data/.gitattributes`). Without LFS smudging, scripts will read pointer files and fail. Once per machine:
+
+```
+git lfs install  # install the LFS hooks
+git lfs pull     # smudge any pointer files in the current checkout
+```
+
+CI workflows (`pages.yml`, `update-data.yml`, `leaderboard-validate.yml`) already pass `lfs: true` to `actions/checkout`.
+
 ## Data Structure
 
 Data is split into focused files under `leaderboard/data/`:
````
