revert: drop DataRequirement / vla-eval data fetch / cache_key abstractions
Re-orient PR #58 around a smaller infrastructure change. The
``vla-eval data fetch`` subcommand + ``DataRequirement`` declarative
metadata layer were over-built for the actual lifecycle: the
license-acceptance handshake doesn't need to be a separate pre-flight
step; it can be runtime, prompted on first need, just like model-server
git clones already do. Moving the license confirmation to runtime
collapses the asymmetry between benchmark-asset fetch and model-server
clone fetch — both become lazy, both go through the same primitives.
This commit removes the abstraction; the next commit adds the
runtime-license flow and the unified host-cache resolver.
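The prompted-on-first-need handshake described above can be sketched roughly as follows. This is a minimal illustration only: ``ensure_license_accepted`` and the stamp-file layout are hypothetical names, not the actual vla-eval API.

```python
from pathlib import Path


def ensure_license_accepted(license_id: str, stamp_dir: Path, prompt=input) -> bool:
    """Lazily gate an asset behind a license prompt.

    The first time an asset with `license_id` is needed, ask the user;
    record acceptance as a stamp file so later runs stay silent.
    (Illustrative sketch; names are assumptions.)
    """
    stamp = stamp_dir / license_id
    if stamp.exists():
        return True  # accepted on a previous run, no prompt
    answer = prompt(f"Accept the terms of license '{license_id}'? [y/N] ")
    if answer.strip().lower() != "y":
        return False
    stamp_dir.mkdir(parents=True, exist_ok=True)
    stamp.touch()  # remember acceptance so the prompt never repeats
    return True
```

The stamp file is what makes the flow runtime rather than pre-flight: no separate ``data fetch`` step, and a declined license simply aborts the run that needed the asset.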
Removed:
- ``src/vla_eval/cli/cmd_data.py`` (the ``vla-eval data fetch``
subcommand and its docker-side fetch dispatch).
- ``DataRequirement`` dataclass and ``Benchmark.data_requirements``
  classmethod in ``src/vla_eval/benchmarks/base.py``.
- ``Behavior1KBenchmark.data_requirements`` method.
- ``cmd_data.register(sub)`` wiring in ``cli/main.py``.
Reverted to the PR #57 baseline:
- ``configs/behavior1k_eval.yaml`` — the data-fetch comment block and
the OmegaConf volume interpolation; the next commit puts the
interpolation back in extended XDG-aware form.
- ``docs/reproductions/behavior1k.md`` step 2.
- ``.claude/skills/add-benchmark/SKILL.md`` ``data_requirements``
section.
Kept (independent improvements that survive this rewrite):
- ``cli/_console.py``, ``cli/_docker.py`` (helper hoists).
- ``cli/config_loader.py`` always-on OmegaConf interpolation.
- ``Behavior1KBenchmark.task_instance_id`` per-episode sweep.
- Demo-replay per-(session, episode) cursor + ``on_episode_start``
fail-loud hook.
- ``Behavior1KBenchmark.get_metadata`` declaring ``action_dim=23``.
- README "Build-locally images" caption + 🔒 marker on rlbench/
behavior1k rows; CONTRIBUTING benchmark roster refresh.
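For the "extended XDG-aware" host-cache default mentioned above, a plausible stdlib-only sketch of the fallback chain (the ``VLA_EVAL_DATA_DIR`` variable appears in this commit's diff; the ``default_data_dir`` helper and the exact XDG fallback are assumptions about the next commit, not code from it):

```python
import os
from pathlib import Path


def default_data_dir() -> Path:
    """Resolve the host cache dir for benchmark assets.

    Precedence (illustrative): explicit VLA_EVAL_DATA_DIR, then
    $XDG_CACHE_HOME/vla-eval, then ~/.cache/vla-eval.
    """
    explicit = os.environ.get("VLA_EVAL_DATA_DIR")
    if explicit:
        return Path(explicit)
    xdg = os.environ.get("XDG_CACHE_HOME", str(Path.home() / ".cache"))
    return Path(xdg) / "vla-eval"
```

A config interpolation like ``${VLA_EVAL_DATA_DIR:-~/.cache/vla-eval}`` hard-codes the second fallback; routing it through one resolver is what lets benchmark-asset mounts and model-server clones share the same host cache.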
Diff: ``.claude/skills/add-benchmark/SKILL.md`` (0 additions, 29 deletions)
```diff
@@ -121,35 +121,6 @@ class MyBenchmark(StepBenchmark):
 **Image preprocessing**: Handle non-standard images (flipped, wrong resolution) in `make_obs()`.

 **EGL headless rendering**: Add `os.environ.setdefault("PYOPENGL_PLATFORM", "egl")` at module top if the sim uses OpenGL.

-### Optional: external dataset declaration
-
-If the benchmark's dataset is licensed independently and shouldn't be baked into the docker image, override `data_requirements()` (classmethod) so the harness's uniform fetch path picks it up:
-
-```python
-from vla_eval.benchmarks.base import DataRequirement
-
-Users then run `vla-eval data fetch -c configs/<name>_eval.yaml --accept-license <license_id>` once. The fetcher mounts `${VLA_EVAL_DATA_DIR:-~/.cache/vla-eval}/<cache_key>` read-write at `container_data_path` and runs `download_command`. The eval config's `volumes:` entry should mount the same host path read-only via OmegaConf interpolation:
```