[Task]: Benchmark CLI Integration #166

@samet-akcay

Description

Add a CLI command for running benchmarks from configuration files. The command integrates with jsonargparse for YAML-based composition.

Component

Library (VLA framework)

Acceptance Criteria

  • uv run geti-action benchmark --config <path> command works
  • Loads policy, gyms, and evaluation settings from YAML
  • Supports checkpoint path or HuggingFace model ID
  • Outputs results to specified directory
  • Progress bar during evaluation
  • Example YAML configs for LIBERO-10
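
The "checkpoint path or HuggingFace model ID" criterion implies the policy section should set exactly one of the two keys. A minimal sketch of that check, assuming the parsed YAML arrives as a plain mapping; `PolicySource` and `resolve_policy_source` are hypothetical names for illustration, not existing getiaction APIs:

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class PolicySource:
    """Where a policy is loaded from (hypothetical helper type)."""

    kind: str  # "checkpoint" for a local file, "hub" for a HuggingFace model ID
    location: str


def resolve_policy_source(policy_cfg: Mapping[str, Any]) -> PolicySource:
    """Return the single configured policy source, or raise if 0 or 2 are set."""
    checkpoint = policy_cfg.get("checkpoint")
    hub_id = policy_cfg.get("hub_id")

    # Both missing or both present is ambiguous, so reject it early.
    if (checkpoint is None) == (hub_id is None):
        raise ValueError("policy must set exactly one of 'checkpoint' or 'hub_id'")

    if checkpoint is not None:
        return PolicySource(kind="checkpoint", location=str(checkpoint))
    return PolicySource(kind="hub", location=str(hub_id))
```

Failing fast here would keep a misconfigured run from starting an evaluation before the policy can be loaded.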

Implementation Notes

# configs/benchmark/libero_eval.yaml
benchmark:
  task_suite: libero_10
  num_episodes: 20
  max_steps: 300

policy:
  checkpoint: ./checkpoints/act_libero.ckpt
  # OR: hub_id: open-edge-platform/act-libero-10

evaluation:
  video_dir: ./results/videos
  record_mode: failures
  output_dir: ./results

seed: 42

# CLI usage
uv run geti-action benchmark --config configs/benchmark/libero_eval.yaml
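
The command shape above could be wired up roughly as follows. This sketch uses the standard-library argparse as a stand-in so that it runs without extra dependencies; the issue itself calls for jsonargparse, and `build_parser` is a hypothetical name rather than an existing function in the library:

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing a `benchmark` subcommand that takes --config."""
    parser = argparse.ArgumentParser(prog="geti-action")
    subcommands = parser.add_subparsers(dest="command", required=True)

    benchmark = subcommands.add_parser(
        "benchmark", help="Run a benchmark described by a YAML config file"
    )
    benchmark.add_argument(
        "--config",
        type=Path,
        required=True,
        help="Path to the benchmark YAML file",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    # Loading the YAML and running the evaluation would happen here.
    print(f"command={args.command} config={args.config}")
```

A jsonargparse-based version would additionally let the YAML file populate the parsed namespace, which plain argparse does not do.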

Files:

  • library/src/getiaction/cli/benchmark.py (or extend existing CLI)
  • library/configs/benchmark/libero_eval.yaml
  • library/configs/benchmark/libero_quick.yaml
  • library/tests/integration/test_benchmark_cli.py
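
For the "outputs results to specified directory" criterion, the integration test could assert that a summary file appears under output_dir. A minimal sketch of such a writer, assuming per-episode outcomes are available as booleans; the function name `write_summary`, the file name `summary.json`, and its fields are assumptions for illustration only:

```python
import json
from pathlib import Path
from typing import Sequence


def write_summary(successes: Sequence[bool], output_dir: Path) -> Path:
    """Write aggregate episode results to <output_dir>/summary.json."""
    # Create the output directory (and parents) if it does not exist yet.
    output_dir.mkdir(parents=True, exist_ok=True)

    num_episodes = len(successes)
    num_successes = sum(successes)  # True counts as 1, False as 0
    summary = {
        "num_episodes": num_episodes,
        "num_successes": num_successes,
        # Guard against division by zero when no episodes were run.
        "success_rate": num_successes / num_episodes if num_episodes else 0.0,
    }

    path = output_dir / "summary.json"
    path.write_text(json.dumps(summary, indent=2))
    return path
```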

Parent Epic

#161

Dependencies

No response
