Description
According to the official MetaWorld documentation (https://metaworld.farama.org/benchmark/action_space/), the action space is defined as Box(-1.0, 1.0, (4,), float32), representing normalized Cartesian displacements (dx, dy, dz) and gripper control.
However, when I checked the dataset statistics at:
https://huggingface.co/datasets/lerobot/metaworld_mt50/resolve/main/meta/stats.json
I found that the action ranges are:
min: [-11.47, -14.93, -17.28, -1.0]
max: [19.46, 12.47, 24.63, 1.0]
The first three dimensions (dx, dy, dz) appear to be well outside the expected [-1, 1] range, while the gripper dimension remains within [-1, 1].
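For reference, here is a minimal sketch of the comparison, using only the min/max values quoted above and the documented [-1, 1] bounds. The dimension labels and the helper name `out_of_bounds_dims` are my own, for illustration; they do not come from MetaWorld or LeRobot.

```python
# Values copied from the dataset's meta/stats.json, as quoted above.
ACTION_MIN = [-11.47, -14.93, -17.28, -1.0]
ACTION_MAX = [19.46, 12.47, 24.63, 1.0]

# Documented bounds: Box(-1.0, 1.0, (4,), float32).
LOW, HIGH = -1.0, 1.0

# Labels are illustrative only, following the (dx, dy, dz, gripper) order in the docs.
DIM_NAMES = ["dx", "dy", "dz", "gripper"]


def out_of_bounds_dims(mins, maxs, low=LOW, high=HIGH):
    """Return the names of dimensions whose [min, max] range exceeds [low, high]."""
    return [
        name
        for name, lo, hi in zip(DIM_NAMES, mins, maxs)
        if lo < low or hi > high
    ]


print(out_of_bounds_dims(ACTION_MIN, ACTION_MAX))  # ['dx', 'dy', 'dz']
```

Only the three displacement dimensions are flagged; the gripper range matches the documented bounds exactly.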
What do the action values in the dataset represent?
How should these actions be converted when deploying a trained policy back to the MetaWorld environment?