Skip to content

Commit 78692be

Browse files
authored
Adds dexterous cube manipulation environment for Allegro hand (#499)
# Description This MR adds the dexterous cube manipulation environment from IsaacGymEnvs. The implementation is mostly based on the standard AllegroHand environment. However, it includes the following components from the Dextreme work: * Randomization of mass, joint PD gains, friction, and initial state distribution * Tuning of RL-Games from Dextreme work * Exponential moving average (bounded joint position) action term However, it does the following differently since it led to better convergence: * Changes the way out-of-reach termination is computed. Original work seems to do it w.r.t. the goal position, but that seems unnecessary * Removed goal position racking reward. It was tuned too high, which made learning difficult and is not needed ## Type of change - New feature (non-breaking change which adds functionality) ## Screenshots Trained with RSL-RL https://github.com/isaac-orbit/orbit/assets/12863862/8f51e468-2d93-4520-9689-f1e8f1a898e6 ## Checklist - [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./orbit.sh --format` - [x] I have made corresponding changes to the documentation - [ ] My changes generate no new warnings - [ ] I have added tests that prove my fix is effective or that my feature works - [x] I have run all the tests with `./orbit.sh --test` and they pass - [x] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file - [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there
1 parent b178171 commit 78692be

File tree

21 files changed

+1284
-3
lines changed

21 files changed

+1284
-3
lines changed
Loading

docs/source/features/environments.rst

+4-1
Original file line numberDiff line numberDiff line change
@@ -66,19 +66,22 @@ for the reach environment:
6666
+----------------+---------------------+-----------------------------------------------------------------------------+
6767
| |cabi-franka| | |cabi-franka-link| | Grasp the handle of a cabinet's drawer and open it with the Franka robot |
6868
+----------------+---------------------+-----------------------------------------------------------------------------+
69+
| |cube-allegro| | |cube-allegro-link| | In-hand reorientation of a cube using Allegro hand |
70+
+----------------+---------------------+-----------------------------------------------------------------------------+
6971

7072
.. |reach-franka| image:: ../_static/tasks/manipulation/franka_reach.jpg
7173
.. |reach-ur10| image:: ../_static/tasks/manipulation/ur10_reach.jpg
7274
.. |lift-cube| image:: ../_static/tasks/manipulation/franka_lift.jpg
7375
.. |cabi-franka| image:: ../_static/tasks/manipulation/franka_open_drawer.jpg
76+
.. |cube-allegro| image:: ../_static/tasks/manipulation/allegro_cube.jpg
7477

7578
.. |reach-franka-link| replace:: `Isaac-Reach-Franka-v0 <https://github.com/NVIDIA-Omniverse/orbit/blob/main/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/manipulation/reach/config/franka/joint_pos_env_cfg.py>`__
7679
.. |reach-ur10-link| replace:: `Isaac-Reach-UR10-v0 <https://github.com/NVIDIA-Omniverse/orbit/blob/main/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/manipulation/reach/config/ur_10/joint_pos_env_cfg.py>`__
7780
.. |lift-cube-link| replace:: `Isaac-Lift-Cube-Franka-v0 <https://github.com/NVIDIA-Omniverse/orbit/blob/main/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/manipulation/lift/config/franka/joint_pos_env_cfg.py>`__
7881
.. |lift-cube-ik-abs-link| replace:: `Isaac-Lift-Cube-Franka-IK-Abs-v0 <https://github.com/NVIDIA-Omniverse/orbit/blob/main/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/manipulation/lift/config/franka/ik_abs_env_cfg.py>`__
7982
.. |lift-cube-ik-rel-link| replace:: `Isaac-Lift-Cube-Franka-IK-Rel-v0 <https://github.com/NVIDIA-Omniverse/orbit/blob/main/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/manipulation/lift/config/franka/ik_rel_env_cfg.py>`__
8083
.. |cabi-franka-link| replace:: `Isaac-Open-Drawer-Franka-v0 <https://github.com/NVIDIA-Omniverse/orbit/blob/main/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/manipulation/cabinet/config/franka/joint_pos_env_cfg.py>`__
81-
84+
.. |cube-allegro-link| replace:: `Isaac-Repose-Cube-Allegro-v0 <https://github.com/NVIDIA-Omniverse/orbit/blob/main/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/manipulation/inhand/config/allegro/allegro_env_cfg.py>`__
8285

8386
Locomotion
8487
----------

source/extensions/omni.isaac.orbit_tasks/config/extension.toml

+1-1
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
[package]
22

33
# Note: Semantic Versioning is used: https://semver.org/
4-
version = "0.6.0"
4+
version = "0.6.1"
55

66
# Description
77
title = "ORBIT Environments"

source/extensions/omni.isaac.orbit_tasks/docs/CHANGELOG.rst

+10
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,16 @@
11
Changelog
22
---------
33

4+
0.6.1 (2024-04-16)
5+
~~~~~~~~~~~~~~~~~~
6+
7+
Added
8+
^^^^^
9+
10+
* Added a new environment ``Isaac-Repose-Cube-Allegro-v0`` and ``Isaac-Repose-Allegro-Cube-NoVelObs-v0``
11+
for the Allegro hand to reorient a cube. It is based on the IsaacGymEnvs Allegro hand environment.
12+
13+
414
0.6.0 (2024-03-10)
515
~~~~~~~~~~~~~~~~~~
616

Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
1+
# Copyright (c) 2022-2024, The ORBIT Project Developers.
2+
# All rights reserved.
3+
#
4+
# SPDX-License-Identifier: BSD-3-Clause
5+
6+
"""In-hand object reorientation environment.
7+
8+
These environments are based on the `dexterous cube manipulation`_ environments
9+
provided in IsaacGymEnvs repository from NVIDIA. However, they contain certain
10+
modifications and additional features.
11+
12+
.. _dexterous cube manipulation: https://github.com/NVIDIA-Omniverse/IsaacGymEnvs/blob/main/isaacgymenvs/tasks/allegro_hand.py
13+
14+
"""
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
# Copyright (c) 2022-2024, The ORBIT Project Developers.
2+
# All rights reserved.
3+
#
4+
# SPDX-License-Identifier: BSD-3-Clause
5+
6+
"""Configurations for in-hand manipulation environments."""
7+
8+
# We leave this file empty since we don't want to expose any configs in this package directly.
9+
# We still need this file to import the "config" module in the parent package.
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,64 @@
1+
# Copyright (c) 2022-2024, The ORBIT Project Developers.
2+
# All rights reserved.
3+
#
4+
# SPDX-License-Identifier: BSD-3-Clause
5+
6+
import gymnasium as gym
7+
8+
from . import agents, allegro_env_cfg
9+
10+
##
11+
# Register Gym environments.
12+
##
13+
14+
##
15+
# Full kinematic state observations.
16+
##
17+
18+
gym.register(
19+
id="Isaac-Repose-Cube-Allegro-v0",
20+
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
21+
disable_env_checker=True,
22+
kwargs={
23+
"env_cfg_entry_point": allegro_env_cfg.AllegroCubeEnvCfg,
24+
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AllegroCubePPORunnerCfg,
25+
"rl_games_cfg_entry_point": f"{agents.__name__}:rl_games_ppo_cfg.yaml",
26+
},
27+
)
28+
29+
gym.register(
30+
id="Isaac-Repose-Cube-Allegro-Play-v0",
31+
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
32+
disable_env_checker=True,
33+
kwargs={
34+
"env_cfg_entry_point": allegro_env_cfg.AllegroCubeEnvCfg_PLAY,
35+
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AllegroCubePPORunnerCfg,
36+
"rl_games_cfg_entry_point": f"{agents.__name__}:rl_games_ppo_cfg.yaml",
37+
},
38+
)
39+
40+
##
41+
# Kinematic state observations without velocity information.
42+
##
43+
44+
gym.register(
45+
id="Isaac-Repose-Cube-Allegro-NoVelObs-v0",
46+
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
47+
disable_env_checker=True,
48+
kwargs={
49+
"env_cfg_entry_point": allegro_env_cfg.AllegroCubeNoVelObsEnvCfg,
50+
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AllegroCubeNoVelObsPPORunnerCfg,
51+
"rl_games_cfg_entry_point": f"{agents.__name__}:rl_games_ppo_cfg.yaml",
52+
},
53+
)
54+
55+
gym.register(
56+
id="Isaac-Repose-Cube-Allegro-NoVelObs-Play-v0",
57+
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
58+
disable_env_checker=True,
59+
kwargs={
60+
"env_cfg_entry_point": allegro_env_cfg.AllegroCubeNoVelObsEnvCfg_PLAY,
61+
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AllegroCubeNoVelObsPPORunnerCfg,
62+
"rl_games_cfg_entry_point": f"{agents.__name__}:rl_games_ppo_cfg.yaml",
63+
},
64+
)
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
# Copyright (c) 2022-2024, The ORBIT Project Developers.
2+
# All rights reserved.
3+
#
4+
# SPDX-License-Identifier: BSD-3-Clause
5+
6+
from . import rsl_rl_cfg # noqa: F401, F403
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,85 @@
1+
params:
2+
seed: 42
3+
4+
# environment wrapper clipping
5+
env:
6+
clip_observations: 5.0
7+
clip_actions: 1.0
8+
9+
algo:
10+
name: a2c_continuous
11+
12+
model:
13+
name: continuous_a2c_logstd
14+
15+
network:
16+
name: actor_critic
17+
separate: False
18+
19+
space:
20+
continuous:
21+
mu_activation: None
22+
sigma_activation: None
23+
mu_init:
24+
name: default
25+
sigma_init:
26+
name: const_initializer
27+
val: 0
28+
fixed_sigma: True
29+
30+
mlp:
31+
units: [512, 256, 128]
32+
activation: elu
33+
d2rl: False
34+
35+
initializer:
36+
name: default
37+
regularizer:
38+
name: None
39+
40+
load_checkpoint: False
41+
load_path: ''
42+
43+
config:
44+
name: allegro_cube
45+
env_name: rlgpu
46+
device: 'cuda:0'
47+
device_name: 'cuda:0'
48+
multi_gpu: False
49+
ppo: True
50+
mixed_precision: False
51+
normalize_input: True
52+
normalize_value: True
53+
value_bootstrap: True
54+
num_actors: -1 # configured from the script (based on num_envs)
55+
reward_shaper:
56+
scale_value: 0.1
57+
normalize_advantage: True
58+
gamma: 0.998
59+
tau: 0.95
60+
learning_rate: 5e-4
61+
lr_schedule: adaptive
62+
schedule_type: standard
63+
kl_threshold: 0.016
64+
score_to_win: 100000
65+
max_epochs: 5000
66+
save_best_after: 500
67+
save_frequency: 200
68+
print_stats: True
69+
grad_norm: 1.0
70+
entropy_coef: 0.002
71+
truncate_grads: True
72+
e_clip: 0.2
73+
horizon_length: 24
74+
minibatch_size: 16384 # 32768
75+
mini_epochs: 5
76+
critic_coef: 4
77+
clip_value: True
78+
seq_length: 4
79+
bounds_loss_coef: 0.0005
80+
81+
player:
82+
#render: True
83+
deterministic: True
84+
games_num: 100000
85+
print_stats: True
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,46 @@
1+
# Copyright (c) 2022-2024, The ORBIT Project Developers.
2+
# All rights reserved.
3+
#
4+
# SPDX-License-Identifier: BSD-3-Clause
5+
6+
from omni.isaac.orbit.utils import configclass
7+
8+
from omni.isaac.orbit_tasks.utils.wrappers.rsl_rl import (
9+
RslRlOnPolicyRunnerCfg,
10+
RslRlPpoActorCriticCfg,
11+
RslRlPpoAlgorithmCfg,
12+
)
13+
14+
15+
@configclass
16+
class AllegroCubePPORunnerCfg(RslRlOnPolicyRunnerCfg):
17+
num_steps_per_env = 24
18+
max_iterations = 5000
19+
save_interval = 50
20+
experiment_name = "allegro_cube"
21+
empirical_normalization = True
22+
policy = RslRlPpoActorCriticCfg(
23+
init_noise_std=1.0,
24+
actor_hidden_dims=[512, 256, 128],
25+
critic_hidden_dims=[512, 256, 128],
26+
activation="elu",
27+
)
28+
algorithm = RslRlPpoAlgorithmCfg(
29+
value_loss_coef=1.0,
30+
use_clipped_value_loss=True,
31+
clip_param=0.2,
32+
entropy_coef=0.002,
33+
num_learning_epochs=5,
34+
num_mini_batches=4,
35+
learning_rate=0.001,
36+
schedule="adaptive",
37+
gamma=0.998,
38+
lam=0.95,
39+
desired_kl=0.01,
40+
max_grad_norm=1.0,
41+
)
42+
43+
44+
@configclass
45+
class AllegroCubeNoVelObsPPORunnerCfg(AllegroCubePPORunnerCfg):
46+
experiment_name = "allegro_cube_no_vel_obs"
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,64 @@
1+
# Copyright (c) 2022-2024, The ORBIT Project Developers.
2+
# All rights reserved.
3+
#
4+
# SPDX-License-Identifier: BSD-3-Clause
5+
6+
from omni.isaac.orbit.utils import configclass
7+
8+
import omni.isaac.orbit_tasks.manipulation.inhand.inhand_env_cfg as inhand_env_cfg
9+
10+
##
11+
# Pre-defined configs
12+
##
13+
from omni.isaac.orbit_assets import ALLEGRO_HAND_CFG # isort: skip
14+
15+
16+
@configclass
17+
class AllegroCubeEnvCfg(inhand_env_cfg.InHandObjectEnvCfg):
18+
def __post_init__(self):
19+
# post init of parent
20+
super().__post_init__()
21+
22+
# switch robot to allegro hand
23+
self.scene.robot = ALLEGRO_HAND_CFG.replace(prim_path="{ENV_REGEX_NS}/Robot")
24+
25+
26+
@configclass
27+
class AllegroCubeEnvCfg_PLAY(AllegroCubeEnvCfg):
28+
def __post_init__(self):
29+
# post init of parent
30+
super().__post_init__()
31+
# make a smaller scene for play
32+
self.scene.num_envs = 50
33+
# disable randomization for play
34+
self.observations.policy.enable_corruption = False
35+
# remove termination due to timeouts
36+
self.terminations.time_out = None
37+
38+
39+
##
40+
# Environment configuration with no velocity observations.
41+
##
42+
43+
44+
@configclass
45+
class AllegroCubeNoVelObsEnvCfg(AllegroCubeEnvCfg):
46+
def __post_init__(self):
47+
# post init of parent
48+
super().__post_init__()
49+
50+
# switch observation group to no velocity group
51+
self.observations.policy = inhand_env_cfg.ObservationsCfg.NoVelocityKinematicObsGroupCfg()
52+
53+
54+
@configclass
55+
class AllegroCubeNoVelObsEnvCfg_PLAY(AllegroCubeNoVelObsEnvCfg):
56+
def __post_init__(self):
57+
# post init of parent
58+
super().__post_init__()
59+
# make a smaller scene for play
60+
self.scene.num_envs = 50
61+
# disable randomization for play
62+
self.observations.policy.enable_corruption = False
63+
# remove termination due to timeouts
64+
self.terminations.time_out = None

0 commit comments

Comments
 (0)