This repository provides the Wasserstein Adversarial Behavior Imitation (WASABI) algorithm that enables Solo to acquire agile skills through adversarial imitation from rough, partial demonstrations using NVIDIA Isaac Gym.
Paper: Learning Agile Skills via Adversarial Imitation of Rough Partial Demonstrations
Project website: https://sites.google.com/view/corl2022-wasabi/home
Maintainer: Chenhao Li
Affiliation: Autonomous Learning Group, Max Planck Institute for Intelligent Systems, and Robotic Systems Lab, ETH Zurich
Contact: [email protected]
- Create a new Python virtual environment with `python 3.8`.
- Install `pytorch 1.10` with `cuda-11.3` (a sanity-check sketch follows these installation steps):
  ```
  pip3 install torch==1.10.0+cu113 torchvision==0.11.1+cu113 torchaudio==0.10.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
  ```
- Install Isaac Gym:
  - Download and install Isaac Gym Preview 4.
    ```
    cd isaacgym/python
    pip install -e .
    ```
  - Try running an example:
    ```
    cd examples
    python 1080_balls_of_solitude.py
    ```
  - For troubleshooting, check the docs in `isaacgym/docs/index.html`.
- Install `solo_gym`:
  ```
  git clone https://github.com/martius-lab/wasabi.git
  cd solo_gym
  pip install -e .
  ```
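After these steps, a quick sanity check can confirm the installation. This is a minimal sketch; it assumes a working NVIDIA driver and the CUDA 11.3 PyTorch build installed above.

```python
# Note: Isaac Gym must be imported before torch in the same process.
from isaacgym import gymapi
import torch

print(torch.__version__)          # expected: 1.10.0+cu113
print(torch.cuda.is_available())  # expected: True with a working GPU driver
print(gymapi.acquire_gym())       # acquires the Isaac Gym interface singleton
```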
- The Solo environment is defined by an env file `solo8.py` and a config file `solo8_config.py` under `solo_gym/envs/solo8/`. The config file sets both the environment parameters in class `Solo8FlatCfg` and the training parameters in class `Solo8FlatCfgPPO`.
- The provided code exemplifies the training of Solo 8 with handheld wave motions. 20 recorded demonstrations are augmented with perturbations to 1000 trajectories of 130 frames each and stored in `resources/robots/solo8/datasets/motion_data.pt`. The state dimension indices are specified in `reference_state_idx_dict.json`. To train with other demonstrations, replace `motion_data.pt` and adapt the reward functions defined in `solo_gym/envs/solo8/solo8.py` accordingly (a sketch of inspecting the dataset follows this list).
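For reference, a minimal sketch of inspecting the bundled dataset. The file names and the 1000×130 trajectory count are taken from above, but the exact structure stored in `motion_data.pt` and the location of `reference_state_idx_dict.json` are assumptions.

```python
import json
import torch

# Load the augmented demonstration data (1000 trajectories x 130 frames, per above).
motion_data = torch.load("resources/robots/solo8/datasets/motion_data.pt")

# Map state dimension names to tensor indices (file location assumed).
with open("resources/robots/solo8/datasets/reference_state_idx_dict.json") as f:
    state_idx = json.load(f)

print(type(motion_data))  # layout of the stored object is an assumption
print(state_idx)          # e.g. which indices hold base pose, joint angles, ...
```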
- Train a policy:
  ```
  python scripts/train.py --task solo8
  ```
  - The trained policy is saved in `logs/<experiment_name>/<date_time>_<run_name>/model_<iteration>.pt`, where `<experiment_name>` and `<run_name>` are defined in the train config.
  - To disable rendering, append `--headless`.
- Play a trained policy:
  ```
  python scripts/play.py
  ```
  - By default, the loaded policy is the last model of the last run in the experiment folder.
  - Other runs/model iterations can be selected by setting `load_run` and `checkpoint` in the train config (see the sketch after this list).
  - Use `u` and `j` to command the forward velocity.
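A sketch of what selecting a specific run could look like. Only `load_run`, `checkpoint`, and the class name `Solo8FlatCfgPPO` appear above; the `runner` sub-class and the `-1` defaults follow common legged_gym-style configs and are assumptions.

```python
# In solo_gym/envs/solo8/solo8_config.py (sketch, not the verbatim file):
class Solo8FlatCfgPPO:
    class runner:
        load_run = "Jan01_00-00-00_example_run"  # assumed: -1 would pick the last run
        checkpoint = 500                         # assumed: -1 would pick the last model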
To cite this work:
```
@inproceedings{li2023learning,
  title={Learning agile skills via adversarial imitation of rough partial demonstrations},
  author={Li, Chenhao and Vlastelica, Marin and Blaes, Sebastian and Frey, Jonas and Grimminger, Felix and Martius, Georg},
  booktitle={Conference on Robot Learning},
  pages={342--352},
  year={2023},
  organization={PMLR}
}
```
The code is built upon the open-sourced Isaac Gym Environments for Legged Robots and the accompanying PPO implementation. We refer to the original repositories for more details.
