
Continuous Imitation Learning from Observation (CILO)

Official PyTorch implementation of Continuous Imitation Learning from Observation (CILO), from the paper "Explorative Imitation Learning: A Path Signature Approach for Continuous Environments" (ECAI 2024).

Requirements

Python: 3.9.15
Conda: 23.1.0

Installing dependencies

The script ./dependencies/install.sh creates a conda environment and installs all dependencies needed to run this repository.

Running

To run CILO, you first need to create random transition samples ($I^{pre}$) and expert samples ($\mathcal{T}^e$).
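Conceptually, both kinds of samples are sequences of environment transitions collected by rolling out a policy (random or expert). The sketch below illustrates that idea only; the exact storage format, field names, and environment handling are defined by the dataset scripts described in the following sections, and the toy dynamics here are an assumption for demonstration.

```python
import random

def collect_random_transitions(reset_env, step_env, n_steps):
    """Collect (state, action, next_state) tuples with a random policy.

    `reset_env` and `step_env` stand in for a Gym-style environment;
    the repository's scripts handle the actual MuJoCo environments.
    """
    transitions = []
    state = reset_env()
    for _ in range(n_steps):
        action = [random.uniform(-1.0, 1.0)]  # random continuous action
        next_state = step_env(state, action)
        transitions.append((state, action, next_state))
        state = next_state
    return transitions

# Toy stand-in dynamics: next state = state + action (illustration only).
toy_reset = lambda: [0.0]
toy_step = lambda s, a: [s[0] + a[0]]

data = collect_random_transitions(toy_reset, toy_step, n_steps=5)
print(len(data))  # 5 transitions, each a (state, action, next_state) tuple
```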

Creating random samples:

To create random samples for one environment:

python create_random_mujoco.py --env_name <ENV> --data_path <PATH>

For example:

python create_random_mujoco.py --env_name Ant-v3 --data_path ./dataset/ant/random_ant

To create random samples for all environments:

bash ./scripts/create_randoms.sh

Creating expert samples:

To create expert samples for one environment:

python create_dataset_mujoco.py -t <THREADS> -e <EPISODES> -g <ENV> --mode <play|collate|all>

For example:

python create_dataset_mujoco.py -t 4 -e 10 -g ant --mode all

To create expert samples for all environments:

bash ./scripts/create_experts.sh

Using samples from paper:

If you want to use the same datasets as in the paper, they are all publicly available via IL-Datasets. All datasets are listed on HuggingFace and can be downloaded with the BaselineDataset class from IL-Datasets. To load a dataset:

from imitation_datasets.dataset import BaselineDataset

dataset = BaselineDataset("NathanGavenski/Ant-v2", source="huggingface")

Running CILO

To run CILO, use the following command:

clear && python train_cilo.py \
--gpu <GPU> \
--encoder vector \
--env_name <ENV> \
--run_name <RUN NAME> \
--data_path <RANDOM> \
--expert_path <EXPERT> \
--alpha <ALPHA> \
--domain vector \
--choice explore \
\
--lr <Dynamics LR> \
--lr_decay_rate <LR DECAY> \
--batch_size <BATCH SIZE> \
--idm_epochs <EPOCHS> \
\
--policy_lr <Policy LR> \
--policy_lr_decay_rate <LR DECAY> \
--policy_batch_size <BATCH SIZE> \
\
--verbose

where <GPU> should be -1 if no GPU is available, <RANDOM> is the path to the random samples, <EXPERT> is the path to the expert samples, and <RUN NAME> is the name you want for your experiment in TensorBoard.
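For reference, a filled-in invocation might look like the one below. All concrete values here (paths, alpha, learning rates, batch sizes, epochs) are placeholders for illustration only, not the settings used in the paper; the per-environment scripts described below contain the actual hyperparameters.

```shell
# Placeholder values for illustration only; see ./scripts/cilo/ for the
# hyperparameters actually used in the paper.
python train_cilo.py \
    --gpu -1 \
    --encoder vector \
    --env_name Ant-v3 \
    --run_name cilo_ant_example \
    --data_path ./dataset/ant/random_ant \
    --expert_path ./dataset/ant/expert_ant \
    --alpha 0.9 \
    --domain vector \
    --choice explore \
    --lr 1e-3 \
    --lr_decay_rate 0.99 \
    --batch_size 128 \
    --idm_epochs 10 \
    --policy_lr 1e-3 \
    --policy_lr_decay_rate 0.99 \
    --policy_batch_size 128 \
    --verbose
```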

For simplicity, we provide a script for each environment with all hyperparameters used during training. To use them:

bash ./scripts/cilo/cilo_ant.sh -1 experiment1

where the first argument is the GPU number and the second is the experiment name.

Citation

@incollection{gavenski2024explorative,
	title={Explorative Imitation Learning: A Path Signature Approach for Continuous Environments},
	author={Gavenski, Nathan and Monteiro, Juarez and Meneguzzi, Felipe and Luck, Michael and Rodrigues, Odinaldo},
	booktitle={ECAI 2024},
	pages={1551--1558},
	year={2024},
	publisher={IOS Press}
}
