Code for "Solving New Tasks by Adapting Internet Video Knowledge" (ICLR 2025)

brown-palm/adapt2act

Solving New Tasks by Adapting Internet Video Knowledge

Quick Start

Setup Conda Environment

conda create -n adapt2act python=3.9
conda activate adapt2act

pip install pip==21.0 wheel==0.38.0 setuptools==65.5.0  # the pinned gym version only installs with these older tool versions
pip install -r requirements.txt
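
Because the setup depends on pinned versions of pip, wheel, and setuptools, it can help to confirm that the active environment actually has them before installing the remaining requirements. The following is a minimal sketch, not part of the repository: `PINNED` and `check_pins` are hypothetical names, and the version strings are copied from the command above.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the pip command above.
PINNED = {"pip": "21.0", "wheel": "0.38.0", "setuptools": "65.5.0"}


def check_pins(pins=PINNED, get_version=version):
    """Return {package: (wanted, found)} for every package whose installed
    version differs from the pinned one (found is None if not installed)."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in check_pins().items():
        print(f"{name}: expected {wanted}, found {found}")
```

Run it inside the activated adapt2act environment; no output means all three pins match.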

Install Visual-Cortex

git clone https://github.com/facebookresearch/eai-vc.git
cd eai-vc

pip install -e ./vc_models

Customize Configurations for DeepMind Control Environments

Please follow the instructions in TADPoLe to customize the configurations for the Dog and Humanoid environments.

Checkpoints

Place checkpoints/ under the adapt2act/ folder so that the directory has the following structure:

adapt2act/
└── checkpoints/
    ├── animatediff_finetuned/
    │   ├── {domain}_finetuned.ckpt
    │   └── ...
    ├── in_domain/
    │   ├── {domain}/
    │   ├── {domain}_suboptimal/
    │   └── ...
    ├── dreambooth/
    │   ├── {domain}_lora/
    │   └── ...
    └── inv_dyn.ckpt

We currently support three domains: Metaworld (mw), Humanoid (humanoid), and Dog (dog). The model checkpoints can be downloaded here.
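
A quick way to catch a misplaced download is to compare the folder against the tree above. The sketch below is a hypothetical helper, not part of the repository. It assumes every domain has all four entries shown in the tree, which the `...` placeholders do not guarantee, so treat a reported path as a hint rather than an error.

```python
from pathlib import Path

# Domain identifiers listed in this README.
DOMAINS = ("mw", "humanoid", "dog")


def expected_checkpoint_paths(root, domains=DOMAINS):
    """Return the paths implied by the directory tree above."""
    ckpt = Path(root) / "checkpoints"
    paths = [ckpt / "inv_dyn.ckpt"]
    for d in domains:
        paths += [
            ckpt / "animatediff_finetuned" / f"{d}_finetuned.ckpt",
            ckpt / "in_domain" / d,
            ckpt / "in_domain" / f"{d}_suboptimal",
            ckpt / "dreambooth" / f"{d}_lora",
        ]
    return paths


def missing_checkpoints(root, domains=DOMAINS):
    """List the expected paths that do not exist under `root`."""
    return [p for p in expected_checkpoint_paths(root, domains) if not p.exists()]


if __name__ == "__main__":
    # Run from inside the adapt2act/ folder.
    for path in missing_checkpoints("."):
        print("missing:", path)
```

If you only downloaded one domain, pass it explicitly, for example `missing_checkpoints(".", domains=("mw",))`.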

Policy Supervision

Tip

To enable wandb logging, enter your wandb entity in cfgs/default.yaml and add use_wandb=True to the commands below.

AnimateDiff

Vanilla AnimateDiff

python src/vidtadpole_train.py task="metaworld-door-close" \
    text_prompt="a robot arm closing a door" \
    avdc_prompt="door close" \
    seed=0 \
    use_dreambooth=False \
    use_finetuned=False

Direct Finetuning

python src/vidtadpole_train.py task="metaworld-door-close" \
    text_prompt="a robot arm closing a door" \
    avdc_prompt="door close" \
    seed=0 \
    use_dreambooth=False \
    use_finetuned=True

Subject Customization

python src/vidtadpole_train.py task="metaworld-door-close" \
    text_prompt="a [D] robot arm closing a door" \
    avdc_prompt="door close" \
    seed=0 \
    use_dreambooth=True \
    use_finetuned=False

Probabilistic Adaptation

python src/vidtadpole_train_probadap.py task="metaworld-door-close" \
    text_prompt="a robot arm closing a door" \
    avdc_prompt="door close" \
    seed=0 \
    prior_strength=0.1 \
    inverted_probadap=False

Inverse Probabilistic Adaptation

python src/vidtadpole_train_probadap.py task="metaworld-door-close" \
    text_prompt="a robot arm closing a door" \
    avdc_prompt="door close" \
    seed=0 \
    prior_strength=0.1 \
    inverted_probadap=True

AnimateLCM

Vanilla AnimateLCM

python src/vidtadpole_train_lcm.py task="metaworld-door-close" \
    text_prompt="a robot arm closing a door" \
    avdc_prompt="door close" \
    seed=0 \
    use_dreambooth=False \
    use_finetuned=False

Probabilistic Adaptation

python src/vidtadpole_train_lcm_probadap.py task="metaworld-door-close" \
    text_prompt="a robot arm closing a door" \
    avdc_prompt="door close" \
    seed=0 \
    prior_strength=0.1 \
    inverted_probadap=False

Inverse Probabilistic Adaptation

python src/vidtadpole_train_lcm_probadap.py task="metaworld-door-close" \
    text_prompt="a robot arm closing a door" \
    avdc_prompt="door close" \
    seed=0 \
    prior_strength=0.2 \
    inverted_probadap=True

In-Domain-Only

python src/vidtadpole_train_probadap.py task="metaworld-door-close" \
    text_prompt="a robot arm closing a door" \
    avdc_prompt="door close" \
    seed=0 \
    prior_strength=0 \
    inverted_probadap=False
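
The policy-supervision commands above differ only in the script name and a handful of `key=value` overrides, so repeated runs are easy to script. The sketch below builds the argument lists for a sweep over seeds without launching anything. `build_command` and `seed_sweep` are hypothetical helpers, and the seed values other than 0 are an assumption; the README only shows `seed=0`.

```python
import shlex


def build_command(script, **overrides):
    """Build an argument list of the form: python <script> key=value ..."""
    args = ["python", script]
    for key, value in overrides.items():
        args.append(f"{key}={value}")
    return args


def seed_sweep(script, seeds, **overrides):
    """Return one command per seed, sharing all other overrides."""
    return [build_command(script, seed=s, **overrides) for s in seeds]


# Overrides copied from the Probabilistic Adaptation (AnimateDiff) command above.
commands = seed_sweep(
    "src/vidtadpole_train_probadap.py",
    seeds=[0, 1, 2],  # assumed seed values, for illustration only
    task="metaworld-door-close",
    text_prompt="a robot arm closing a door",
    avdc_prompt="door close",
    prior_strength=0.1,
    inverted_probadap=False,
)

for cmd in commands:
    # shlex.join quotes arguments that contain spaces, such as text_prompt.
    print(shlex.join(cmd))
```

To actually launch the runs, each list can be passed to `subprocess.run(cmd, check=True)`. Passing a list rather than a single string keeps prompts with spaces intact as one argument.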

Visual Planning

AnimateDiff

Vanilla AnimateDiff

python src/visual_planning.py task="metaworld-door-close" \
    text_prompt="a robot arm closing a door" \
    seed=0 \
    guidance_scale=7.5 \
    plan_with_probadap=False \
    plan_with_dreambooth=False \
    plan_with_finetuned=False

Direct Finetuning

python src/visual_planning.py task="metaworld-door-close" \
    text_prompt="a robot arm closing a door" \
    seed=0 \
    guidance_scale=8 \
    plan_with_probadap=False \
    plan_with_dreambooth=False \
    plan_with_finetuned=True

Subject Customization

python src/visual_planning.py task="metaworld-door-close" \
    text_prompt="a [D] robot arm closing a door" \
    seed=0 \
    guidance_scale=7.5 \
    plan_with_probadap=False \
    plan_with_dreambooth=True \
    plan_with_finetuned=False

Probabilistic Adaptation

python src/visual_planning.py task="metaworld-door-close" \
    text_prompt="a robot arm closing a door" \
    seed=0 \
    guidance_scale=2.5 \
    plan_with_probadap=True \
    plan_with_dreambooth=False \
    plan_with_finetuned=False \
    prior_strength=0.1 \
    inverted_probadap=False

Inverse Probabilistic Adaptation

python src/visual_planning.py task="metaworld-door-close" \
    text_prompt="a robot arm closing a door" \
    seed=0 \
    guidance_scale=2.5 \
    plan_with_probadap=True \
    plan_with_dreambooth=False \
    plan_with_finetuned=False \
    prior_strength=0.5 \
    inverted_probadap=True

AnimateLCM

Vanilla AnimateLCM

python src/visual_planning_lcm.py task="metaworld-door-close" \
    text_prompt="a robot arm closing a door" \
    seed=0 \
    guidance_scale=2.5 \
    plan_with_probadap=False \
    plan_with_dreambooth=False \
    plan_with_finetuned=False

Probabilistic Adaptation

python src/visual_planning_lcm.py task="metaworld-door-close" \
    text_prompt="a robot arm closing a door" \
    seed=0 \
    guidance_scale=2.5 \
    plan_with_probadap=True \
    plan_with_dreambooth=False \
    plan_with_finetuned=False \
    prior_strength=0.1 \
    inverted_probadap=False

Inverse Probabilistic Adaptation

python src/visual_planning_lcm.py task="metaworld-door-close" \
    text_prompt="a robot arm closing a door" \
    seed=0 \
    guidance_scale=2.5 \
    plan_with_probadap=True \
    plan_with_dreambooth=False \
    plan_with_finetuned=False \
    prior_strength=0.2 \
    inverted_probadap=True

In-Domain-Only

python src/visual_planning.py task="metaworld-door-close" \
    text_prompt="a robot arm closing a door" \
    seed=0 \
    guidance_scale=2.5 \
    plan_with_probadap=True \
    plan_with_dreambooth=False \
    plan_with_finetuned=False \
    prior_strength=0 \
    inverted_probadap=False
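
The variants run through src/visual_planning.py above share most settings and differ in a few flags. As a reading aid, the sketch below records those differences as data. All values are copied from the commands in this README; the dictionary keys and the `planning_overrides` helper are hypothetical labels, not names from the repository.

```python
# Flags that every src/visual_planning.py command above sets explicitly.
BASE = {
    "plan_with_probadap": False,
    "plan_with_dreambooth": False,
    "plan_with_finetuned": False,
}

# Per-variant differences from BASE, copied from the commands above.
ANIMATEDIFF_PLANNING = {
    "vanilla": {"guidance_scale": 7.5},
    "direct_finetuning": {"guidance_scale": 8, "plan_with_finetuned": True},
    # The README prompt for this variant also contains the "[D]" token.
    "subject_customization": {"guidance_scale": 7.5, "plan_with_dreambooth": True},
    "probadap": {
        "guidance_scale": 2.5,
        "plan_with_probadap": True,
        "prior_strength": 0.1,
        "inverted_probadap": False,
    },
    "inverse_probadap": {
        "guidance_scale": 2.5,
        "plan_with_probadap": True,
        "prior_strength": 0.5,
        "inverted_probadap": True,
    },
    # Same script and flags as "probadap", with the prior switched off.
    "in_domain_only": {
        "guidance_scale": 2.5,
        "plan_with_probadap": True,
        "prior_strength": 0,
        "inverted_probadap": False,
    },
}


def planning_overrides(variant, task="metaworld-door-close",
                       text_prompt="a robot arm closing a door", seed=0):
    """Merge the shared settings with the variant-specific ones."""
    overrides = {"task": task, "text_prompt": text_prompt, "seed": seed, **BASE}
    overrides.update(ANIMATEDIFF_PLANNING[variant])
    return overrides


if __name__ == "__main__":
    for name, changed in ANIMATEDIFF_PLANNING.items():
        print(f"{name}: {changed}")
```

Laid out this way, it is easy to see that the In-Domain-Only command is the Probabilistic Adaptation command with `prior_strength=0`.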

Visual Planning with Suboptimal Data

In-Domain-Only

python src/visual_planning.py task="metaworld-door-close" \
    text_prompt="a robot arm closing a door" \
    seed=0 \
    guidance_scale=2.5 \
    plan_with_probadap=True \
    plan_with_dreambooth=False \
    plan_with_finetuned=False \
    prior_strength=0 \
    inverted_probadap=False \
    use_suboptimal=True

Probabilistic Adaptation

python src/visual_planning.py task="metaworld-door-close" \
    text_prompt="a robot arm closing a door" \
    seed=0 \
    guidance_scale=2.5 \
    plan_with_probadap=True \
    plan_with_dreambooth=False \
    plan_with_finetuned=False \
    prior_strength=0.1 \
    inverted_probadap=False \
    use_suboptimal=True

Inverse Probabilistic Adaptation

python src/visual_planning.py task="metaworld-door-close" \
    text_prompt="a robot arm closing a door" \
    seed=0 \
    guidance_scale=2.5 \
    plan_with_probadap=True \
    plan_with_dreambooth=False \
    plan_with_finetuned=False \
    prior_strength=0.5 \
    inverted_probadap=True \
    use_suboptimal=True

Citation

If you find this repository helpful, please consider citing our work:

@inproceedings{luo2024solving,
  title={Solving New Tasks by Adapting Internet Video Knowledge},
  author={Luo, Calvin and Zeng, Zilai and Du, Yilun and Sun, Chen},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025}
}

Acknowledgement

This repo contains code adapted from flowdiffusion, TDMPC and TADPoLe. We thank the authors and contributors for open-sourcing their code.
