Create the conda environment and install all of the dependencies. Mamba is recommended for faster installation:
conda create -n ovon python=3.8 cmake=3.14.0 -y
conda activate ovon
conda install -n ovon habitat-sim=0.2.3 headless -c conda-forge -c aihabitat -y
conda install -n ovon pytorch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 pytorch-cuda=11.8 cudatoolkit=11.8 -c pytorch -c nvidia -c conda-forge -y
conda install -c conda-forge gxx=9
conda install nvidia/label/cuda-11.8.0::cuda-nvcc
conda install -c nvidia cuda-toolkit==11.8
export CUDA_HOME=<PATH_TO_YOUR_CONDA_ENV>
pip install -e .
# Install distributed_dagger and frontier_exploration (From original DagRL codebase: https://github.com/naokiyokoyama/frontier_exploration/tree/a8890d68cfa0d10254238abe9266a76856cb1f17)
cd frontier_exploration && pip install -e . && cd ..
# Install habitat-lab
cd habitat-lab
pip install -e habitat-lab
pip install -e habitat-baselines
pip install ftfy regex tqdm GPUtil trimesh seaborn timm scikit-learn einops transformers
pip install git+https://github.com/openai/CLIP.git
# Install semantic input packages:
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
pip install ultralytics
# Patch modeling_llama.py of transformers:
cp modeling_llama.py <PATH_TO_YOUR_CONDA_ENV>/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py
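The two `<PATH_TO_YOUR_CONDA_ENV>` placeholders above can usually be resolved from the active environment instead of being typed by hand. The sketch below assumes the ovon environment is active (conda then exports CONDA_PREFIX) and that `python` resolves to that environment's interpreter. The variable names TRANSFORMERS_DIR and LLAMA_FILE are illustrative and not part of the repository.

```shell
# conda sets CONDA_PREFIX to the root directory of the active environment.
if [ -n "${CONDA_PREFIX:-}" ]; then
  export CUDA_HOME="$CONDA_PREFIX"
else
  echo "CONDA_PREFIX is not set; run 'conda activate ovon' first" >&2
fi

# Locate the installed transformers package without importing it.
# TRANSFORMERS_DIR and LLAMA_FILE are hypothetical helper variables.
TRANSFORMERS_DIR="$(python -c '
import importlib.util, os
spec = importlib.util.find_spec("transformers")
print(os.path.dirname(spec.origin) if spec and spec.origin else "")
' 2>/dev/null || true)"

LLAMA_FILE=""
if [ -n "$TRANSFORMERS_DIR" ]; then
  LLAMA_FILE="$TRANSFORMERS_DIR/models/llama/modeling_llama.py"
  echo "Patch target: $LLAMA_FILE"
else
  echo "transformers was not found in the current Python environment" >&2
fi
```

With LLAMA_FILE set, the patch step becomes `cp modeling_llama.py "$LLAMA_FILE"`, run from the directory that contains the patched modeling_llama.py. Copying the original to `"$LLAMA_FILE.bak"` first keeps a way back.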
First, set the following variables during installation (there is no need to put them in .bashrc):
MATTERPORT_TOKEN_ID=<FILL IN FROM YOUR ACCOUNT INFO IN MATTERPORT>
MATTERPORT_TOKEN_SECRET=<FILL IN FROM YOUR ACCOUNT INFO IN MATTERPORT>
DATA_DIR=</path/to/ovon/data>
# Download HM3D 3D scans (scenes_dataset)
python -m habitat_sim.utils.datasets_download \
--username $MATTERPORT_TOKEN_ID --password $MATTERPORT_TOKEN_SECRET \
--uids hm3d_train_v0.2 \
--data-path $DATA_DIR &&
python -m habitat_sim.utils.datasets_download \
--username $MATTERPORT_TOKEN_ID --password $MATTERPORT_TOKEN_SECRET \
--uids hm3d_val_v0.2 \
--data-path $DATA_DIR
The OVON navigation episodes can be found here: https://huggingface.co/datasets/nyokoyama/hm3d_ovon/
Decompress the tar.gz file in data/datasets/ovon/ so that the hm3d directory is located at data/datasets/ovon/hm3d/. Delete all "._*" files that appear after decompressing. Rename "val_unseen_easy.json.gz" to "val_seen_synonyms.json.gz".
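The cleanup after decompressing can be scripted. The following is a minimal sketch, not part of the repository: `clean_ovon_dir` is a hypothetical helper that removes the "._*" files and renames every val_unseen_easy.json.gz it finds under the given directory. It renames only the file, as described above, and leaves directory names untouched.

```shell
# Hypothetical helper; the default directory is the one named in the text.
clean_ovon_dir() {
  ovon_dir="${1:-data/datasets/ovon}"

  # Delete the "._*" files that appear after decompressing.
  find "$ovon_dir" -type f -name '._*' -delete

  # Rename each val_unseen_easy.json.gz in place, keeping its parent directory.
  find "$ovon_dir" -type f -name 'val_unseen_easy.json.gz' |
    while IFS= read -r f; do
      mv "$f" "$(dirname "$f")/val_seen_synonyms.json.gz"
    done
}
```

Calling `clean_ovon_dir` with no argument from the repository root operates on data/datasets/ovon; a different location can be passed as the first argument.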
We provide a pre-trained checkpoint on the HuggingFace Hub:
ckpt.2.pth: https://huggingface.co/wingrune/OVSegDT
Run the following to evaluate:
python -m ovon.run \
--run-type eval \
--exp-config config/experiments/transformer_rl_segm_loss-validation.yaml \
habitat_baselines.eval_ckpt_path_dir=<path_to_ckpt>
By default, evaluation uses the YOLOE model on the val unseen split of HM3D-OVON. To run experiments on val_seen or val_seen_synonyms, change:
- eval.split to "val_seen" or "val_seen_synonyms"
- segmentation_source to "yolo_val_seen" or "yolo_val_seen_synonyms"
- Run the following to train with our proposed EALM loss and segmentation loss:
python -m ovon.run --run-type train \
--debug-datapath \
--exp-config config/experiments/transformer_dagger_ppo_segm_loss.yaml
- Run the following to train with our proposed EALM loss and without segmentation loss:
python -m ovon.run --run-type train \
--debug-datapath \
--exp-config config/experiments/transformer_dagger_ppo_no_segm_loss.yaml