Sirui Xu*
Dongting Li*
Yucheng Zhang*
Xiyan Xu*
Qi Long*
Ziyin Wang*
Yunzhi Lu
Shuchang Dong
Hezi Jiang
Akshat Gupta
Yu-Xiong Wang
Liang-Yan Gui
University of Illinois Urbana-Champaign
*Equal contribution
CVPR 2025
- [2025-04-20] Initial release of the InterAct dataset
- Release comprehensive text descriptions, data processing workflows, visualization tools, and usage guidelines
- Publish the paper on arXiv
- Release the evaluation pipeline for the benchmark
- Release the dataset with unified SMPL representation
- Release HOI correction and augmentation data and pipeline
- Release retargeted HOI dataset with unified human shape
- Release baseline constructions for HOI generative tasks
We introduce InterAct, a comprehensive large-scale 3D human-object interaction (HOI) dataset originally comprising 21.81 hours of HOI data consolidated from diverse sources. The dataset is meticulously refined by correcting contact artifacts and augmented with varied motion patterns, extending the total duration to approximately 30 hours. It also includes 34.1K sequence-level detailed text descriptions.
The InterAct dataset is consolidated according to the licenses of its original data sources. For data approved for redistribution, direct download links are provided; for others, we supply processing code to convert the raw data into our standardized format.
Please follow the steps below to download, process, and organize the data.
Please fill out this form to request non-commercial access to InterAct. Once authorized, you'll receive the download links. Organize the data from neuraldome, imhd, and chairs according to the following directory structure.
data
├── neuraldome
│   ├── objects
│   │   ├── baseball
│   │   │   ├── baseball.obj          # object mesh
│   │   │   └── sample_points.npy     # sampled object pointcloud
│   │   └── ...
│   ├── objects_bps
│   │   ├── baseball
│   │   │   └── baseball.npy          # static bps representation
│   │   └── ...
│   ├── sequences
│   │   ├── subject01_baseball_0
│   │   │   ├── action.npy
│   │   │   ├── action.txt
│   │   │   ├── human.npz
│   │   │   ├── markers.npy
│   │   │   ├── joints.npy
│   │   │   ├── motion.npy
│   │   │   ├── object.npz
│   │   │   └── text.txt
│   │   └── ...
│   └── sequences_canonical
│       ├── subject01_baseball_0
│       │   ├── action.npy
│       │   ├── action.txt
│       │   ├── human.npz
│       │   ├── markers.npy
│       │   ├── joints.npy
│       │   ├── motion.npy
│       │   ├── object.npz
│       │   └── text.txt
│       └── ...
├── imhd
├── chairs
└── annotations
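For a first look at what a sequence folder contains, the minimal loading sketch below assumes the layout above; the exact keys stored in the .npz archives are not enumerated here, so list them before use.

import numpy as np

seq_dir = "data/neuraldome/sequences_canonical/subject01_baseball_0"  # example sequence from the tree above

human = np.load(f"{seq_dir}/human.npz", allow_pickle=True)    # human pose parameters
obj = np.load(f"{seq_dir}/object.npz", allow_pickle=True)     # object pose parameters
markers = np.load(f"{seq_dir}/markers.npy")                   # per-frame surface markers
joints = np.load(f"{seq_dir}/joints.npy")                     # per-frame body joints
with open(f"{seq_dir}/text.txt") as f:
    text = f.read().strip()                                   # sequence-level text description

# The .npz contents are best inspected directly rather than assumed.
print("human keys:", list(human.keys()))
print("object keys:", list(obj.keys()))
print("markers:", markers.shape, "| joints:", joints.shape)
print("text:", text)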
The GRAB, BEHAVE, and INTERCAP datasets are available for academic research under custom licenses from the Max Planck Institute for Intelligent Systems. Note that we do not distribute the original motion data—instead, we provide the text labels annotated by our team. To download these datasets, please visit their respective websites and agree to the terms of their licenses:
Please follow these steps to get started:
- Download SMPL+H, SMPL-X, and DMPL models

Download the SMPL+H model from SMPL+H (choose the Extended SMPL+H model used in the AMASS project), the DMPL model from DMPL (choose DMPLs compatible with SMPL), and the SMPL-X model from SMPL-X. Then place all the models under ./models/. The ./models/ folder tree should be:

models
├── smplh
│   ├── female
│   │   └── model.npz
│   ├── male
│   │   └── model.npz
│   ├── neutral
│   │   └── model.npz
│   ├── SMPLH_FEMALE.pkl
│   └── SMPLH_MALE.pkl
└── smplx
    ├── SMPLX_FEMALE.npz
    ├── SMPLX_FEMALE.pkl
    ├── SMPLX_MALE.npz
    ├── SMPLX_MALE.pkl
    ├── SMPLX_NEUTRAL.npz
    └── SMPLX_NEUTRAL.pkl
Please follow smplx tools to merge SMPL-H and MANO parameters.
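Once the environment from the next step is set up, you can verify the model files are in place with a minimal sketch using the smplx package; the gender, batch size, and other keyword arguments below are illustrative assumptions, not the settings used by the repository's processing scripts.

import smplx

# Build an SMPL-X body model from ./models/ (gender and batch size are arbitrary here).
model = smplx.create(model_path="./models", model_type="smplx",
                     gender="neutral", use_pca=False, batch_size=1)

# A forward pass with default (zero) parameters returns the template body.
output = model(return_verts=True)
print("vertices:", output.vertices.shape)  # (1, 10475, 3) for SMPL-X
print("joints:", output.joints.shape)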
- Prepare Environment
- Option A: From environment.yml
Create the Conda environment:
conda env create -f environment.yml
To install PyTorch3D, please follow the official instructions: PyTorch3D.
Install remaining packages:
pip install git+https://github.com/otaheri/chamfer_distance
pip install git+https://github.com/otaheri/bps_torch
python -m spacy download en_core_web_sm
- Option B: Manual setup
Create and activate a fresh environment:
conda create -n interact python=3.8
conda activate interact
pip install torch==2.0.0 torchvision==0.15.1 torchaudio==2.0.1 --index-url https://download.pytorch.org/whl/cu118
To install PyTorch3D, please follow the official instructions: PyTorch3D.
Install remaining packages:
pip install -r requirements.txt
python -m spacy download en_core_web_sm
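Either way, a quick import check like the sketch below can confirm that the core dependencies are installed (the printed versions will depend on your machine).

import torch
import pytorch3d
import spacy

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("pytorch3d:", pytorch3d.__version__)
spacy.load("en_core_web_sm")  # raises OSError if the spaCy model was not downloaded
print("spaCy model OK")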
- Prepare raw data
- BEHAVE
Download the motion data from this link, and put them into ./data/behave/sequences. Download object data from this link, and put them into ./data/behave/objects.
Expected File Structure:
data/behave/
├── sequences
│   └── data_name
│       ├── object_fit_all.npz    # object's pose sequences
│       └── smpl_fit_all.npz      # human's pose sequences
└── objects
    └── object_name
        ├── object_name.jpg       # one photo of the object
        ├── object_name.obj       # reconstructed 3D scan of the object
        ├── object_name.obj.mtl   # mesh material property
        ├── object_name_tex.jpg   # mesh texture
        └── object_name_fxxx.ply  # simplified object mesh
- OMOMO
Download the dataset from this link.
Expected File Structure:
data/omomo/raw
├── omomo_text_anno_json_data                 # Annotation JSON data
├── captured_objects
│   └── object_name_cleaned_simplified.obj    # Simplified object mesh
├── test_diffusion_manip_seq_joints24.p       # Test sequences
└── train_diffusion_manip_seq_joints24.p      # Train sequences
- InterCap

Download InterCap from the project website. Please download the version with "new results via newly trained LEMO hand models".
Expected File Structure:
data/intercap/raw
└── 01
    └── 01
        └── Seg_id
            ├── res.pkl                     # Human and Object Motion
            └── Mesh
                ├── 00000_second_obj.ply    # Object mesh
                └── ...
- GRAB
Download GRAB from the project website.
Expected File Structure:
data/grab/raw
├── grab
│   ├── s1
│   │   └── seq_name.npz      # Human and Object Motion
│   └── ...
└── tool
    ├── object_meshes         # Object mesh
    ├── object_settings
    ├── subject_meshes        # Subject mesh
    └── subject_settings
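Before running the processing scripts, it can help to sanity-check the raw downloads above. The sketch below only inspects file contents; the key names inside these archives are defined by the original datasets and are not assumed here. The folder names data_name, seq_name, and Seg_id are the placeholders from the trees above, so substitute real sequence folders.

import pickle
import numpy as np

# BEHAVE and GRAB store motion as .npz archives; list their keys before use.
behave_seq = np.load("data/behave/sequences/data_name/object_fit_all.npz", allow_pickle=True)
print("BEHAVE object_fit_all keys:", list(behave_seq.keys()))

grab_seq = np.load("data/grab/raw/grab/s1/seq_name.npz", allow_pickle=True)
print("GRAB sequence keys:", list(grab_seq.keys()))

# OMOMO ships pickled sequence data; if plain pickle fails, try joblib.load instead.
with open("data/omomo/raw/test_diffusion_manip_seq_joints24.p", "rb") as f:
    omomo_test = pickle.load(f)
print("OMOMO test sequences:", len(omomo_test))

# InterCap stores per-segment results as res.pkl (assumed to be a plain pickled object).
with open("data/intercap/raw/01/01/Seg_id/res.pkl", "rb") as f:
    intercap_res = pickle.load(f)
print("InterCap res.pkl type:", type(intercap_res))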
- Data Processing
After organizing the raw data, execute the following steps to process the datasets into our standard representations.
- Run the processing scripts for each dataset:
python process/process_behave.py
python process/process_grab.py
python process/process_intercap.py
python process/process_omomo.py
- Canonicalize the object mesh:
python process/canonicalize_obj.py
- Segment the sequences according to annotations and generate associated text files:
python process/process_text.py
python process/process_text_omomo.py
After processing, the directory structure under data/ should include all sub-datasets:
data
├── annotation
├── behave
│   ├── objects
│   │   └── object_name
│   │       └── object_name.obj
│   └── sequences
│       └── id
│           ├── human.npz
│           ├── object.npz
│           └── text.txt
├── omomo
│   ├── objects
│   │   └── object_name
│   │       └── object_name.obj
│   └── sequences
│       └── id
│           ├── human.npz
│           ├── object.npz
│           └── text.txt
├── intercap
│   ├── objects
│   │   └── object_name
│   │       └── object_name.obj
│   └── sequences
│       └── id
│           ├── human.npz
│           ├── object.npz
│           └── text.txt
└── grab
    ├── objects
    │   └── object_name
    │       └── object_name.obj
    └── sequences
        └── id
            ├── human.npz
            ├── object.npz
            └── text.txt
- Canonicalize the human data by running:
python process/canonicalize_human.py
# or the multi-threaded version for speedup
python process/canonicalize_human_multi_thread.py
- Sample object keypoints:
python process/sample_obj.py
- Extract motion representations:
python process/motion_representation.py
- Process the object BPS (basis point set) representation for training:
python process/process_bps.py
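For reference, the sketch below illustrates the basis point set (BPS) idea behind this last step: each object point cloud is encoded as the distance from every point in a fixed basis set to its nearest object point. This is a conceptual illustration only; process/process_bps.py relies on the bps_torch package with its own basis and parameters.

import numpy as np

rng = np.random.default_rng(0)

# Fixed basis point set shared across all objects (size and radius are illustrative).
n_basis = 1024
basis = rng.uniform(-1.0, 1.0, size=(n_basis, 3))

# Sampled object point cloud, e.g. sample_points.npy from the data tree above, shape (N, 3).
points = np.load("data/neuraldome/objects/baseball/sample_points.npy")

# BPS feature: distance from each basis point to its nearest object point.
diff = basis[:, None, :] - points[None, :, :]        # (n_basis, N, 3)
dists = np.linalg.norm(diff, axis=-1).min(axis=1)    # (n_basis,)
print("BPS feature shape:", dists.shape)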
To load and explore our data, please refer to the demo notebook.
To visualize the dataset, execute the following steps:
- Run the visualization script:
python visualization/visualize.py [dataset_name]
Replace [dataset_name] with one of the following: behave, neuraldome, intercap, omomo, grab, imhd, chairs.
- To visualize markers, run:
python visualization/visualize_markers.py
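If you only want a quick spot-check without the full visualization pipeline, the minimal sketch below views one frame of markers with trimesh. This is not the repository's visualizer; the sequence path, frame index, and marker array layout are assumptions.

import numpy as np
import trimesh

seq_dir = "data/neuraldome/sequences_canonical/subject01_baseball_0"  # illustrative sequence
markers = np.load(f"{seq_dir}/markers.npy")  # assumed layout: (num_frames, num_markers, 3)

frame = 0
cloud = trimesh.points.PointCloud(markers[frame].reshape(-1, 3))
trimesh.Scene(cloud).show()  # opens an interactive viewer window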
If you find this repository useful for your work, please cite:
@inproceedings{xu2025interact,
title = {{InterAct}: Advancing Large-Scale Versatile 3D Human-Object Interaction Generation},
author = {Xu, Sirui and Li, Dongting and Zhang, Yucheng and Xu, Xiyan and Long, Qi and Wang, Ziyin and Lu, Yunzhi and Dong, Shuchang and Jiang, Hezi and Gupta, Akshat and Wang, Yu-Xiong and Gui, Liang-Yan},
booktitle = {CVPR},
year = {2025},
}
Please also consider citing the specific sub-dataset you used from InterAct as follows:
@inproceedings{taheri2020grab,
title = {{GRAB}: A Dataset of Whole-Body Human Grasping of Objects},
author = {Taheri, Omid and Ghorbani, Nima and Black, Michael J. and Tzionas, Dimitrios},
booktitle = {ECCV},
year = {2020},
}
@inproceedings{brahmbhatt2019contactdb,
title = {{ContactDB}: Analyzing and Predicting Grasp Contact via Thermal Imaging},
author = {Brahmbhatt, Samarth and Ham, Cusuh and Kemp, Charles C. and Hays, James},
booktitle = {CVPR},
year = {2019},
}
@inproceedings{bhatnagar2022behave,
title = {{BEHAVE}: Dataset and Method for Tracking Human Object Interactions},
author = {Bhatnagar, Bharat Lal and Xie, Xianghui and Petrov, Ilya and Sminchisescu, Cristian and Theobalt, Christian and Pons-Moll, Gerard},
booktitle = {CVPR},
year = {2022},
}
@article{huang2024intercap,
title = {{InterCap}: Joint Markerless {3D} Tracking of Humans and Objects in Interaction from Multi-view {RGB-D} Images},
author = {Huang, Yinghao and Taheri, Omid and Black, Michael J. and Tzionas, Dimitrios},
journal = {IJCV},
year = {2024}
}
@inproceedings{huang2022intercap,
title = {{InterCap}: {J}oint Markerless {3D} Tracking of Humans and Objects in Interaction},
author = {Huang, Yinghao and Taheri, Omid and Black, Michael J. and Tzionas, Dimitrios},
booktitle = {GCPR},
year = {2022},
}
@inproceedings{jiang2023full,
title = {Full-Body Articulated Human-Object Interaction},
author = {Jiang, Nan and Liu, Tengyu and Cao, Zhexuan and Cui, Jieming and Zhang, Zhiyuan and Chen, Yixin and Wang, He and Zhu, Yixin and Huang, Siyuan},
booktitle = {ICCV},
year = {2023}
}
@inproceedings{zhang2023neuraldome,
title = {{NeuralDome}: A Neural Modeling Pipeline on Multi-View Human-Object Interactions},
author = {Zhang, Juze and Luo, Haimin and Yang, Hongdi and Xu, Xinru and Wu, Qianyang and Shi, Ye and Yu, Jingyi and Xu, Lan and Wang, Jingya},
booktitle = {CVPR},
year = {2023},
}
@article{li2023object,
title = {Object Motion Guided Human Motion Synthesis},
author = {Li, Jiaman and Wu, Jiajun and Liu, C Karen},
journal = {ACM Trans. Graph.},
year = {2023}
}
@inproceedings{zhao2024imhoi,
author = {Zhao, Chengfeng and Zhang, Juze and Du, Jiashen and Shan, Ziwei and Wang, Junye and Yu, Jingyi and Wang, Jingya and Xu, Lan},
title = {{I'M HOI}: Inertia-aware Monocular Capture of 3D Human-Object Interactions},
booktitle = {CVPR},
year = {2024},
}