This is the official implementation of our CVPR paper CARI4D.
Authors: Xianghui Xie, Bowen Wen, Yan Chang, Hesam Rabeti, Jiefeng Li, Ye Yuan, Gerard Pons-Moll, Stan Birchfield
- Feb 28, 2026, code released.
- Dec 16, 2025, ArXiv released.
- Demo on internet video.
- Demo on BEHAVE video.
- Evaluation on BEHAVE dataset.
- Example training.
Environment setup option 1: Docker.
docker pull xiexh20/cari4d && docker tag xiexh20/cari4d cari4d
# Or, to build from scratch:
cd docker/ && docker build --network host -t cari4d . && cd ..
bash docker/run_container.sh

Environment setup option 2: conda (experimental)
conda create -n cari4d python=3.10 -y
conda activate cari4d
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu124
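After installing PyTorch, a quick optional sanity check (not from the original instructions) that the CUDA wheel imports correctly:

```shell
# Optional check: print the PyTorch version and whether a GPU is visible.
# Falls back to a message instead of failing if the import does not work.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())" \
  || echo "PyTorch is not importable; re-check the pip install above"
```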
pip install -r requirements.txt --no-build-isolation

Additional files and model checkpoints:
# Clone unidepth and keep only the submodule
git clone https://github.com/lpiccinelli-eth/UniDepth.git && mv UniDepth/unidepth . && rm -rf UniDepth
# Clone VolumetricSMPL and apply the SMPL-H patch
git clone https://github.com/markomih/VolumetricSMPL.git && cd VolumetricSMPL && git apply ../scripts/volumetric_smplh.patch && find . -maxdepth 1 -type f -delete && mv VolumetricSMPL/*.py . && rm -r VolumetricSMPL && cd ../
# Download NLF model weights
mkdir -p weights && wget -O weights/nlf_l_multi_0.3.2.torchscript https://github.com/isarandi/nlf/releases/download/v0.3.2/nlf_l_multi_0.3.2.torchscript

Download the FoundationPose model weights and place them under the weights/ folder.
Step 1: Demo data. Download the demo data from here and place it inside data/: unzip cari4d-demo.zip -d data.
Step 2: SMPL-H model files.
Follow the instructions on the website to download the SMPL-H pickle files and place them under data/smpl/smplh. The layout should look like this:
data/smpl
├── kid_template.npy
├── smplh
│ ├── SMPLH_female.pkl
│ ├── SMPLH_male.pkl

The kid_template.npy comes from the AGORA project (smpl_kid_template.npy).
Step 3: model checkpoint. Download the pretrained CoCoNet checkpoint and extract it to experiments/; you should then have the file experiments/cari4d-release/step031397.pth in the current folder.
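Before running a demo, a small sketch (not part of the release scripts) to confirm the files from the steps above are all in place; the list mirrors the paths named in this README:

```shell
# Print "ok" or "MISSING" for each required file.
check_files() {
  for f in "$@"; do
    [ -f "$f" ] && echo "ok: $f" || echo "MISSING: $f"
  done
}
check_files \
  data/smpl/kid_template.npy \
  data/smpl/smplh/SMPLH_female.pkl \
  data/smpl/smplh/SMPLH_male.pkl \
  weights/nlf_l_multi_0.3.2.torchscript \
  experiments/cari4d-release/step031397.pth
```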
We provide the pre-processed data in the cari4d-demo.zip file, which corresponds to this YouTube video. Please download the original video yourself using tools like this, then rename the downloaded file and place it at data/cari4d-demo/wild/videos/Date03_Sub01_gas_wild002.0.color.mp4. The video should have a resolution of 608x1080 to be compatible with our pre-processed data. You can then run our demo with:
bash scripts/demo-wild.sh

The BEHAVE demo data is self-contained; you can run it directly with:
bash scripts/demo.sh

After running the demo on the BEHAVE data, you can evaluate the reconstruction with:
python tools/eval_normalize.py split_file=splits/demo-behave.json result_dir=output/opt/cari4d-release+step031397_demo-hy3d3-optv2

Note that you need to download the packed GT files from here and extract them into output/gt/*.pth before running the evaluation.
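A quick hedged check (not part of the release scripts) that the packed GT files were extracted to the expected location before evaluating:

```shell
# Count extracted GT files under output/gt/ (path as given in the note above).
n=$(ls output/gt/*.pth 2>/dev/null | wc -l)
if [ "$n" -gt 0 ]; then
  echo "found $n GT file(s) under output/gt/"
else
  echo "no GT files under output/gt/; download and extract them first"
fi
```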
Please see this doc for detailed step-by-step instructions.
We provide additional files to support easy reproduction of the results on BEHAVE test sequences:
- Reconstructed object meshes: download here and place them under data/cari4d-demo/meshes.
- OpenPose predictions: download here and place them under data/cari4d-demo/behave/packed.

splits/selected-views-map.json provides the camera view of each sequence we used to report test performance.
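To see which camera view is used for a given sequence, the view map can be inspected directly. This is a sketch assuming a flat sequence-to-view key/value layout in the JSON, which you should verify against the actual file:

```shell
# Print the first few sequence -> view entries from the split file,
# or a message if the file is not present yet.
[ -f splits/selected-views-map.json ] && python3 -c '
import json
views = json.load(open("splits/selected-views-map.json"))
for seq, view in sorted(views.items())[:5]:
    print(f"{seq}: view {view}")
' || echo "splits/selected-views-map.json not found"
```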
We thank Yu-Wei Chao, Umar Iqbal, Chenran Li, Yixiao Wang, and Daniel Zou for helpful discussions during the project, and John Welsh for his help with the code release. This project is built on top of these amazing research projects:
- UniDepth for metric-scale depth estimation.
- NLF and GENMO for human pose estimation.
- Hunyuan3D for object mesh reconstruction.
- FoundationPose for object pose estimation and tracking.
@inproceedings{xie2026cari4d,
title = {CARI4D: Category Agnostic 4D Reconstruction of Human-Object Interaction},
author = {Xie, Xianghui and Wen, Bowen and Chang, Yan and Rabeti, Hesam and Li, Jiefeng and Yuan, Ye and Pons-Moll, Gerard and Birchfield, Stan},
booktitle = {Conference on Computer Vision and Pattern Recognition ({CVPR})},
month = {June},
year = {2026},
}