
yangcaoai/VGGT-Det-CVPR2026


📖 VGGT-Det: Mining VGGT Internal Priors for Sensor-Geometry-Free Multi-View Indoor 3D Object Detection (CVPR 2026)

🔥Please star VGGT-Det ⭐ and share it. Thanks🔥

Yang Cao*, Feize Wu*, Dave Zhenyu Chen, Yingji Zhong, Lanqing Hong, Dan Xu#
The Hong Kong University of Science and Technology, Huawei, Sun Yat-Sen University

🚩 Updates

☑ The training and testing code for ScanNet is released.

☑ The pretrained models and training logs are released here.

☑ The processed ARKitScenes datasets are released here.

☑ The processed ScanNet datasets are released here.

☑ The paper is released on Hugging Face and arXiv.

☑ VGGT-Det has been accepted to CVPR 2026. The paper and code will be released soon.

Motivation

Framework

Visualization of Attention-Guided Query Generation

Installation

  • Install mmdetection3d
  • Install torch-scatter: pip install torch-scatter==2.1.2 -f https://data.pyg.org/whl/torch-2.1.0%2Bcu118.html
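The torch-scatter wheel index in the command above is tied to one PyTorch / CUDA pair (torch 2.1.0 with cu118). The sketch below shows how that URL is assembled, so it is easier to see which part encodes which version. The function name `pyg_wheel_url` is hypothetical, and the URL pattern is inferred from the single URL given here; whether a wheel exists for any other pair, or whether VGGT-Det works with it, is not confirmed by this README.

```shell
# Build the wheel index URL for a given torch version and CUDA tag.
# Assumption: the pattern generalizes from the one URL used in the install step.
pyg_wheel_url() {
  torch_version="$1"   # e.g. 2.1.0
  cuda_tag="$2"        # e.g. cu118
  # "%%2B" prints a literal "%2B", the URL-encoded "+" between version and CUDA tag.
  printf 'https://data.pyg.org/whl/torch-%s%%2B%s.html\n' "$torch_version" "$cuda_tag"
}

# Print (not run) the install command for the versions pinned in this README.
echo "pip install torch-scatter==2.1.2 -f $(pyg_wheel_url 2.1.0 cu118)"
```

Printing the command first makes it possible to compare the versions against the local PyTorch install before anything is downloaded.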

Dataset preparation

Please download the datasets from here.

Then run the preparation script on the downloaded *.tar files:

bash data_preparation.sh
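Large dataset downloads are sometimes truncated, so it can help to confirm that each archive is readable before running the script. The following is a minimal sketch, not part of the repository: `check_tars` is a hypothetical helper, and the assumption that the archives sit in a single directory is ours, since this README does not say where `data_preparation.sh` expects them.

```shell
# Check that every .tar archive in a directory can be listed by tar.
# Returns non-zero if no archive is found or if any archive is unreadable.
check_tars() {
  dir="${1:-.}"
  rc=0
  found=0
  for archive in "$dir"/*.tar; do
    [ -e "$archive" ] || continue        # the glob matched nothing
    found=1
    if tar -tf "$archive" >/dev/null 2>&1; then
      echo "ok      $archive"
    else
      echo "broken  $archive"
      rc=1
    fi
  done
  if [ "$found" -eq 0 ]; then
    echo "no .tar files in $dir"
    rc=1
  fi
  return "$rc"
}

# Example use (assumes the archives are in the current directory):
#   check_tars . && bash data_preparation.sh
```

Listing an archive reads it end to end, so this catches truncated files, but it does not verify that the contents match what the authors uploaded.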

Evaluation

Download the pretrained models here. Then run:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash tools/dist_test.sh projects/VGGTDet/config/vggtdet_scannet.py VGGT-Det-Pretrained-Models/ScanNet/epoch_180.pth 8

Training

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash tools/dist_train.sh projects/VGGTDet/config/vggtdet_scannet.py 8 
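Both launch commands above list 8 devices in `CUDA_VISIBLE_DEVICES` and pass `8` as the last argument. On a machine with fewer GPUs the two values need to stay in sync. The sketch below derives the count from the device list and prints the resulting commands; it assumes the trailing argument is the number of processes, as in the standard MMDetection3D launch scripts. `count_gpus` and the 4-GPU device list are hypothetical.

```shell
# Count the comma-separated device IDs in a CUDA_VISIBLE_DEVICES-style string.
count_gpus() {
  printf '%s\n' "$1" | tr ',' '\n' | grep -c '[0-9]'
}

DEVICES="0,1,2,3"                      # hypothetical: a 4-GPU machine
NUM_GPUS="$(count_gpus "$DEVICES")"
CONFIG="projects/VGGTDet/config/vggtdet_scannet.py"
CHECKPOINT="VGGT-Det-Pretrained-Models/ScanNet/epoch_180.pth"

# Print the commands instead of running them, so they can be inspected first.
echo "CUDA_VISIBLE_DEVICES=$DEVICES bash tools/dist_test.sh $CONFIG $CHECKPOINT $NUM_GPUS"
echo "CUDA_VISIBLE_DEVICES=$DEVICES bash tools/dist_train.sh $CONFIG $NUM_GPUS"
```

Training on fewer GPUs may change the effective batch size, so results could differ from the released training logs unless the config is adjusted accordingly.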

📜 BibTeX

If VGGT-Det is helpful, please cite:

@inproceedings{cao2026vggtdet,
  title={VGGT-Det: Mining VGGT Internal Priors for Sensor-Geometry-Free Multi-View Indoor 3D Object Detection},
  author={Cao, Yang and Wu, Feize and Chen, Dave Zhenyu and Zhong, Yingji and Hong, Lanqing and Xu, Dan},
  booktitle={CVPR},
  year={2026}
}

📧 Contact

If you have any questions, please email yangcao.cs@gmail.com.

📜 Sincere Acknowledgement

We thank the following works for their great contributions:

VGGT: Inspired our study of sensor-geometry-free 3D detection.

MVSDet, NeRF-Det and MMDet3D: Serve as the foundation for our code.

ScanNet and ARKitScenes: Serve as the datasets for training and evaluation.

About

Official code for CVPR 2026 paper: VGGT-Det: Mining VGGT Internal Priors for Sensor-Geometry-Free Multi-View Indoor 3D Object Detection
