GitHub - facebookresearch/univlg: Unifying 2D and 3D Vision-Language Understanding

Unifying 2D and 3D Vision-Language Understanding

Ayush Jain*¹,², Alexander Swerdlow*¹, Yuzhou Wang¹, Sergio Arnaud², Ada Martin², Alexander Sax², Franziska Meier², Katerina Fragkiadaki¹

¹ Carnegie Mellon University ² Meta AI

Project Updates

News (2025/02/25): We achieved 1st place on the ScanRefer localization leaderboard!

Hugging Face models

The UniVLG checkpoints are available on Hugging Face.

Getting Started

To install the dependencies, see docs/INSTALL.md.

For instructions on how to download and pre-process the data, see docs/DATA.md.

Checkpoints

mkdir ckpts
uvx --with hf_transfer --from huggingface_hub huggingface-cli download katefgroup/UniVLG --include "univlg.pth" --local-dir ckpts

To download the 3D-only model, replace univlg.pth with univlg_3d_only.pth in the command above. Alternatively, to download all checkpoints, run:

uvx --with hf_transfer --from huggingface_hub huggingface-cli download katefgroup/UniVLG --local-dir ckpts
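The same downloads can also be scripted from Python. The following is a minimal sketch, assuming the `huggingface_hub` package is installed; the helper names (`checkpoint_filename`, `download_checkpoint`, `download_all_checkpoints`) and the variant labels `"full"` and `"3d_only"` are illustrative and are not part of the UniVLG codebase.

```python
from pathlib import Path

# Repository and file names taken from the commands above.
REPO_ID = "katefgroup/UniVLG"
CHECKPOINTS = {
    "full": "univlg.pth",             # default 2D+3D model
    "3d_only": "univlg_3d_only.pth",  # 3D-only model
}


def checkpoint_filename(variant="full"):
    """Map a variant label (hypothetical naming) to its checkpoint file name."""
    try:
        return CHECKPOINTS[variant]
    except KeyError:
        raise ValueError(
            f"unknown variant {variant!r}; expected one of {sorted(CHECKPOINTS)}"
        ) from None


def download_checkpoint(variant="full", local_dir="ckpts"):
    """Download a single checkpoint into local_dir and return its local path."""
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    Path(local_dir).mkdir(parents=True, exist_ok=True)
    return hf_hub_download(
        repo_id=REPO_ID,
        filename=checkpoint_filename(variant),
        local_dir=local_dir,
    )


def download_all_checkpoints(local_dir="ckpts"):
    """Download every file in the model repository into local_dir."""
    from huggingface_hub import snapshot_download

    Path(local_dir).mkdir(parents=True, exist_ok=True)
    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling `download_checkpoint("3d_only")` would fetch `univlg_3d_only.pth` into `ckpts`, mirroring the file substitution described above. Both download functions require network access.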

Training and Evaluation

See docs/RUN.md for training and evaluation commands.

Citation

To cite our work, please use the following:

@article{jain2025unifying,
  title={Unifying 2D and 3D Vision-Language Understanding},
  author={Jain, Ayush and Swerdlow, Alexander and Wang, Yuzhou and Arnaud, Sergio and Martin, Ada and Sax, Alexander and Meier, Franziska and Fragkiadaki, Katerina},
  journal={arXiv preprint arXiv:2503.10745},
  year={2025}
}

Credits

Notice

The majority of UniVLG is licensed under CC-BY-NC; however, portions of the project are available under separate license terms: Odin and Pointcept are each licensed under the MIT license.

