🔥 AR-1-to-3: Single Image to Consistent 3D Object via Next-View Prediction

[Paper] [Project Page] [Jittor Version] [Demo]

🚩 Todo List

📚 BibTeX

If you have any questions about our AR-1-to-3, feel free to contact us via [email protected].
If our work is helpful to you or gives you some inspiration, please star this project and cite our paper. Thank you!

@inproceedings{zhang2025ar,
  title={AR-1-to-3: Single Image to Consistent 3D Object via Next-View Prediction},
  author={Zhang, Xuying and Zhou, Yupeng and Wang, Kai and Wang, Yikai and Li, Zhen and Jiao, Shaohui and Zhou, Daquan and Hou, Qibin and Cheng, Ming-Ming},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={26273--26283},
  year={2025}
}

⚙️ Setup

1. Dependencies and Installation

We recommend using Python>=3.10, PyTorch>=2.1.0, and CUDA>=12.1.

conda create --name ar123 python=3.10
conda activate ar123
pip install -U pip

# Ensure Ninja is installed
conda install Ninja

# Install the correct version of CUDA
conda install cuda -c nvidia/label/cuda-12.1.0

# Install PyTorch and xformers
# You may need to install another xformers version if you use a different PyTorch version
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
pip install xformers==0.0.22.post7

# For Linux users: Install Triton 
pip install triton

# Install other requirements
pip install -r requirements.txt
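
After installation, a quick sanity check of the environment can save debugging time later. The snippet below is a minimal sketch, not part of this repository: the helper names (`parse_version`, `meets_minimum`, `check_environment`) are hypothetical, and it only compares the installed Python and PyTorch versions against the minimums recommended above. It does not verify the CUDA toolkit.

```python
import re
import sys
from importlib import metadata

# Minimum package versions, taken from the recommendation above.
MINIMUMS = {"torch": "2.1.0"}


def parse_version(text):
    """Turn '2.1.0+cu121' into (2, 1, 0), ignoring local build suffixes."""
    numbers = [int(n) for n in re.findall(r"\d+", text.split("+")[0])][:3]
    numbers += [0] * (3 - len(numbers))  # pad '2.1' to (2, 1, 0)
    return tuple(numbers)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def check_environment(minimums=MINIMUMS):
    """Return {package: (installed_version_or_None, ok)} for each package."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, meets_minimum(installed, minimum))
    return report


if __name__ == "__main__":
    python_ok = sys.version_info >= (3, 10)
    print(f"python: {sys.version.split()[0]} ({'ok' if python_ok else 'too old'})")
    for name, (installed, ok) in check_environment().items():
        status = "ok" if ok else "check this package"
        print(f"{name}: {installed or 'not installed'} ({status})")
```

Running the file inside the activated `ar123` environment prints one line per check.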

2. Downloading Datasets

We provide our rendered Objaverse subset, produced under the Zero123++ configuration, to facilitate reproducibility and further research. Please download it and place it in zero123plus_renders.

😃😃😃 We rendered and assembled this dataset with Blender. For beginners unfamiliar with Blender, we also provide mesh rendering code that runs automatically from the command line. Please refer to the render README for more details.

3. Downloading Checkpoints

Download checkpoints and put them into ckpts.

⚡ Quick Start

1. Multi-View Synthesis

To synthesize multiple novel views from a single-view image, run:

CUDA_VISIBLE_DEVICES=0 python run.py --base configs/ar123_infer.yaml --input_path examples/c912d471c4714ca29ed7cf40bc5b1717.png --mode itomv

2. MV-to-3D Generation

To generate a 3D asset from the synthesized multi-view images, run:

CUDA_VISIBLE_DEVICES=0 python run.py --base configs/ar123_infer.yaml --input_path examples/c912d471c4714ca29ed7cf40bc5b1717.png --mode mvto3d

3. Image-to-3D Generation

You can also obtain a 3D asset directly from a single-view image by running:

CUDA_VISIBLE_DEVICES=0 python run.py --base configs/ar123_infer.yaml --input_path examples/c912d471c4714ca29ed7cf40bc5b1717.png --mode ito3d
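
To process many images, the three commands above can be driven from a script. The following is a minimal sketch that assumes it runs from the repository root: `build_command` and `run_inference` are hypothetical helpers, not part of this repository, and only the flags and mode names shown above are used. Whether run.py accepts other modes is not covered here.

```python
import os
import subprocess

# The three modes shown in the commands above.
MODES = ("itomv", "mvto3d", "ito3d")


def build_command(input_path, mode, config="configs/ar123_infer.yaml"):
    """Build the argument list for one run.py call."""
    if mode not in MODES:
        raise ValueError(f"unknown mode {mode!r}, expected one of {MODES}")
    return [
        "python", "run.py",
        "--base", config,
        "--input_path", input_path,
        "--mode", mode,
    ]


def run_inference(input_path, mode, gpu="0"):
    """Run one inference call with CUDA_VISIBLE_DEVICES set to `gpu`."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    return subprocess.run(build_command(input_path, mode), env=env, check=True)
```

For example, `run_inference("examples/c912d471c4714ca29ed7cf40bc5b1717.png", "ito3d")` launches the same command as the Image-to-3D step on GPU 0.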

🚀 Training

To train the default model, please run:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python train.py \
    --base configs/ar123_train.yaml \
    --gpus 0,1,2,3,4,5,6,7 \
    --num_nodes 1

Arguments:

--base: Path to the configuration file
--gpus: IDs of the GPU devices to use
--num_nodes: Number of nodes to use
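
When the number of available GPUs differs from the eight used above, the command can be assembled programmatically. This is a minimal sketch under the assumption that devices are numbered from 0 and that `CUDA_VISIBLE_DEVICES` and `--gpus` carry the same list, as in the command above. `build_train_command` is a hypothetical helper, not part of this repository.

```python
import os
import subprocess


def build_train_command(num_gpus, num_nodes=1, config="configs/ar123_train.yaml"):
    """Return (command, env_overrides) for a single train.py launch."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    gpus = ",".join(str(i) for i in range(num_gpus))  # e.g. "0,1,2,3"
    command = [
        "python", "train.py",
        "--base", config,
        "--gpus", gpus,
        "--num_nodes", str(num_nodes),
    ]
    return command, {"CUDA_VISIBLE_DEVICES": gpus}


def launch_training(num_gpus, num_nodes=1):
    """Start training from the repository root and wait for it to finish."""
    command, overrides = build_train_command(num_gpus, num_nodes)
    return subprocess.run(command, env=dict(os.environ, **overrides), check=True)
```

Calling `build_train_command(8)` reproduces the default command shown above.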

🤖 Evaluation

1. 2D Evaluation (PSNR, SSIM, CLIP Score, LPIPS)

Please refer to eval_2d.py.
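
As a reminder of what the simplest of these metrics measures, here is a minimal sketch of PSNR using the standard definition, 10 * log10(MAX^2 / MSE). It is illustrative only and is not the implementation in eval_2d.py: the `psnr` function, the flat-sequence input format, and the default `max_value=1.0` are assumptions made for this example.

```python
import math


def psnr(reference, prediction, max_value=1.0):
    """Peak signal-to-noise ratio between two equal-length pixel sequences.

    Pixels are assumed to be floats in [0, max_value], flattened to 1D.
    """
    if len(reference) != len(prediction) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - p) ** 2 for r, p in zip(reference, prediction)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate that the synthesized view is closer to the ground-truth view.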

2. 3D Evaluation (Chamfer Distance, F-Score)

Please refer to eval_3d.py.
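
For readers new to these metrics, the sketch below shows one common way to compute them on two point sets. It is illustrative only and is not the implementation in eval_3d.py. The convention here (unsquared Euclidean distances, Chamfer Distance as the sum of the two directional means, and a user-chosen F-Score threshold) is an assumption, and conventions differ between papers. The brute-force nearest-neighbor search is only suitable for small point sets.

```python
import math


def nearest_distances(source, target):
    """For each point in `source`, the distance to its nearest point in `target`."""
    return [min(math.dist(p, q) for q in target) for p in source]


def chamfer_distance(pred, gt):
    """Sum of the mean nearest-neighbor distances in both directions."""
    pred_to_gt = nearest_distances(pred, gt)
    gt_to_pred = nearest_distances(gt, pred)
    return sum(pred_to_gt) / len(pred_to_gt) + sum(gt_to_pred) / len(gt_to_pred)


def f_score(pred, gt, threshold):
    """Harmonic mean of precision and recall at a distance threshold."""
    pred_to_gt = nearest_distances(pred, gt)
    gt_to_pred = nearest_distances(gt, pred)
    # Precision: share of predicted points close to the ground truth.
    precision = sum(d <= threshold for d in pred_to_gt) / len(pred_to_gt)
    # Recall: share of ground-truth points close to the prediction.
    recall = sum(d <= threshold for d in gt_to_pred) / len(gt_to_pred)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Both functions take sequences of 3D points, such as points sampled from the predicted and ground-truth mesh surfaces. Lower Chamfer Distance and higher F-Score indicate better geometry.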

🤗 Acknowledgements

We thank the authors of the following projects for their excellent contributions to 3D generative AI!

In addition, we would like to express our sincere thanks to Jiale Xu for his invaluable assistance here.
