CyberAgentAILab/MVSCPS

Neural Multi-View Self-Calibrated Photometric Stereo without Photometric Stereo Cues

ICCV 2025



Overview

(Teaser figure)

MVSCPS jointly recovers geometry, reflectance, and lighting from multi-view one-light-at-a-time (OLAT) images, featuring:

  • No light calibration required
  • Single-stage, end-to-end optimization, with no intermediate photometric stereo step
  • Flexible camera-light configurations. In the extreme case, the camera and light source can move independently for each shot.

Getting Started

Environment Setup

Installation (Linux, uv, CUDA 11.8)

chmod +x install_env.sh
./install_env.sh

What this installs

  • A Python 3.9 virtual environment: .venv-mvscps
  • If uv is not detected, the script automatically installs it.
  • PyTorch 2.5.1, torchvision 0.20.1 (CUDA 11.8 wheels)
  • Source-built extensions: tiny-cuda-nn (Torch bindings) and nerfacc
  • Training/config stack: pytorch-lightning 1.9.5, hydra-core, omegaconf
  • Visualization & scientific libs: matplotlib, pyvista, open3d, opencv-python, imageio[ffmpeg], scipy, scikit-image, trimesh, lpips, tensorboard, wandb, huggingface_hub, etc.

After successfully setting up the environment, you should see the following output:

Python: 3.9.23
[OK] Torch 2.5.1+cu118 | CUDA: True
[OK] VTK 9.2.6 | PyVista 0.37.0
[OK] PyVista import: OK
[OK] Trimesh ray engine: trimesh.ray.ray_pyembree.RayMeshIntersector
[OK] nerfacc 0.3.3 | path: /MVSCPS/.venv-mvscps/lib/python3.9/site-packages/nerfacc
[OK] nerfacc CUDA extension import: nerfacc.cuda
[OK] tinycudann forward OK on cuda: output shape (128, 16)

If you are using Docker, including the following lines in your Dockerfile should be sufficient:

FROM --platform=linux/amd64 docker.io/nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04

# Install dependencies
RUN apt-get update && apt-get -y install python3-pip vim cmake openssh-server build-essential htop nvtop git wget curl unzip zip bash-completion sudo libgl1-mesa-glx xvfb rsync tmux libglib2.0-0 libbz2-dev ffmpeg libsm6 libxext6 && \
    ln -s /usr/bin/python3 /usr/bin/python

Data Preparation

Download and preprocess the DiLiGenT-MV dataset (~7GB):

. data/prepare_data_diligentmv.sh

This script downloads the DiLiGenT-MV dataset into data/DiLiGenT-MV_origin, reorganizes the file structure under data/DiLiGenT-MV, and computes the scene normalization parameters for training.
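The README does not spell out what "scene normalization" means here; a common approach in neural-SDF pipelines such as instant-nsr-pl is to rescale the scene so the object of interest fits inside the unit sphere. A minimal NumPy sketch under that assumption (the function name, the margin, and the use of a point set such as camera centers are all illustrative, not the repository's actual code):

```python
import numpy as np

def compute_scene_normalization(points: np.ndarray, margin: float = 1.1):
    """Fit a bounding sphere to 3D points (e.g., camera centers or a
    coarse point cloud) and return a (center, scale) pair that maps the
    scene into the unit sphere. `margin` leaves slack at the boundary."""
    center = points.mean(axis=0)
    radius = np.linalg.norm(points - center, axis=1).max() * margin
    return center, 1.0 / radius

# Usage: normalize a toy point set into the unit sphere.
pts = np.array([[2.0, 0.0, 0.0], [-2.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
center, scale = compute_scene_normalization(pts)
normalized = (pts - center) * scale
assert np.linalg.norm(normalized, axis=1).max() <= 1.0
```

Applying the stored (center, scale) to camera poses at load time keeps all training scenes in a consistent coordinate range regardless of capture setup.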

Download and preprocess our self-collected dataset (~60GB):

. data/prepare_data_mvscps.sh

This script downloads our self-collected data from HuggingFace Dataset and calculates the scene normalization parameters for training.

Training

. ./launch_diligentmv.sh

This script trains MVSCPS on all 5 scenes in the DiLiGenT-MV dataset sequentially. For each scene, OLAT images captured from 18 views under 32 lights per view are used for training. Training takes about 15 minutes per scene on a single NVIDIA A100 GPU. Testing, BRDF map rendering, and relighting are performed after training and take about 20-30 minutes per scene. The trained models and results are saved in exp/diligentmv.

Empirical signs that training is converging:

  • test/mae_light drops below 3 degrees within 800 iterations.
  • test/mae_normal drops below 10 degrees within 3000 iterations.
  • train/inv_s increases to above 2000 within 10000 iterations.
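The two mae_* metrics above are mean angular errors in degrees (for the estimated light directions and surface normals respectively), and train/inv_s is presumably the NeuS-style inverse standard deviation, where larger values indicate a sharper surface. Assuming that reading, the normal metric can be computed as follows (a sketch, not the repository's own evaluation code):

```python
import numpy as np

def mean_angular_error_deg(n_pred: np.ndarray, n_gt: np.ndarray) -> float:
    """Mean angle in degrees between predicted and ground-truth normals.
    Both arrays have shape (N, 3); rows are normalized before comparison."""
    n_pred = n_pred / np.linalg.norm(n_pred, axis=1, keepdims=True)
    n_gt = n_gt / np.linalg.norm(n_gt, axis=1, keepdims=True)
    cos = np.clip((n_pred * n_gt).sum(axis=1), -1.0, 1.0)
    return float(np.degrees(np.arccos(cos)).mean())

# Two normals off by 0 and 90 degrees -> mean error of 45 degrees.
pred = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])
gt = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]])
print(mean_angular_error_deg(pred, gt))  # → 45.0
```

The clip before arccos guards against floating-point dot products slightly outside [-1, 1], which would otherwise produce NaNs.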

For training on our self-collected dataset, run the following command:

. ./launch_mvscps.sh

This script trains MVSCPS on all 6 scenes in our self-collected dataset sequentially. The training takes about 80 minutes per scene on a single NVIDIA A100 GPU. The trained models and results are saved in exp/mvscps.

Tips

  • Coordinate system. We follow the OpenCV convention: x → right, y → down, z → forward.
  • Controlling train/val/test subsets. We use a plain-text index file to specify which image subsets are used for training/validation/testing. Sample files are provided under configs/view_light_indices. You can prepare your own file and set dataset.train.view_light_index_fname and dataset.train.view_light_index_file in the config file.
  • Using your own data. Please refer to the folder structure in our HuggingFace Dataset and the preprocessing script data/preprocess_data_mvscps.py. After your data is prepared, implement your custom image loader in dataloader/load_fn, then configure dataset.img_load_fn, dataset.img_ext, and dataset.img_dirname in configs/conf/mvscps.yaml. The remaining settings in this config file should work for your custom data without changes.
  • RAW size mismatch & cropping. When loading RAW images with RawPy, the resulting image can be slightly larger than the size recorded in EXIF. For this reason our loader applies a small crop (code reference). Note that the required crop region varies by camera vendor (we confirmed differences between Sony and Canon). A practical way to determine the correct crop for your camera is to compare the RAW-rendered image with its in-camera JPEG (e.g., visualize their difference) and adjust the crop until edge discrepancies disappear.
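The coordinate-system tip above matters when importing poses from tools that use the OpenGL convention (x right, y up, z backward). A hedged NumPy sketch of the usual conversion, flipping the camera's y and z axes in a camera-to-world matrix (the function name is illustrative; this is not a utility from this repository):

```python
import numpy as np

def opengl_to_opencv_c2w(c2w: np.ndarray) -> np.ndarray:
    """Convert a 4x4 camera-to-world pose from the OpenGL convention
    (x right, y up, z backward) to OpenCV (x right, y down, z forward)
    by negating the camera's y and z axis columns."""
    out = c2w.copy()
    out[:3, 1] *= -1.0  # y axis: up -> down
    out[:3, 2] *= -1.0  # z axis: backward -> forward
    return out

# An identity OpenGL pose (camera looking along world -z) keeps looking
# along world -z, but its OpenCV forward axis (+z column) now encodes it.
pose_cv = opengl_to_opencv_c2w(np.eye(4))
assert np.allclose(pose_cv[:3, 2], [0.0, 0.0, -1.0])
```

Only the rotation columns change; the camera center in the last column is convention-independent and stays untouched.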

Acknowledgements

We thank the open-source project instant-nsr-pl, distributed under the MIT license.

Copyright (c) 2022 Yuanchen Guo

License

This project is licensed under the CC BY-NC 4.0 license. You may use, share, and adapt the material for non-commercial purposes with appropriate credit.

Citation

@inproceedings{mvscps2025cao,
  title = {Neural Multi-View Self-Calibrated Photometric Stereo without Photometric Stereo Cues},
  author = {Cao, Xu and Taketomi, Takafumi},
  year = {2025},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
}
