- Fast, feed-forward, and self-supervised dynamic scene reconstruction from sparse multi-view sequences
- Learns 3D Gaussians and scene flow jointly; supports real-time rendering (once Gaussians are generated) and motion segmentation
- Outperforms per‑scene optimization and other generalizable models by +4 dB PSNR on dynamic regions while being significantly faster
Tested with CUDA 12.1, PyTorch 2.3, and an NVIDIA A100. Replace the CUDA/PyTorch versions as needed for your environment.
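If you need a matching PyTorch build, the official CUDA 12.1 wheels can be installed directly (an illustrative command, not part of this repo's instructions; see pytorch.org for the variant matching your system):

```bash
# illustrative: install a PyTorch 2.3 build compiled against CUDA 12.1
pip install torch==2.3.0 --index-url https://download.pytorch.org/whl/cu121
```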
```bash
# clone project
git clone https://github.com/NVlabs/GaussianSTORM.git
cd GaussianSTORM

# create conda environment
conda create -n storm python=3.10 -y
conda activate storm

# install python dependencies
pip install -r requirements.txt

# install gsplat (for batch-wise rendering support)
pip install git+https://github.com/nerfstudio-project/gsplat.git@2b0de894232d21e8963179a7bbbd315f27c52c9c
# if the above fails, drop the commit hash:
# pip install git+https://github.com/nerfstudio-project/gsplat.git
```
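As a quick sanity check (our suggestion, not an official step), you can confirm that PyTorch and gsplat import cleanly and that CUDA is visible:

```bash
# expected output: the torch version followed by True on a working GPU setup
python -c "import torch, gsplat; print(torch.__version__, torch.cuda.is_available())"
```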
Note: installing `gsplat` can be machine-dependent and sometimes tricky. If you encounter issues, please refer to the original [gsplat](https://github.com/nerfstudio-project/gsplat) repository for troubleshooting.
We provide a tiny subset of the Waymo Open Dataset (3 sequences) for quick experimentation:
```bash
# download dataset subset (≈ 600 MB)
gdown 14fapsAGoMCQ5Ky82cg2X6bk-mLQ7fdCF
tar -xf STORM_subset.tar.gz

# run single-GPU inference demo
python inference.py \
    --project storm_playground --exp_name visualization \
    --data_root data/STORM_subset \
    --model STORM-B/8 --num_motion_tokens 16 \
    --use_sky_token --use_affine_token \
    --load_depth --load_flow --load_ground \
    --load_from $CKPT_PTH
```
`$CKPT_PTH` refers to the checkpoint path. We cannot share an official checkpoint at this moment; please refer to the issues page for an unofficial checkpoint.
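Once you have obtained a checkpoint, point the variable at it before running the demo (the path below is purely illustrative):

```bash
# hypothetical location; substitute the actual path to your checkpoint
export CKPT_PTH=/path/to/storm_checkpoint.pth
```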
- To prepare the full Waymo Open Dataset, please refer to Waymo Data.
- We have not yet included instructions for preparing the NuScenes and Argoverse2 datasets; we may add them as capacity allows.
Multi-GPU example that reproduces the paper's STORM-B/8 model:
```bash
# global batch size = num_gpus * batch_size = 8 * 4 = 32
# (we used a global batch size of 64 for the main experiments)
# replace --data_root below with your own data root
torchrun --nproc_per_node=8 main_storm.py \
    --project 0504_storm \
    --exp_name 0504_pixel_storm \
    --data_root ../storm2.3/data/STORM2 \
    --batch_size 4 --num_iterations 100000 --lr_sched constant \
    --model STORM-B/8 --num_motion_tokens 16 \
    --use_sky_token --use_affine_token \
    --load_depth --load_flow --load_ground \
    --enable_depth_loss --enable_flow_reg_loss --flow_reg_coeff 0.005 --enable_sky_opacity_loss \
    --enable_perceptual_loss --perceptual_loss_start_iter 5000 \
    --enable_wandb \
    --auto_resume
```
Tips:
- Checkpoints and logs are saved at `work_dirs/<project>/<exp_name>`.
- `batch_size` is per-GPU; the global batch size = `batch_size` × #GPUs × #nodes.
- For additional arguments, see `main_storm.py`.
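For example, to keep a global batch size of 32 on fewer GPUs, scale the per-GPU batch accordingly (a sketch; all omitted flags are the same as in the command above):

```bash
# 4 GPUs x batch_size 8 = global batch 32, matching the 8-GPU command above
torchrun --nproc_per_node=4 main_storm.py \
    --batch_size 8 \
    ...  # remaining arguments unchanged
```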
To evaluate, run the same entry point with the `--evaluate` flag:

```bash
# replace --data_root below with your own data root
torchrun --nproc_per_node=8 main_storm.py \
    --project 0504_storm \
    --exp_name 0504_pixel_storm \
    --data_root ../storm2.3/data/STORM2 \
    --batch_size 4 --num_iterations 100000 --lr_sched constant \
    --model STORM-B/8 --num_motion_tokens 16 \
    --use_sky_token --use_affine_token \
    --load_depth --load_flow --load_ground \
    --enable_depth_loss --enable_flow_reg_loss --flow_reg_coeff 0.005 --enable_sky_opacity_loss \
    --enable_perceptual_loss --perceptual_loss_start_iter 5000 \
    --auto_resume \
    --evaluate  # this flag switches to evaluation mode
```
- [ ] Viewers
```bibtex
@inproceedings{yang2025storm,
  title     = {STORM: Spatio-Temporal Reconstruction Model for Large-Scale Outdoor Scenes},
  author    = {Jiawei Yang and Jiahui Huang and Yuxiao Chen and Yan Wang and Boyi Li and Yurong You and Maximilian Igl and Apoorva Sharma and Peter Karkus and Danfei Xu and Boris Ivanovic and Yue Wang and Marco Pavone},
  booktitle = {ICLR},
  year      = {2025}
}
```
This project is licensed under the NVIDIA License. See the LICENSE file for details.
Our implementation builds upon [gsplat](https://github.com/nerfstudio-project/gsplat). We thank the authors for open-sourcing their excellent work.