
Finalize MCMC strategy and some tiny fixes #3548

Merged (12 commits into nerfstudio-project:main, Jan 3, 2025)

Conversation

@KevinXu02 (Contributor) commented Dec 14, 2024

Finalizes the MCMC strategy (#3436), fixes the bilateral-grid (bilagrid) learning rate for splatfacto-big (#3383), and makes some small changes to the COLMAP dataparser (it now automatically iterates over possible COLMAP paths; see the sketch below).
Tested on the bicycle scene with random/SfM initialization.
To use: ns-train splatfacto-mcmc
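
The COLMAP path auto-detection mentioned above is roughly this idea (a minimal sketch, not the PR's actual code; the candidate directory list and the helper name are assumptions):

```python
from pathlib import Path

# Hypothetical candidate locations; the PR's actual search order may differ.
CANDIDATE_COLMAP_DIRS = ["colmap/sparse/0", "sparse/0", "colmap/sparse", "sparse"]

def find_colmap_dir(data_dir: Path) -> Path:
    """Return the first candidate sub-directory that contains a COLMAP reconstruction."""
    for rel in CANDIDATE_COLMAP_DIRS:
        candidate = data_dir / rel
        # A COLMAP sparse model is identified by its cameras file (binary or text).
        if (candidate / "cameras.bin").exists() or (candidate / "cameras.txt").exists():
            return candidate
    raise FileNotFoundError(f"No COLMAP reconstruction found under {data_dir}")
```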

@pablovela5620 (Contributor) commented:

Tried to use this but got a CUDA out-of-memory error with MCMC (torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.47 GiB.), whereas I had no problems with vanilla splatfacto.

@KevinXu02 (Contributor, Author) commented:

> Tried to use this but got a CUDA out-of-memory error with MCMC (torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.47 GiB.), whereas I had no problems with vanilla splatfacto.

Yes, this can happen since MCMC uses a fixed number of Gaussians for the scene. Could you please try adding --pipeline.model.max_cap [max num of gs] to your training command? A common value is 1000000, but you can make it smaller.
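
For reference, a full command might look roughly like the sketch below; the flag spelling is the one quoted in this thread and may differ between nerfstudio versions, and the data path is a placeholder:

```bash
ns-train splatfacto-mcmc \
    --data <path/to/your/dataset> \
    --pipeline.model.max_cap 1000000   # lower this value if you still hit CUDA OOM
```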

@pablovela5620 (Contributor) commented:

> Tried to use this but got a CUDA out-of-memory error with MCMC (torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.47 GiB.), whereas I had no problems with vanilla splatfacto.

> Yes, this can happen since MCMC uses a fixed number of Gaussians for the scene. Could you please try adding --pipeline.model.max_cap [max num of gs] to your training command? A common value is 1000000, but you can make it smaller.

Even setting it to 500,000 sadly leads to an OOM error (I'm using a 3060); I had to reduce it to 100,000 Gaussians. That seems like too few(?), but I honestly don't know. I'll do some more testing.

@gradeeterna commented Dec 20, 2024

I'm also getting OOM errors here with --pipeline.model.max-gs-num 1000000 on a 3090 with 24 GB of VRAM. The same dataset works fine with splatfacto, and with MCMC + bilagrid directly in gsplat.

@kerrj (Collaborator) commented Dec 21, 2024

> I'm also getting OOM errors here with --pipeline.model.max-gs-num 1000000 on a 3090 with 24 GB of VRAM. The same dataset works fine with splatfacto, and with MCMC + bilagrid directly in gsplat.

It's possible there are some memory differences between gsplat and nerfstudio because of dataloader overhead. When you run gsplat, is the memory usage very close to 24 GB? It may help to make sure the images are cached on the CPU inside splatfacto-mcmc (there's a FullImageDatamanager parameter for this). If splatfacto-mcmc is taking significantly more memory than gsplat's version, that would be surprising, but I think a difference of ~1 GB is expected. MCMC in general will take more memory than the default strategy, since the re-sampling step is a little memory-hungry.
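
If the FullImageDatamanager config still exposes its image-caching option on the CLI (an assumption worth checking against ns-train splatfacto-mcmc --help), forcing CPU caching would look roughly like this:

```bash
# Assumption: the datamanager accepts a cache-images option with "cpu"/"gpu" values.
ns-train splatfacto-mcmc --data <path/to/your/dataset> --pipeline.datamanager.cache-images cpu
```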

@vectorman1 commented Jan 3, 2025

I've observed the OOM issue using gsplat directly through simple_trainer with the MCMC strategy: same dataset, run N times, it errors out with OOM at some point. I would say it's safe to assume that the nerfstudio code is not the origin.

What is this PR missing at this point in order to get it merged?

@kerrj (Collaborator) left a review:


I think this looks good to me now!

@kerrj enabled auto-merge (squash) on January 3, 2025, 19:31
@kerrj merged commit c7bd953 into nerfstudio-project:main on Jan 3, 2025
3 checks passed
@abrahamezzeddine commented Jan 4, 2025

Great job! Much appreciated!

I just tried this out myself and I'm at about 8 GB of VRAM with default MCMC settings, so no OOM on my end.

However, I still sometimes see this typical haze over the model. Zooming out, you can see incredibly large splats being generated, which I suspect is the reason: with varying opacities, if these splats intersect the model in a specific view (depending on your camera angle/orientation), a large haze covers the model.

[image: with depth, zooming out of the model; you can see how the very large splats completely overshadow the model]
[image: zooming in to the model, albeit outside the region of interest]

Some reference images. Doesn't the model prune very large splats?

@gradeeterna commented Jan 13, 2025

Not getting OOM errors anymore since updating to the latest version!

However, I am also getting these huge floaters/haze covering the entire model in every MCMC scene, which I also saw when using MCMC in gsplat. Unfortunately they are extremely difficult to clean up in post, as they only show up from certain camera positions and are miles away from the main scene.

Here is a video showing the issue: https://www.youtube.com/watch?v=lnjtVM_oRtA

@hnguyen25 commented:

This is great! Thank you for all the work getting this integrated.

I am also seeing the huge floaters/haze in the MCMC scene. Is there a plan to get this fixed anytime soon?

@bchretien commented:

To remove the huge splats, a quick (and probably dirty) workaround is to filter them here by adding large splats to the dead_mask.
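
A minimal sketch of that idea is below; the parameter names, activations, and the size threshold are assumptions, and in gsplat's MCMC strategy the dead mask is normally built from low opacities alone during relocation:

```python
import torch

def build_dead_mask(params: dict[str, torch.Tensor],
                    min_opacity: float = 0.005,
                    max_scale: float = 0.5) -> torch.Tensor:
    """Sketch: mark Gaussians as 'dead' so the MCMC relocation step resamples them.

    Assumes params["opacities"] stores logit opacities and params["scales"] stores
    log scales (splatfacto/gsplat-style); max_scale is a hand-picked world-space
    threshold for what counts as a "huge" splat.
    """
    opacities = torch.sigmoid(params["opacities"]).squeeze(-1)  # (N,)
    scales = torch.exp(params["scales"])                        # (N, 3)

    # Usual criterion: nearly transparent Gaussians are dead.
    dead_mask = opacities <= min_opacity
    # Workaround from this thread: also treat very large splats as dead,
    # so the haze-causing giants get resampled instead of lingering.
    dead_mask |= scales.max(dim=-1).values > max_scale
    return dead_mask
```

The scale threshold would need tuning to the scene's units; too aggressive a value will also cull legitimately large background splats.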
