
Finalize MCMC strategy and some tiny fixes #3548

Merged (12 commits into nerfstudio-project:main, Jan 3, 2025)

Conversation

@KevinXu02 (Contributor) commented Dec 14, 2024

Finalizes the MCMC strategy (#3436), fixes the bilateral-grid (bilagrid) learning rate for splatfacto-big (#3383), and makes some small changes to the COLMAP dataparser (it now automatically iterates over possible COLMAP paths; see the sketch below).
Tested on the bicycle scene with random/SfM initialization.
To use: ns-train splatfacto-mcmc
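
The COLMAP path auto-detection mentioned above is roughly this idea (a minimal sketch, not the PR's actual code; the candidate directory list and the helper name are assumptions):

```python
from pathlib import Path

# Hypothetical candidate locations; the PR's actual search order may differ.
CANDIDATE_COLMAP_DIRS = ["colmap/sparse/0", "sparse/0", "colmap/sparse", "sparse"]

def find_colmap_dir(data_dir: Path) -> Path:
    """Return the first candidate sub-directory that contains a COLMAP reconstruction."""
    for rel in CANDIDATE_COLMAP_DIRS:
        candidate = data_dir / rel
        # A COLMAP sparse model is identified by its cameras file (binary or text).
        if (candidate / "cameras.bin").exists() or (candidate / "cameras.txt").exists():
            return candidate
    raise FileNotFoundError(f"No COLMAP reconstruction found under {data_dir}")
```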

@pablovela5620 (Contributor) commented:

Tried to use this but got a CUDA out-of-memory error with MCMC (torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.47 GiB.), whereas I had no problems with vanilla splatfacto.

@KevinXu02 (Contributor, Author) commented:

> Tried to use this but got a CUDA out-of-memory error with MCMC (torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.47 GiB.), whereas I had no problems with vanilla splatfacto.

Yes, this can happen since MCMC uses a fixed number of Gaussians for the scene. Could you please try adding --pipeline.model.max_cap [max num of gs] to your training command? A common value is 1000000, but you can make it smaller.
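
For reference, a full command might look roughly like the sketch below; the flag spelling is the one quoted in this thread and may differ between nerfstudio versions, and the data path is a placeholder:

```bash
ns-train splatfacto-mcmc \
    --data <path/to/your/dataset> \
    --pipeline.model.max_cap 1000000   # lower this value if you still hit CUDA OOM
```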

@pablovela5620 (Contributor) commented:

> Tried to use this but got a CUDA out-of-memory error with MCMC (torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.47 GiB.), whereas I had no problems with vanilla splatfacto.

> Yes, this can happen since MCMC uses a fixed number of Gaussians for the scene. Could you please try adding --pipeline.model.max_cap [max num of gs] to your training command? A common value is 1000000, but you can make it smaller.

Even setting it to 500,000 sadly leads to an OOM error (I'm using a 3060); I had to reduce it to 100,000 Gaussians. That seems like too few(?), but I honestly don't know. I'll do some more testing.

@gradeeterna commented Dec 20, 2024

I'm also getting OOM errors here with --pipeline.model.max-gs-num 1000000 on a 3090 with 24 GB of VRAM. The same dataset works fine with splatfacto, and with MCMC + bilagrid directly in gsplat.

@kerrj (Collaborator) commented Dec 21, 2024

> I'm also getting OOM errors here with --pipeline.model.max-gs-num 1000000 on a 3090 with 24 GB of VRAM. The same dataset works fine with splatfacto, and with MCMC + bilagrid directly in gsplat.

It's possible there are some memory differences between gsplat and nerfstudio because of dataloader overhead. When you run gsplat, is the memory usage very close to 24 GB? It may help to make sure the images are cached on the CPU inside splatfacto-mcmc (there's a FullImageDatamanager parameter for this). If splatfacto-mcmc is taking significantly more memory than gsplat's version, that would be surprising, but I think a difference of ~1 GB is expected. MCMC in general will take more memory than the default strategy, since the re-sampling step is a little memory-hungry.
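
If the FullImageDatamanager config still exposes its image-caching option on the CLI (an assumption worth checking against ns-train splatfacto-mcmc --help), forcing CPU caching would look roughly like this:

```bash
# Assumption: the datamanager accepts a cache-images option with "cpu"/"gpu" values.
ns-train splatfacto-mcmc --data <path/to/your/dataset> --pipeline.datamanager.cache-images cpu
```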

@vectorman1 commented Jan 3, 2025

I've observed the OOM issue using gsplat directly through simple_trainer with the MCMC strategy: same dataset, run N times, it errors out with OOM at some point. I would say it's safe to assume that the nerfstudio code is not the origin.

What is this PR missing at this point in order to get it merged?

@kerrj (Collaborator) left a review:


I think this looks good to me now!

@kerrj enabled auto-merge (squash) on January 3, 2025, 19:31
@kerrj merged commit c7bd953 into nerfstudio-project:main on Jan 3, 2025
3 checks passed
@abrahamezzeddine commented Jan 4, 2025

Great job! Much appreciated!

I just tried this out myself and I'm at about 8 GB of VRAM with default MCMC settings, so no OOM on my end.

However, I still sometimes see this typical haze over the model. Zooming out, you can see incredibly large splats being generated, which I suspect is the reason: with varying opacities, if these splats intersect the model in a specific view (depending on your camera angle/orientation), a large haze covers the model.

[image: with depth, zooming out of the model; you can see how the very large splats completely overshadow the model]
[image: zooming in to the model, albeit outside the region of interest]

Some reference images. Doesn't the model prune very large splats?

@gradeeterna commented Jan 13, 2025

Not getting OOM errors anymore since updating to the latest version!

However, I am also getting these huge floaters/haze covering the entire model in every MCMC scene, which I also saw when using MCMC in gsplat. Unfortunately they are extremely difficult to clean up in post, as they only show up from certain camera positions and are miles away from the main scene.

Here is a video showing the issue: https://www.youtube.com/watch?v=lnjtVM_oRtA

@hnguyen25 commented:

This is great! Thank you for all the work getting this integrated.

I am also seeing the huge floaters/haze in the MCMC scene. Is there a plan to get this fixed anytime soon?

@bchretien commented:

To remove the huge splats, a quick (and probably dirty) workaround is to filter them here by adding large splats to the dead_mask.
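
A minimal sketch of that idea is below; the parameter names, activations, and the size threshold are assumptions, and in gsplat's MCMC strategy the dead mask is normally built from low opacities alone during relocation:

```python
import torch

def build_dead_mask(params: dict[str, torch.Tensor],
                    min_opacity: float = 0.005,
                    max_scale: float = 0.5) -> torch.Tensor:
    """Sketch: mark Gaussians as 'dead' so the MCMC relocation step resamples them.

    Assumes params["opacities"] stores logit opacities and params["scales"] stores
    log scales (splatfacto/gsplat-style); max_scale is a hand-picked world-space
    threshold for what counts as a "huge" splat.
    """
    opacities = torch.sigmoid(params["opacities"]).squeeze(-1)  # (N,)
    scales = torch.exp(params["scales"])                        # (N, 3)

    # Usual criterion: nearly transparent Gaussians are dead.
    dead_mask = opacities <= min_opacity
    # Workaround from this thread: also treat very large splats as dead,
    # so the haze-causing giants get resampled instead of lingering.
    dead_mask |= scales.max(dim=-1).values > max_scale
    return dead_mask
```

The scale threshold would need tuning to the scene's units; too aggressive a value will also cull legitimately large background splats.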
