Hello!
I've been researching options to find the simplest possible user experience for creating Gaussian Splat scans with minimal input. I found GaussianObject and was really impressed by the results shown in your paper. This is amazing work!
I built a mobile app prototype with an AI-guided capture experience, an upload UI, and a backend that runs the GaussianObject pipeline in the cloud, so all the user has to do is capture 4 images and then receive a scan in the mobile app.
However, during testing I was not able to reproduce the results shown in the paper.
I tried to replicate your setup exactly, using the same images and the default parameters from the GitHub repository.
I tried changing the resolution of the input images and modifying all kinds of parameters in the YAML config, as well as the optimization config in __init__.py, but had no luck achieving the same result.
I would greatly appreciate it if you could take a look at my setup and tell me what I am doing wrong.
My setup:
- As input, I am using the same 4 images you use from the "Kitchen" sample.
- Input images are downscaled client-side to 779 × 520.
- I am not using SAM; I am using another API to create masks automatically, without needing to select the object manually. The alpha masks have the same resolution as the input images.
- Since the images are downscaled client-side, I am using resolution "1".
- I am using the COLMAP-free variant with DUSt3R. (I tried MASt3R, but it did not improve the output.)
- The final result looks OK from exactly the same viewpoints the pictures were taken from, but it does not look correct when rotated.
My sequence:
- Inputs and masks created and resized locally (779 × 520)
- preprocess/pred_monodepth.py (model: ZoeD_NK, resolution: 1)
- pred_poses.py (sparse_num: 4, using DUSt3R)
- train_gs.py (sparse_num: 4, sh_degree: 2, resolution: 1, white_background: True, random_background: True, use_dust3r: True, init_pcd_name: 'visual_hull_4', iterations: 10_000, max_num_splats: 3_000_000)
- leave_one_out_stage1.py (sparse_num: 4, sh_degree: 2, resolution: 1, white_background: True, random_background: True, use_dust3r: True)
- leave_one_out_stage2.py (sparse_num: 4, sh_degree: 2, resolution: 1, white_background: True, random_background: True, use_dust3r: True)
- train_lora.py (prompt: 'Lego Tractor', sparse_num: 4, sh_degree: 2, resolution: 1, bg_white: True, use_dust3r: True)
- train_repair.py (config: gaussian-object-colmap-free.yaml, sparse_num: 4, prompt: 'Lego Tractor', sh_degree: 2, refresh_size: 8, resolution: 1)
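To make the sequence above easier to audit, here is a minimal sketch that writes the per-stage parameters as plain data and checks that the shared ones agree across stages. The values are copied from the list above; the dictionary layout, the SHARED_KEYS tuple, and the check_consistency helper are hypothetical names used only for illustration and are not part of the GaussianObject codebase or its CLI.

```python
# Per-stage parameters copied from the sequence above.
# STAGES, SHARED_KEYS and check_consistency() are hypothetical helpers for
# illustration only; they are not part of the GaussianObject repository.
STAGES = {
    "preprocess/pred_monodepth.py": {"model": "ZoeD_NK", "resolution": 1},
    "pred_poses.py": {"sparse_num": 4},
    "train_gs.py": {
        "sparse_num": 4, "sh_degree": 2, "resolution": 1,
        "white_background": True, "random_background": True,
        "use_dust3r": True, "init_pcd_name": "visual_hull_4",
        "iterations": 10_000, "max_num_splats": 3_000_000,
    },
    "leave_one_out_stage1.py": {
        "sparse_num": 4, "sh_degree": 2, "resolution": 1,
        "white_background": True, "random_background": True,
        "use_dust3r": True,
    },
    "leave_one_out_stage2.py": {
        "sparse_num": 4, "sh_degree": 2, "resolution": 1,
        "white_background": True, "random_background": True,
        "use_dust3r": True,
    },
    "train_lora.py": {
        "prompt": "Lego Tractor", "sparse_num": 4, "sh_degree": 2,
        "resolution": 1, "bg_white": True, "use_dust3r": True,
    },
    "train_repair.py": {
        "config": "gaussian-object-colmap-free.yaml", "sparse_num": 4,
        "prompt": "Lego Tractor", "sh_degree": 2, "refresh_size": 8,
        "resolution": 1,
    },
}

# Parameters that appear in several stages and are expected to match.
SHARED_KEYS = ("sparse_num", "sh_degree", "resolution", "use_dust3r", "prompt")


def check_consistency(stages, keys=SHARED_KEYS):
    """Return {key: {value: [stage names]}} for keys whose values differ.

    Stages that do not list a key are skipped for that key.
    """
    mismatches = {}
    for key in keys:
        seen = {}
        for name, params in stages.items():
            if key in params:
                seen.setdefault(params[key], []).append(name)
        if len(seen) > 1:
            mismatches[key] = seen
    return mismatches


if __name__ == "__main__":
    result = check_consistency(STAGES)
    print(result if result else "shared parameters agree across stages")
```

With the values listed above the check reports no mismatch, so the shared parameters themselves are consistent across stages. It only compares the values as written here; it does not verify what each script actually receives at runtime.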
What am I missing? Would really appreciate any guidance!
And thank you again for the amazing work!
result.mov
