Unable to replicate results from paper

Hello!
I've been researching options for the simplest possible user experience for creating Gaussian Splat scans with minimal input. I found GaussianObject and was really impressed by the results you showed in your paper! This is amazing work!

I built a mobile app prototype with an AI-guided capture experience, an upload UI, and a backend that runs the GaussianObject pipeline in the cloud, so all the user has to do is capture 4 images and then receive the scan in the mobile app.

However, during testing I was not able to achieve the results you show in the paper.
I tried to replicate exactly the same setup you have, using the same images and the default parameters from GitHub.
I also tried changing the resolution of the input images and modifying all kinds of parameters in the YAML config, as well as the optimization config in `__init__.py`, but had no luck achieving the same result.

I would greatly appreciate it if you could take a look at my setup and tell me what I am doing wrong.

My setup:
- As input, I am using the same 4 images you use from the "Kitchen" sample.
- Input images are downscaled client-side to 779 × 520.
- I am not using SAM. Instead, I am using another API to create masks automatically, so the object does not need to be selected manually. The alpha masks have the same resolution as the input images.
- Since the images are already downscaled client-side, I am using resolution "1".
- I am using the COLMAP-free variant with DUSt3R. (I tried MASt3R, but it did not improve the output.)
- The final result looks OK from exactly the same views the pictures were taken from, but it does not look correct when rotated to other views.
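
To rule out a size mismatch between the inputs and the masks described above, a check along these lines could be run before the pipeline starts. This is only a sketch and not part of the GaussianObject repository: the function name `find_size_mismatches` and the view names are hypothetical, and sizes are assumed to be `(width, height)` tuples, for example as returned by Pillow's `Image.open(path).size`.

```python
def find_size_mismatches(image_sizes, mask_sizes, expected=(779, 520)):
    """Return a list of problems; an empty list means all sizes agree.

    image_sizes / mask_sizes map a view name to a (width, height) tuple.
    `expected` is the client-side downscaled size from the setup above.
    """
    problems = []
    for name, size in image_sizes.items():
        # Every input image should already have the downscaled size.
        if size != expected:
            problems.append(f"{name}: image is {size}, expected {expected}")
        # Every image needs a mask with exactly the same size.
        mask_size = mask_sizes.get(name)
        if mask_size is None:
            problems.append(f"{name}: no mask found")
        elif mask_size != size:
            problems.append(f"{name}: mask is {mask_size}, image is {size}")
    return problems


# Hypothetical view names, used for illustration only.
images = {f"view_{i}": (779, 520) for i in range(4)}
masks = {f"view_{i}": (779, 520) for i in range(4)}
print(find_size_mismatches(images, masks))
```

An empty list means every image has the expected size and a mask of the same size.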

My sequence:
1. Input images and masks are created and resized locally (779 × 520)
2. preprocess/pred_monodepth.py (model: ZoeD_NK, resolution: 1)
3. pred_poses.py (sparse_num: 4, using DUSt3R)
4. train_gs.py (sparse_num: 4, sh_degree: 2, resolution: 1, white_background: True, random_background: True, use_dust3r: True, init_pcd_name: 'visual_hull_4', iterations: 10_000, max_num_splats: 3_000_000)
5. leave_one_out_stage1.py (sparse_num: 4, sh_degree: 2, resolution: 1, white_background: True, random_background: True, use_dust3r: True)
6. leave_one_out_stage2.py (sparse_num: 4, sh_degree: 2, resolution: 1, white_background: True, random_background: True, use_dust3r: True)
7. train_lora.py (prompt: 'Lego Tractor', sparse_num: 4, sh_degree: 2, resolution: 1, bg_white: True, use_dust3r: True)
8. train_repair.py (config: gaussian-object-colmap-free.yaml, sparse_num: 4, prompt: 'Lego Tractor', sh_degree: 2, refresh_size: 8, resolution: 1)
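
The settings shared between the steps above can be cross-checked with a small script, to confirm that no stage silently runs with a different value. This is only a sketch and not part of the GaussianObject repository: the dictionary layout and the function name `conflicting_settings` are hypothetical, the values are copied from the list above, and treating `bg_white` as the same setting as `white_background` is an assumption.

```python
# Settings copied from the steps above, keyed by script name.
SHARED = {"sparse_num": 4, "sh_degree": 2, "resolution": 1, "use_dust3r": True}
BACKGROUND = {"white_background": True, "random_background": True}

STAGES = {
    "pred_monodepth.py": {"resolution": 1},
    "pred_poses.py": {"sparse_num": 4},
    "train_gs.py": {**SHARED, **BACKGROUND},
    "leave_one_out_stage1.py": {**SHARED, **BACKGROUND},
    "leave_one_out_stage2.py": {**SHARED, **BACKGROUND},
    "train_lora.py": {**SHARED, "bg_white": True, "prompt": "Lego Tractor"},
    "train_repair.py": {
        "sparse_num": 4, "sh_degree": 2, "resolution": 1,
        "prompt": "Lego Tractor",
    },
}

# Assumption: both names refer to the same white-background setting.
ALIASES = {"bg_white": "white_background"}


def conflicting_settings(stages):
    """Return {setting: {stage: value}} for settings whose values differ."""
    seen = {}
    for stage, params in stages.items():
        for key, value in params.items():
            key = ALIASES.get(key, key)
            seen.setdefault(key, {})[stage] = value
    return {
        key: by_stage
        for key, by_stage in seen.items()
        if len(set(by_stage.values())) > 1
    }


print(conflicting_settings(STAGES))  # prints {} for the settings listed above
```

An empty result only shows that the values passed to each step agree with each other; it says nothing about whether those values match the defaults used for the paper.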

What am I missing? Would really appreciate any guidance!
And thank you again for the amazing work!


https://github.com/user-attachments/assets/12f37f4e-68c9-40ac-b72e-57d6a7afba46

<img width="990" height="733" alt="Image" src="https://github.com/user-attachments/assets/00f07c44-6bbd-4d9e-a887-eba376c4db05" />

<img width="924" height="711" alt="Image" src="https://github.com/user-attachments/assets/03c38c1a-c7f4-48d1-bbd7-69da92b12e66" />

<img width="935" height="716" alt="Image" src="https://github.com/user-attachments/assets/9be3e73e-5554-4da5-938e-42af393a7af6" />

<img width="835" height="676" alt="Image" src="https://github.com/user-attachments/assets/43af9745-9e04-4f70-b2d3-a2c3a306db12" />

<img width="816" height="689" alt="Image" src="https://github.com/user-attachments/assets/1ebe91d4-3c69-41a2-a7b2-9de9537b43fd" />



