
Unable to replicate results from paper #90

Description

@AlexTyu

Hello!
I've been researching options for the simplest possible user experience for people to create Gaussian Splat scans with minimal input. I found GaussianObject and was really impressed by the results you show in your paper. This is amazing work!

I built a mobile app prototype with an AI-guided capture experience, an upload UI, and a backend that runs the GaussianObject pipeline in the cloud, so all the user has to do is capture 4 images and then receive the scan in the mobile app.
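
For context, the backend entry point is roughly shaped like the sketch below. This is only a minimal sketch assuming FastAPI; `run_gaussianobject_pipeline()` is a hypothetical wrapper around the scripts listed further down, and none of these names come from GaussianObject itself.

```python
# Minimal sketch of the backend entry point (assumptions: FastAPI, plus a
# hypothetical run_gaussianobject_pipeline() wrapper around the scripts below).
from pathlib import Path
from uuid import uuid4

from fastapi import BackgroundTasks, FastAPI, File, UploadFile

app = FastAPI()

def run_gaussianobject_pipeline(scene_dir: Path) -> None:
    """Hypothetical wrapper that runs the GaussianObject scripts on scene_dir."""
    ...  # see the orchestration sketch further down

@app.post("/scans")
async def create_scan(background: BackgroundTasks,
                      images: list[UploadFile] = File(...)):
    # The app uploads the 4 captured images; save them and kick off the pipeline.
    scene_dir = Path("scenes") / uuid4().hex
    (scene_dir / "images").mkdir(parents=True, exist_ok=True)
    for i, img in enumerate(images):
        (scene_dir / "images" / f"{i:03d}.jpg").write_bytes(await img.read())
    background.add_task(run_gaussianobject_pipeline, scene_dir)
    return {"scan_id": scene_dir.name}
```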

However, during testing I was not able to achieve the same results you show in the paper.
I tried to replicate exactly the same setup you have, using the same images and the default parameters from GitHub.
I tried changing the resolution of the input images and modifying all kinds of parameters in the YAML config as well as the optimization config from `__init__.py`, but had no luck achieving the same result.

I would greatly appreciate it if you could take a look at my setup and tell me what I am doing wrong.

My setup:

  • As input I am using the same 4 images you use from the "Kitchen" sample.
  • Input images are downscaled client-side to 779 × 520 (see the resizing sketch after this list).
  • I am not using SAM; I am using another API that creates masks automatically, without needing to select the object manually. The alpha masks have the same resolution as the input images.
  • Since the images are downscaled client-side, I am using resolution "1".
  • I am using the COLMAP-free variant with DUSt3R (I tried MASt3R, but it did not improve the output).
  • The final result looks OK when viewed from exactly the same views the pictures were taken from, but when I rotate it, it does not look correct.
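
For reference, here is roughly what the client-side resizing looks like. It is a minimal Pillow sketch; the directory layout and file names are placeholders, and the real masks come from the external segmentation API before being resized to match.

```python
# Sketch of the client-side preprocessing (assumption: Pillow; the mask source
# is an external API, represented here only by pre-existing mask files).
from pathlib import Path
from PIL import Image

TARGET_SIZE = (779, 520)  # width x height, used for both images and alpha masks

def prepare_inputs(raw_dir: Path, out_dir: Path) -> None:
    (out_dir / "images").mkdir(parents=True, exist_ok=True)
    (out_dir / "masks").mkdir(parents=True, exist_ok=True)
    for img_path in sorted(raw_dir.glob("*.jpg")):
        # Downscale the capture to the resolution fed into the pipeline.
        img = Image.open(img_path).convert("RGB").resize(TARGET_SIZE, Image.LANCZOS)
        img.save(out_dir / "images" / img_path.name)

        # The real masks come from a third-party segmentation API; here they are
        # just resized so they match the downscaled image resolution exactly.
        mask = Image.open(raw_dir / "masks" / f"{img_path.stem}.png").convert("L")
        mask = mask.resize(TARGET_SIZE, Image.NEAREST)
        mask.save(out_dir / "masks" / f"{img_path.stem}.png")
```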

My sequence (a small orchestration sketch follows the list):

  1. Inputs and Masks created and resized locally (779 × 520)
  2. preprocess/pred_monodepth.py (model: ZoeD_NK, resolution: 1)
  3. pred_poses.py (sparse_num: 4, using DUSt3R)
  4. train_gs.py (sparse_num: 4, sh_degree: 2, resolution: 1, white_background: True, random_background: True, use_dust3r: True, init_pcd_name: 'visual_hull_4', iterations: 10_000, max_num_splats: 3_000_000)
  5. leave_one_out_stage1.py (sparse_num: 4, sh_degree: 2, resolution: 1, white_background: True, random_background: True, use_dust3r: True)
  6. leave_one_out_stage2.py (sparse_num: 4, sh_degree: 2, resolution: 1, white_background: True, random_background: True, use_dust3r: True)
  7. train_lora.py (prompt: 'Lego Tractor', sparse_num: 4, sh_degree: 2, resolution: 1, bg_white: True, use_dust3r: True)
  8. train_repair.py (config: gaussian-object-colmap-free.yaml, sparse_num: 4, prompt: 'Lego Tractor', sh_degree: 2, refresh_size: 8, resolution: 1)
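
The backend chains these steps roughly as below. This is only a sketch: I pass the parameters through a small helper as `--key value` flags, so the flag spellings are my assumption and may not match the scripts' actual CLIs, but the values are exactly the ones listed above.

```python
# Rough orchestration of the steps above via subprocess. The flag spelling
# (--sparse_num, --resolution, ...) is an assumption on my side; the values
# are the ones from the sequence above.
import subprocess
import sys

def run(script: str, **params) -> None:
    cmd = [sys.executable, script]
    for key, value in params.items():
        if isinstance(value, bool):
            if value:
                cmd.append(f"--{key}")       # boolean switches
        else:
            cmd += [f"--{key}", str(value)]  # everything else as --key value
    subprocess.run(cmd, check=True)

common = dict(sparse_num=4, sh_degree=2, resolution=1,
              white_background=True, random_background=True, use_dust3r=True)

run("preprocess/pred_monodepth.py", model="ZoeD_NK", resolution=1)
run("pred_poses.py", sparse_num=4)                      # DUSt3R pose init
run("train_gs.py", **common, init_pcd_name="visual_hull_4",
    iterations=10_000, max_num_splats=3_000_000)
run("leave_one_out_stage1.py", **common)
run("leave_one_out_stage2.py", **common)
run("train_lora.py", prompt="Lego Tractor", sparse_num=4, sh_degree=2,
    resolution=1, bg_white=True, use_dust3r=True)
run("train_repair.py", config="gaussian-object-colmap-free.yaml",
    sparse_num=4, prompt="Lego Tractor", sh_degree=2,
    refresh_size=8, resolution=1)
```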

What am I missing? Would really appreciate any guidance!
And thank you again for the amazing work!

result.mov
