Description
Hi,
Thanks for your great work. I tried to test it on the Redwood bedroom dataset (http://redwood-data.org/indoor_lidar_rgbd/index.html) with downsampled RGB-D images (from 21,930 frames to 219, at 640x480 resolution), using both the original point cloud (~5M points) and a downsampled one (~100k points), but I cannot get reasonable outputs. After filtering, it seems that only the first frame's result remains; I verified the camera pose by reprojecting the first frame's depth into the scene scan point cloud. The log reports 580 prompts in the 3D proposal stage, 51 remaining after the 2D-guided filter, and 15 after prompt consolidation.
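For reference, this is roughly how I ran the pose check (a minimal sketch, assuming Redwood's millimeter depth PNGs, PrimeSense default intrinsics, and a per-frame 4x4 camera-to-world pose file; the paths and loaders are mine, not from this repo):

```python
import numpy as np
import open3d as o3d

# Assumed PrimeSense defaults for the Redwood 640x480 frames.
fx, fy, cx, cy = 525.0, 525.0, 319.5, 239.5

# Redwood depth PNGs store millimeters; convert to meters.
depth = np.asarray(o3d.io.read_image("depth/000001.png")).astype(np.float32) / 1000.0
pose = np.loadtxt("pose/000001.txt")  # assumed 4x4 camera-to-world matrix

# Back-project valid depth pixels into camera space, then into world space.
v, u = np.nonzero(depth > 0)
z = depth[v, u]
pts_cam = np.stack([(u - cx) * z / fx, (v - cy) * z / fy, z], axis=1)
pts_world = (pose[:3, :3] @ pts_cam.T).T + pose[:3, 3]

# Overlay the reprojected frame (red) on the scene scan to eyeball the pose.
frame = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(pts_world))
frame.paint_uniform_color([1.0, 0.0, 0.0])
scan = o3d.io.read_point_cloud("bedroom.ply")
o3d.visualization.draw_geometries([scan, frame])
```

If the red points line up with the scan, the pose and intrinsic conventions should match.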
One point I am not sure I got right: utils/main_utils.py:transform_pt_depth_scannet_torch() requires bx and by from the camera intrinsic matrix. I don't know what they mean, so I set them both to 0.
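For context, here is how I interpreted it (a sketch of my assumption, not the actual body of transform_pt_depth_scannet_torch()): I treated bx/by like KITTI-style baseline offsets added after the pinhole unprojection, which would make 0 the right value for a single RGB-D sensor:

```python
import torch

def backproject_depth(depth, fx, fy, cx, cy, bx=0.0, by=0.0):
    """depth: (H, W) tensor in meters -> (N, 3) camera-space points."""
    v, u = torch.nonzero(depth > 0, as_tuple=True)
    z = depth[v, u]
    # bx/by assumed to be metric offsets (zero for a single RGB-D camera).
    x = (u - cx) * z / fx + bx
    y = (v - cy) * z / fy + by
    return torch.stack([x, y, z], dim=1)
```

Is that the intended meaning, or do bx/by come from somewhere else in the ScanNet calibration?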
Could you provide any insights on refining the results, e.g. lowering the image resolution for SAM or changing the filter parameters?
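One thing I did when lowering the resolution, in case it matters (a minimal sketch; the helper is mine, and I assume the intrinsics must be scaled by the same factor so the 2D-3D projection stays consistent):

```python
import cv2

def resize_rgbd(rgb, depth, K, scale=0.5):
    """Resize an RGB-D pair and scale the 3x3 intrinsic matrix to match."""
    h, w = rgb.shape[:2]
    size = (int(w * scale), int(h * scale))
    rgb_s = cv2.resize(rgb, size, interpolation=cv2.INTER_LINEAR)
    # Nearest-neighbor keeps depth values valid (no blending across edges).
    depth_s = cv2.resize(depth, size, interpolation=cv2.INTER_NEAREST)
    K_s = K.copy()
    K_s[0, 0] *= scale; K_s[1, 1] *= scale  # fx, fy
    K_s[0, 2] *= scale; K_s[1, 2] *= scale  # cx, cy
    return rgb_s, depth_s, K_s
```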
Final segmented point cloud: the floor is segmented well, but the other parts seem to cover only the area around the first frame's viewpoint:

