Skip to content

Gripper segment custom models #5

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

evelyn-fu
Copy link
Contributor

@evelyn-fu evelyn-fu commented Apr 11, 2025

Add support for using custom SAM2 or Grounding DINO models for gripper segmentation

Failed segmentation using default Grounding DINO + SAM2 models:
070890

Successful segmentation using fine tuned Grounding DINO + SAM2 models:
070890

.gitignore Outdated
@@ -5,3 +5,4 @@ checkpoints
__pycache__
outputs
.vscode
~/
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope, forgot to remove that after adding it in from a local git annoyance

We provide fine tuned networks for SAM2 and GroundingDINO for the segmentation and annotation
of the gripper used in our provided dataset which can be downloaded from [here](https://mitprod-my.sharepoint.com/personal/nepfaff_mit_edu/_layouts/15/onedrive.aspx?id=%2Fpersonal%2Fnepfaff%5Fmit%5Fedu%2FDocuments%2Fscalable%5Freal2sim%5Fmodel%5Fweights&ga=1).

Please put the downloaded checkpoint files in the `./checkpoints` directory.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:nit: That directory doesn't currently exist. It might make sense to create it and put a .gitignore file inside that includes all files inside this directory apart from the directory itself (you can use ! to exclude the dir).

To fine tune your own segmentation model for your gripper, see [these instructions](https://github.com/facebookresearch/sam2/blob/main/training/README.md) for training from the
SAM2 Official Github.

An example of segmentation failure on the gripper with default models: \
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate a bit here for why it can fail without fine-tuning? It could be worth mentioning that our particular gripper seems to be out of distribution for SAM2, and thus it looses track of it for long videos. This can also be solved with re-prompting it after failure, segmenting the video in parts but fine-tuning removes this extra step.

README.md Outdated
SAM2 Official Github.

An example of segmentation failure on the gripper with default models: \
<img src="assets/mask_sam2_default.png" width="200">
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:nit: Maybe "_failure" is more appropriate than "_default".

@@ -108,6 +108,30 @@ automatic annotations using DINO work well in many cases but can struggle with t
gripper masks. All downstream object tracking and reconstruction results are sensitive
to the segmentation quality and thus spending a bit of effort here might be worthwhile.

##### Gripper masking with fine tuned models
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make sense to also make this an option for the run_asset_generation.py pipeline. Maybe add a store_true argument to enable it.
In that case, we have to think about where to put the installation instructions. We could keep them here but refer to them here when explaining that argument.

@@ -112,7 +117,22 @@ def segment_moving_obj_data(
"Text prompt must be provided if GUI frames are not specified."
)

sam2_checkpoint = "./checkpoints/sam2_hiera_large.pt"
if txt_prompt == "gripper":
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would again check if the install was successful by adding MMDET_AVAILABLE to the if statement. You could print another warning here that mentions that we are falling back to the default model (if `txt_prompt == "gripper" frame).

sam2_checkpoint = gripper_sam2_path

if not os.path.exists(sam2_checkpoint):
logging.info(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:nit: This should be an error.

)
if txt_prompt == "gripper":
# Use mmdetection api for gripper
with autocast(enabled=False):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need autocost if we set enabled=False?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure why this caused an error if I didn't have this here

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, I'd add a comment that mentions that it is needed to avoid errors

@@ -653,6 +712,16 @@ def get_dino_boxes(text, frame_idx):
except Exception as e:
logging.error(f"Error deleting {f}: {e}")

# save black masks for gripper not found frames
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:nit: Starting with lower case breaks the file convention

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, adding a bit more detail to that comment might be helpful here. Am I understanding it correctly that you are writing black masks until the gripper appears in the frame?

input_boxes, confidences, class_names = get_dino_boxes(
txt_prompt, txt_prompt_index
)
while True:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:nit: Could you add a comment for what this is doing?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants