Describe your feature request
The BopWriter generates mask and mask_visib with pyrender. To calculate mask_visib, it uses the bop_toolkit, which essentially compares the depth rendered by pyrender against the scene depth rendered by Blender and thresholds the difference with a delta value.
This leads to problems with mask_visib, most importantly that the mask_visib of a sample is not a true segmentation: a single pixel position can belong to multiple objects. This happens for obvious reasons, especially with flat objects that lie close together, as shown in the images below, where the red pixels belong to more than one mask_visib.
Another problem with this approach is the presence of artefacts in mask_visib, as shown in the images below. I suspect these occur when the foreground depth changes too rapidly from one pixel to the next. If models are trained directly on this signal, they will learn to predict these artefacts.
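To illustrate why the overlap happens, here is a minimal sketch of the depth-thresholding idea (the function name and the delta value are illustrative, not the actual bop_toolkit code): any object whose depth lies within delta of the scene depth is counted as visible, so two thin objects closer than delta can both claim the same pixel.

```python
import numpy as np

def estimate_visib_mask(depth_scene, depth_obj, delta=0.015):
    """Approximate mask_visib by depth thresholding (illustrative sketch).

    A pixel counts as visible if the object is rendered there and its
    depth is within `delta` of the full scene depth, i.e. it is not
    occluded by anything noticeably closer.
    """
    mask_full = depth_obj > 0
    return mask_full & (depth_obj <= depth_scene + delta)

# Two flat objects whose depths differ by less than delta at pixel 0:
depth_scene = np.array([[1.0, 2.0]])   # closest surface per pixel
depth_a = np.array([[1.0, 0.0]])       # object A covers pixel 0
depth_b = np.array([[1.01, 0.0]])      # object B is 1 cm behind A there

mask_a = estimate_visib_mask(depth_scene, depth_a)
mask_b = estimate_visib_mask(depth_scene, depth_b)
# Both masks claim pixel 0 -> mask_visib is not a true segmentation.
overlap = mask_a & mask_b
```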
Describe a possible solution
Instead of approximating the visible mask through depth, one could directly use the segmentation rendered by Blender. If segmentation output is enabled by calling bproc.renderer.enable_segmentation_output, the rendered data contains instance_segmaps and instance_attribute_maps that the BopWriter could use directly.
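A sketch of the proposed direction (the helper below is hypothetical, not part of BlenderProc): since each pixel of an instance_segmap carries exactly one instance id, splitting it into per-object binary masks yields masks that are disjoint by construction, eliminating both the overlaps and the depth-threshold artefacts described above.

```python
import numpy as np

def masks_from_instance_segmap(segmap):
    """Split an integer instance segmap into per-instance binary masks.

    Every pixel holds exactly one instance id, so the resulting masks
    are mutually exclusive by construction -- unlike depth-thresholded
    mask_visib, no pixel can belong to two objects.
    """
    instance_ids = [i for i in np.unique(segmap) if i != 0]  # 0 = background
    return {i: segmap == i for i in instance_ids}

# Toy 2x2 segmap: background, instance 1, and instance 2.
segmap = np.array([[0, 1],
                   [2, 2]])
masks = masks_from_instance_segmap(segmap)
```

In BlenderProc, such a segmap would come from `data["instance_segmaps"]` after calling `bproc.renderer.enable_segmentation_output` and rendering, with `data["instance_attribute_maps"]` mapping instance ids back to object attributes.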


