Dear authors, thank you for your excellent work. I have a minor question: why does the self-supervised GDR-Net model achieve better results on certain LineMOD objects (e.g., Duck, Iron, Phone) than the same GDR-Net model trained with ground-truth labels?