We need to evaluate the models more extensively. For example, add object detection metrics, topology metrics, etc.