A. Trained Models

A. Yilmaz edited this page Nov 20, 2024 · 3 revisions

Annotation and prediction results of the trained models for the strawberry and tomato use cases are shown below.

strawberry annotation and prediction

Strawberry: Original Image (Col 1, Row 1), Depth Image (Col 1, Row 2), Annotated Image (Col 2, Row 1), Predicted Image (Col 2, Row 2), Annotated Depth (Col 3, Row 1), Predicted Depth (Col 3, Row 2)

tomato annotation and prediction

Tomato: Original Image (Col 1, Row 1), Depth Image (Col 1, Row 2), Annotated Image (Col 2, Row 1), Predicted Image (Col 2, Row 2), Annotated Depth (Col 3, Row 1), Predicted Depth (Col 3, Row 2)


The pre-trained models are described below in terms of the training datasets, the methodology followed, and the performance evaluation results.

The dataset

Strawberry

This dataset contains strawberry images of two varieties, Driscoll's Katrina and Driscoll's Zara, grown at the University of Lincoln at Riseholme during the summer of 2021 and the fall of 2022. A total of 380 images are annotated twice: the first annotation uses a single class label (fruit), and the second uses two class labels (ripe and unripe). Both annotations are provided in the dataset. The image and annotation attributes are given in Table 1.

| Attribute | Value |
| --- | --- |
| Camera name | Framos (20), Realsense (360) |
| Camera type | RGB-D |
| Spatial resolution | 1280x720 (359) / 640x480 (20) |
| Image count | 380 |
| Image distribution (train, test, val) | (65, 25, 10) % |
| Class distribution (ripe, unripe) | (47, 53) % |
| Illuminant | Daylight |
| Fruit count | 11456 |
| Fruit distribution (train, test, val) | (70, 20, 10) % |
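The page does not say how the images were assigned to the train, test and validation sets. As a minimal sketch, assuming a seeded random shuffle over image IDs (the function name `split_ids` and the seed are hypothetical), a 65/25/10 split could be produced like this:

```python
import random


def split_ids(image_ids, percents=(65, 25, 10), seed=0):
    """Shuffle image IDs and cut them into train/test/val by integer percentages.

    Assumption: a plain random split. The actual procedure used for the
    datasets is not documented on this page.
    """
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_train = len(ids) * percents[0] // 100
    n_test = len(ids) * percents[1] // 100
    train = ids[:n_train]
    test = ids[n_train:n_train + n_test]
    val = ids[n_train + n_test:]  # whatever remains goes to validation
    return train, test, val


train, test, val = split_ids(range(380))
print(len(train), len(test), len(val))
```

Because the number of fruits varies from image to image, a split by image count will generally give a slightly different split by fruit count, which is in line with the two distribution rows in the table differing.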

Tomato

The dataset comprises tomato images from a greenhouse (Flavourfresh), a polytunnel (UoL) and a package house (Flavourfresh). The images were taken between August 2023 and August 2024. The variety is Piccolo vine tomatoes. As with the strawberry dataset, the annotations are provided in two COCO JSON files, one with a single label (fruit) and the other with two labels (ripe and unripe).

| Attribute | Value |
| --- | --- |
| Camera names | Framos, Realsense, Desptech 4K, iPhone 8 |
| Camera type | RGB, RGB-D |
| Spatial resolution | 1280x720 (21), 1920x1080 (40), 3840x2160 (21), 2048x1536 (70) |
| Image count | 151 |
| Image distribution (train, test, val) | (67, 23, 10) % |
| Class distribution (ripe, unripe) | (36, 64) % |
| Illuminant | Daylight |
| Fruit count | 14821 |
| Fruit distribution (train, test, val) | (68.2, 22.7, 9.1) % |
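Since the annotations are distributed as COCO JSON files, a class distribution like the one in the table can be recomputed from the ripeness annotation file. The sketch below assumes the percentages are taken over annotated fruit instances (the page does not say whether they are per instance or per image) and uses only the standard COCO fields `categories` and `annotations`. The file name in the usage comment is hypothetical.

```python
import json
from collections import Counter


def class_distribution(coco):
    """Return {category name: percentage of annotations} for a parsed COCO dict."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    counts = Counter(names[ann["category_id"]] for ann in coco["annotations"])
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.items()}


# Tiny in-memory example in COCO layout (illustrative values only).
example = {
    "categories": [{"id": 1, "name": "ripe"}, {"id": 2, "name": "unripe"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1},
        {"id": 2, "image_id": 1, "category_id": 2},
        {"id": 3, "image_id": 2, "category_id": 2},
        {"id": 4, "image_id": 2, "category_id": 2},
    ],
}
print(class_distribution(example))

# With a real file (hypothetical path):
# with open("ripeness_annotations.json") as f:
#     print(class_distribution(json.load(f)))
```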

Methodology

Each model is trained on the annotations with Detectron2 Mask R-CNN. Training starts from a base Mask R-CNN model, pre-trained on ImageNet data and then trained on COCO, which is further trained on the dataset. The SGD optimizer is used for 60000 iterations. The distribution of the train, test and validation sets is given in Table 1. The augmentations applied during training are: (a) random crop, (b) random flip, (c) random brightness within a given range, and (d) random contrast within a given range.
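A setup along these lines could be sketched with Detectron2's config and transform APIs. Several values here are assumptions, because the page does not state them: the base config `mask_rcnn_R_50_FPN_3x`, the dataset name passed in, and every augmentation range. Only the 60000 iterations, the SGD optimizer (Detectron2's default solver) and the four augmentation types come from the description above. The Detectron2 imports are kept inside the functions so the helper that lists the overrides can be used without Detectron2 installed.

```python
def training_overrides(train_name, num_classes, max_iter=60000):
    """Flat key/value list in the form accepted by cfg.merge_from_list."""
    return [
        "DATASETS.TRAIN", (train_name,),
        "DATASETS.TEST", (),
        "MODEL.ROI_HEADS.NUM_CLASSES", num_classes,  # 1 for fruit, 2 for ripe/unripe
        "SOLVER.MAX_ITER", max_iter,  # 60000 iterations, as stated above
    ]


def build_cfg(train_name, num_classes,
              base="COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"):
    """Build a config that starts from COCO-trained Mask R-CNN weights.

    Assumption: the R50-FPN 3x model-zoo entry; the backbone actually
    used for the published models is not documented on this page.
    """
    from detectron2 import model_zoo
    from detectron2.config import get_cfg

    cfg = get_cfg()  # default solver is SGD with momentum
    cfg.merge_from_file(model_zoo.get_config_file(base))
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(base)
    cfg.merge_from_list(training_overrides(train_name, num_classes))
    return cfg


def build_augmentations():
    """The four augmentation types named above; all ranges are placeholders."""
    import detectron2.data.transforms as T

    return [
        T.RandomCrop("relative_range", (0.8, 0.8)),
        T.RandomFlip(prob=0.5, horizontal=True),
        T.RandomBrightness(0.8, 1.2),
        T.RandomContrast(0.8, 1.2),
    ]
```

The dataset name would first have to be registered, for example with `register_coco_instances` from `detectron2.data.datasets`, pointing at one of the COCO JSON files and its image directory.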

Note that all training and validation are performed on RGB images only; the depth channel is not used during these processes. The depth channel is segmented afterwards according to the masks predicted on the RGB images. The datasets do not provide depth images corresponding to the colour images.
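The depth segmentation step can be illustrated with a short NumPy sketch. It assumes the depth image is already aligned pixel-for-pixel with the RGB image and that the predicted masks are available as a boolean array of shape (N, H, W), which is the layout of `pred_masks` in Detectron2 instance outputs. The function names and the zero fill value are choices made for this example.

```python
import numpy as np


def segment_depth(depth, masks, fill=0):
    """Keep depth values inside each predicted mask and set the rest to `fill`.

    depth: (H, W) array aligned with the RGB image (assumption).
    masks: (N, H, W) boolean array, one mask per predicted fruit.
    Returns an (N, H, W) array with one segmented depth map per fruit.
    """
    depth = np.asarray(depth)
    masks = np.asarray(masks, dtype=bool)
    if masks.shape[1:] != depth.shape:
        raise ValueError("masks and depth must share the same height and width")
    return np.where(masks, depth[None, ...], fill)


def median_depth(depth, mask):
    """Median of the valid (non-zero) depth readings inside one mask."""
    depth = np.asarray(depth)
    mask = np.asarray(mask, dtype=bool)
    values = depth[mask & (depth > 0)]
    return float(np.median(values)) if values.size else float("nan")


depth = np.array([[1.0, 2.0], [3.0, 4.0]])
masks = np.array([[[True, False], [False, True]]])
print(segment_depth(depth, masks))
print(median_depth(depth, masks[0]))
```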

Performance Evaluation

The average precision (AP) at different IoU thresholds is given in Table 2.
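For reference, IoU is the ratio of the overlap between a predicted and an annotated region to their combined area. In COCO-style evaluation, AP at IoU=0.50:0.95 is averaged over ten thresholds from 0.50 to 0.95 in steps of 0.05, while AP50 and AP75 use a single threshold. The sketch below computes IoU for binary masks, assuming the reported values are mask AP (the page does not say whether they are mask or box AP).

```python
import numpy as np

# The ten IoU thresholds used by COCO-style AP (0.50, 0.55, ..., 0.95).
COCO_IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def mask_iou(pred, truth):
    """Intersection over union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 0.0  # both masks empty: treat as no overlap
    return float(np.logical_and(pred, truth).sum() / union)


def thresholds_passed(iou):
    """Thresholds at which a prediction with this IoU counts as a match."""
    return [t for t in COCO_IOU_THRESHOLDS if iou >= t]


pred = [[1, 1], [0, 0]]
truth = [[1, 0], [0, 0]]
iou = mask_iou(pred, truth)
print(iou, thresholds_passed(iou))
```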

Strawberry

| Model | Category | AP (IoU=0.50:0.95) | AP50 (IoU=0.50) | AP75 (IoU=0.75) |
| --- | --- | --- | --- | --- |
| Strawberry Fruit | Fruit | 35.04 | 51.88 | 39.14 |
| Strawberry Ripeness | Ripe & Unripe | 32.63 | 52.49 | 35.77 |
| Strawberry Ripeness | Ripe | 26.35 | - | - |
| Strawberry Ripeness | Unripe | 38.91 | - | - |

Tomato

| Model | Category | AP (IoU=0.50:0.95) | AP50 (IoU=0.50) | AP75 (IoU=0.75) |
| --- | --- | --- | --- | --- |
| Tomato Fruit | Fruit | 20.22 | 36.06 | 19.83 |
| Tomato Ripeness | Ripe & Unripe | 19.10 | 33.40 | 19.47 |
| Tomato Ripeness | Ripe | 18.71 | - | - |
| Tomato Ripeness | Unripe | 19.48 | - | - |
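The combined "Ripe & Unripe" AP values are consistent with the mean of the per-class values, which is how COCO-style evaluation aggregates AP across categories. A quick check using the numbers reported above (the helper name `mean_ap` is chosen for this example):

```python
def mean_ap(per_class_ap):
    """Unweighted mean of per-class AP values."""
    values = list(per_class_ap.values())
    return sum(values) / len(values)


# Per-class AP (IoU=0.50:0.95) and the combined value, as reported on this page.
strawberry = {"ripe": 26.35, "unripe": 38.91}
tomato = {"ripe": 18.71, "unripe": 19.48}

for name, per_class, reported in [
    ("strawberry", strawberry, 32.63),
    ("tomato", tomato, 19.10),
]:
    print(name, "mean of classes:", mean_ap(per_class), "reported:", reported)
```

Both means agree with the reported combined values to within rounding of the published figures.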
