Datasets
The following sections describe each of the tasks defined on each dataset. Except for cases where data splits are predefined by the dataset's original authors, we split each dataset into training, validation, and test sets with a 7:1:2 ratio, grouping by patient identifier where available to ensure that no patient appears in more than one set.
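A patient-wise 7:1:2 split can be implemented by assigning each *patient* (rather than each image) to a split, so that all of a patient's images land together. The sketch below is illustrative only, assuming string patient identifiers; hashing into ten buckets is one common deterministic approach, not necessarily the exact procedure used here.

```python
import hashlib

def assign_split(patient_id: str) -> str:
    """Deterministically assign a patient to train/val/test in a 7:1:2 ratio.

    Hashing the patient identifier (not individual images) keeps every image
    from one patient in the same split, preventing patient overlap.
    Illustrative sketch; the authors' actual procedure may differ.
    """
    # Map the patient ID to a stable bucket in [0, 10).
    digest = hashlib.sha256(patient_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 10
    if bucket < 7:
        return "train"   # buckets 0-6: 70%
    if bucket < 8:
        return "val"     # bucket 7: 10%
    return "test"        # buckets 8-9: 20%
```

Because the assignment depends only on the identifier, re-running the split (or adding new images for an existing patient) never moves a patient between sets.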
The Annotated Ultrasound Liver (AUL) dataset (Xu et al., 2023) consists of 735 images, including 435 with malignant masses, 200 with benign masses, and 100 with no masses. Each image is of a different patient, with a mean width of 945.33 px.
We define two segmentation tasks and one classification task on the AUL dataset: a liver segmentation task using the 734 images with liver annotations, a liver mass segmentation task on all 735 images, and a mass classification task that labels each image by the type of mass present (malignant, benign, or none). The liver segmentation task contains one fewer training image than the other tasks due to a missing liver segmentation mask in the source dataset.
The Butterfly dataset (Butterfly Network, 2018) was released for the 2018 MIT Grand Hack. It consists of ultrasound images of multiple body regions acquired from 31 patients using the Butterfly iQ point-of-care ultrasound device. The images are divided into nine groups according to the organ being imaged (Morison's pouch, bladder, heart (PLAX view), heart (4-chamber view), heart (2-chamber view), IVC, carotid artery, lungs, and thyroid). We use these labels to define a nine-class image classification task. In total, the dataset consists of 41,076 images, 34,325 of which are allocated to training and validation, while 6,751 are reserved for testing. The images have an average width of 415.57 px.
The Cardiac Acquisitions for Multi-structure Ultrasound Segmentation (CAMUS) dataset (Leclerc et al., 2018) consists of apical four-chamber and two-chamber view cardiac ultrasound sequences from 500 patients, for a total of 19,232 images. The metadata provided with each image includes segmentation masks for the left ventricular endocardium, the myocardium, and the left atrium. Each image is also labeled according to the quality of the scan (poor, medium, or good). The images have an average width of 597.58 px.
The Dataset of B-mode fatty liver ultrasound images (Byra et al., 2018), referred to simply as the Fatty Liver dataset from here on, contains 550 liver ultrasound images from 55 patients, 38 of whom suffer from non-alcoholic fatty liver disease (NAFLD; defined as fatty infiltration in >5% of hepatocytes). The images all have a resolution of
The Gallbladder Cancer Ultrasound (GBCU) dataset (Basu et al., 2022) contains a total of 1255 annotated abdominal ultrasound images (432 normal, 558 benign, and 265 malignant) collected from 218 patients (71 normal, 100 benign, and 47 malignant). The images have an average width of 1204.95 px.
The Multi-Modality Ovarian Tumor Ultrasound (MMOTU) dataset (Zhao et al., 2023) is an ovarian cancer dataset consisting of 2D ultrasound and contrast-enhanced ultrasonography (CEUS) images. In this case, we use only the 2D ultrasound images. In total, there are 1469 2D ultrasound images with semantic segmentation masks identifying the tumour in each image. In addition, each image is labeled according to the type of tumour present (chocolate cyst, serous cystadenoma, teratoma, theca cell tumour, simple cyst, normal ovary, mucinous cystadenoma, and high grade serous cystadenocarcinoma). This allows us to define two tasks on this dataset: binary tumour segmentation and multi-class tumour type classification. As with the GBCU dataset, this dataset is pre-split into training and testing sets, with 1000 examples collected from 171 patients in the training set and 469 examples collected from 76 patients in the test set. We further split the training set into training and validation sets using an 80:20 split, but since all patient information has been removed from the dataset we cannot ensure that there is no patient overlap between the training and validation splits. The images have an average width of 550.84 px.
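Without patient identifiers, the 80:20 split above can only be drawn over individual images. A minimal, seeded sketch of such a split is shown below; the seed and ordering are assumptions, not the authors' exact procedure.

```python
import random

def split_train_val(example_ids, val_fraction=0.2, seed=0):
    """Seeded 80:20 shuffle split over individual examples.

    Because patient identifiers were removed from the MMOTU release, the
    split is over images, so images from the same patient may appear in
    both sets. Illustrative sketch only; seed choice is an assumption.
    """
    ids = list(example_ids)
    rng = random.Random(seed)   # fixed seed for reproducibility
    rng.shuffle(ids)
    n_val = int(len(ids) * val_fraction)
    return ids[n_val:], ids[:n_val]  # (train, val)
```

Seeding the shuffle makes the split reproducible across runs, which matters when results are compared between models trained on the same partition.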
The Open Kidney Ultrasound dataset (Singla et al., 2023) consists of 514 B-mode kidney ultrasound images, each from a distinct patient. The images are annotated with kidney capsule pixel masks, permitting two separate semantic segmentation tasks: kidney capsule segmentation and a more fine-grained kidney regions segmentation. However, the limited amount of data relative to the complexity of the region segmentation task means that training informative, effective models in this setting is not possible. The images have an average width of 1061.92 px.
The Point-of-care Ultrasound (POCUS) dataset (Born et al., 2021) is a collection of convex and linear probe lung ultrasound images and videos from different sources that was created for the diagnosis of COVID-19. We use the 142 convex probe videos and 29 convex probe images distributed by the authors and follow the procedure described in their original paper to process them, sampling the videos at a rate of 3 Hz, up to a maximum of 30 frames, and grouping the frames by video to prevent data leakage between the train, validation, and test splits. In total, we extract 2726 examples. Each image is labeled by the pathology (healthy, pneumonia, COVID-19), yielding a three-class classification problem. The images have an average width of 499.22 px.
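The 3 Hz sampling with a 30-frame cap can be sketched as an index-selection step: given a video's frame count and native frame rate, pick every k-th frame. This is a hedged reconstruction of the procedure described above; the original authors' code may differ in rounding details.

```python
def sampled_frame_indices(n_frames: int, fps: float,
                          sample_hz: float = 3.0,
                          max_frames: int = 30) -> list:
    """Indices of frames sampled at `sample_hz` from a video recorded at
    `fps`, capped at `max_frames` frames per video.

    Sketch of the POCUS preprocessing described in the text; the exact
    rounding behaviour is an assumption.
    """
    # Number of native frames between consecutive samples (at least 1).
    step = max(1, round(fps / sample_hz))
    indices = list(range(0, n_frames, step))
    return indices[:max_frames]  # cap at 30 frames per video
```

For example, a 10-second clip at 30 fps (300 frames) yields every 10th frame, exactly 30 samples; longer clips are truncated at the 30-frame cap, and shorter clips simply yield fewer samples.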
The PSFHS dataset (Chen et al., 2024) is a dataset for fetal head and pubic symphysis segmentation, comprising 1358 images from 1124 patients. Each image is accompanied by pixel-level segmentation masks for the fetal head and pubic symphysis, supporting a three-class image segmentation task (background, pubic symphysis, and fetal head). The images all have a resolution of
The Stanford Thyroid Ultrasound Cine-clip dataset (Stanford AIMI Center, 2021), referred to simply as the Stanford Thyroid dataset from here on, is a dataset of 192 thyroid nodule ultrasound cine-clips (videos) collected from 167 patients. The images in each sequence are associated with pixel-level nodule segmentation masks, patient demographics, lesion size and location, TI-RADS descriptors, and histopathological diagnoses. We use the nodule masks for a thyroid nodule segmentation task. In total, there are 17,412 images all with a resolution of