This repository contains my work for the medical research scientist test!
You can find the link to the challenge here:
Medical Research Scientist Test
Every DICOM scan image has metadata embedded in it, containing useful information about the image. Pixel spacing, slice thickness, and patient position and orientation are the fields extracted for this section.
The class `dicom_object` generates the outputs for the mean intensity, maximum intensity, and center of the scan.
Its input is the path to the DICOM images folder.
Method descriptions are below:
Method `get_meta`
This method returns the dimensions, pixel spacing, slice spacing and slice thickness.
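For illustration, here is a minimal sketch of how these fields can be read with pydicom. The tag names are standard DICOM attributes; the file name and the fallback for slice spacing are assumptions, not a copy of `get_meta`:

```python
import pydicom

ds = pydicom.dcmread("dicoms/slice_000.dcm")  # hypothetical file name

rows, cols = int(ds.Rows), int(ds.Columns)              # image dimensions
row_spacing, col_spacing = map(float, ds.PixelSpacing)  # mm per pixel
slice_thickness = float(ds.SliceThickness)              # mm
# Slice spacing: either the optional SpacingBetweenSlices tag, or the
# z-distance between ImagePositionPatient of two consecutive slices.
slice_spacing = float(getattr(ds, "SpacingBetweenSlices", slice_thickness))
```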
Method `get_max_intensity`
Returns the maximum intensity of the scan. Hounsfield unit -2048 is removed to account for the scan margins.
Method `get_mean_intensity`
Returns the mean intensity of the scan. Hounsfield unit -2048 is removed to account for the scan margins.
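A minimal sketch of both statistics with the -2048 margin values masked out, assuming `pixel_array` already holds Hounsfield units (otherwise RescaleSlope/RescaleIntercept must be applied first):

```python
import numpy as np

hu = ds.pixel_array.astype(np.int16)  # ds from the pydicom sketch above
valid = hu[hu != -2048]               # drop the padding outside the scan margins
max_intensity = valid.max()
mean_intensity = valid.mean()
```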
Method `get_center`
Returns the center of the scan. It converts the image's ijk coordinates (pixel coordinates) to the native scanner coordinates via an affine transformation. We read the patient position for the middle slice of the stack and carry out the affine transformation according to the following formula:

$$\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = A \begin{bmatrix} i \\ j \\ 0 \\ 1 \end{bmatrix}$$

where $A$ is the affine transformation matrix, $xyz$ are the scanner coordinates, and $ijk$ are the pixel coordinates in the image.
In matrix form for a single slice of a CT scanner:

$$A = \begin{bmatrix} Ori_{1,x}\,Ps_c & Ori_{2,x}\,Ps_r & 0 & Pos_x \\ Ori_{1,y}\,Ps_c & Ori_{2,y}\,Ps_r & 0 & Pos_y \\ Ori_{1,z}\,Ps_c & Ori_{2,z}\,Ps_r & 0 & Pos_z \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

where $Ori_1$ and $Ori_2$ are the row and column direction cosines of the patient orientation, $Pos$ is the patient position in the slice, and $Ps_r$, $Ps_c$ are the row and column pixel spacing.
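A sketch of this affine mapping with NumPy and pydicom; the function name is mine, and PixelSpacing is ordered [row, column] per the DICOM standard:

```python
import numpy as np

def ijk_to_xyz(i, j, ds):
    """Map pixel indices (i = column, j = row) of one slice to scanner
    coordinates using the affine above. `ds` is a pydicom dataset."""
    ori = np.array(ds.ImageOrientationPatient, dtype=float)  # Ori_1, Ori_2
    pos = np.array(ds.ImagePositionPatient, dtype=float)     # Pos
    ps = np.array(ds.PixelSpacing, dtype=float)              # [Ps_r, Ps_c]

    A = np.identity(4)
    A[:3, 0] = ori[:3] * ps[1]  # row direction cosines * column spacing
    A[:3, 1] = ori[3:] * ps[0]  # column direction cosines * row spacing
    A[:3, 3] = pos
    return (A @ np.array([i, j, 0.0, 1.0]))[:3]
```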
The class `mask_object` segments the metal objects per slice and writes the corresponding binary mask files to the masks folder.
CT images in the test are stored with Hounsfield units as intensities. Metal objects have much greater intensities than tissue and bone, so a simple thresholding followed by a morphological opening filter can generate acceptable results. Morphological opening is simply an erosion followed by a dilation, aimed at removing small white spots outside the segmented area.
Link to the page: Morphological opening
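A hedged sketch of the per-slice segmentation described above, using scikit-image; the threshold and structuring-element radius are illustrative assumptions, not the values used in `mask_object`:

```python
import numpy as np
from skimage.morphology import disk, opening

def segment_metal(hu_slice, threshold=2500, radius=3):
    """Threshold a Hounsfield-unit slice, then apply morphological opening
    (erosion followed by dilation) to remove small white spots."""
    binary = hu_slice > threshold
    cleaned = opening(binary, disk(radius))
    return cleaned.astype(np.uint8)
```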
The `unet_object` class is a Keras-based U-Net image segmentation method. U-Net, developed at the University of Freiburg, is a deep convolutional neural network architecture specifically designed for medical image segmentation. It uses an encoder-decoder approach with a symmetrical pathway in which feature maps from the encoder are copied to the corresponding layers of the decoder. Downsampling the images to features and then upsampling them to the masks is the usual pattern in segmentation DNNs such as Mask R-CNN or U-Net.
A schematic view of the U-Net architecture is shown below:
Class methods are:
Method `__init__`
The class constructor. It takes the following arguments (a usage sketch follows the list):
- image_path: path to the DICOM images folder
- masks_path: path to the masks folder
- num_epochs: the number of epochs for training (default = 10)
- batch_size: size of the batch fed to the model per step (default = 15)
- learning_rate: learning rate for the optimizer (default = 1e-4)
- dropout: dropout rate for the network (default = 0.5)
- iou_metric: IOU (intersection over union) metric for image similarity (optional, default = False)
- height: height of the image (default = 512)
- width: width of the image (default = 512)
- dim: image depth (dimensions), (default = 1 for a binary image)
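A hypothetical usage sketch; the module name is an assumption, and the argument values are the defaults listed above:

```python
from unet import unet_object  # hypothetical module name

model = unet_object(
    image_path="dicoms/",
    masks_path="masks/",
    num_epochs=10,
    batch_size=15,
    learning_rate=1e-4,
    dropout=0.5,
    iou_metric=False,
    height=512,
    width=512,
    dim=1,
)
```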
Assuming that an image should be fed to the network as is, I haven't done any preprocessing. The only processing was removing the -2048 intensity values from the image. Contrast stretching, cropping, and other image processing techniques could improve the results.
The optimizer used is the Adam optimizer, and the loss function is a weighted binary cross-entropy function to handle the large class imbalance in the images. A Dice loss function is a good option as well but should be used with class weights.
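A minimal sketch of a weighted binary cross-entropy loss in Keras; the positive-class weight is an illustrative assumption, and the repository's actual weighting may differ:

```python
from tensorflow.keras import backend as K

def weighted_bce(pos_weight=50.0):
    """Binary cross-entropy that upweights the rare foreground (metal) pixels."""
    def loss(y_true, y_pred):
        y_pred = K.clip(y_pred, K.epsilon(), 1.0 - K.epsilon())
        bce = -(pos_weight * y_true * K.log(y_pred)
                + (1.0 - y_true) * K.log(1.0 - y_pred))
        return K.mean(bce)
    return loss

# usage: model.compile(optimizer="adam", loss=weighted_bce())
```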
Method `unet_unit`
Creates the model structure. Refer to the U-Net documentation for details about the U-Net architecture.
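For orientation, here is a compact two-level U-Net sketch in Keras; the depth and filter counts are illustrative assumptions, not the exact `unet_unit` configuration:

```python
from tensorflow.keras import layers, models

def build_unet(height=512, width=512, dim=1, dropout=0.5, filters=16):
    inputs = layers.Input((height, width, dim))

    # Encoder: conv blocks followed by max pooling
    c1 = layers.Conv2D(filters, 3, activation="relu", padding="same")(inputs)
    c1 = layers.Conv2D(filters, 3, activation="relu", padding="same")(c1)
    p1 = layers.MaxPooling2D(2)(c1)

    c2 = layers.Conv2D(filters * 2, 3, activation="relu", padding="same")(p1)
    c2 = layers.Conv2D(filters * 2, 3, activation="relu", padding="same")(c2)
    p2 = layers.MaxPooling2D(2)(c2)

    # Bottleneck with dropout
    b = layers.Conv2D(filters * 4, 3, activation="relu", padding="same")(p2)
    b = layers.Dropout(dropout)(b)

    # Decoder: upsample and concatenate the matching encoder output (skip connection)
    u2 = layers.Conv2DTranspose(filters * 2, 2, strides=2, padding="same")(b)
    c3 = layers.Conv2D(filters * 2, 3, activation="relu", padding="same")(
        layers.concatenate([u2, c2]))

    u1 = layers.Conv2DTranspose(filters, 2, strides=2, padding="same")(c3)
    c4 = layers.Conv2D(filters, 3, activation="relu", padding="same")(
        layers.concatenate([u1, c1]))

    outputs = layers.Conv2D(1, 1, activation="sigmoid")(c4)  # binary mask
    return models.Model(inputs, outputs)
```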
Method `get_augmented_data`
I used data augmentation through this method. As the number of images (200) may not be enough for training a deep CNN, I used the convenient ImageDataGenerator class from Keras to generate augmented data. It streams the generated images in batches while training the model.
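A sketch of paired augmentation with ImageDataGenerator, assuming `images` and `masks` are 4D NumPy arrays of shape (N, 512, 512, 1); the transform parameters are illustrative, and a shared seed keeps images and masks aligned:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

aug = dict(rotation_range=10, width_shift_range=0.05,
           height_shift_range=0.05, zoom_range=0.05)
image_gen = ImageDataGenerator(**aug).flow(images, batch_size=15, seed=42)
mask_gen = ImageDataGenerator(**aug).flow(masks, batch_size=15, seed=42)
train_gen = zip(image_gen, mask_gen)  # yields (image_batch, mask_batch) pairs

model.fit(train_gen, steps_per_epoch=len(images) // 15, epochs=10)
```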
There are many image similarity methods for comparing images, especially in object detection tasks. Euclidean distance is one of them, and for binary images it is equivalent to the sum of squared errors. The Dice index is also a good option, as it accounts for the intersection of the images.
Two methods calculate the image similarity metrics:
- `get_mse(mask_path, out_path)` takes the paths to the real masks folder (`mask_path`) and the generated masks folder (`out_path`) and returns the MSE values across all images.
- `dice(mask, out)` takes single images (`mask` for the real mask, `out` for the generated mask) and returns the Dice index between the two images.
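A minimal NumPy re-implementation sketch of these two metrics, not necessarily identical to the repository's code:

```python
import numpy as np

def dice(mask, out):
    """Dice index between two binary masks; 1.0 means identical."""
    mask, out = mask.astype(bool), out.astype(bool)
    denom = mask.sum() + out.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as a perfect match
    return 2.0 * np.logical_and(mask, out).sum() / denom

def mse(mask, out):
    """Squared error between two images; for binary masks this is the
    number of disagreeing pixels (equivalent to squared Euclidean distance)."""
    return float(np.sum((mask.astype(float) - out.astype(float)) ** 2))
```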
I have only run the network for 5 epochs. In practice it should run for many more, but I had hardware and time limitations.
Pretrained weights for 5 epochs can be found here:

