Semantic Segmentation

Semantic segmentation algorithm attempts to:

Partition the image into meaningful parts
While at the same time, associate every pixel in an input image with a class label (i.e., person, road, car, bus, etc.)

Aim

To implement Semantic segmentation on images and videos using OpenCV and deep learning.

Approach

Semantic segmentation on images

Load the serialized model from disk.The semantic segmentation architecture we’re using here is ENet and it is trained on The Cityscapes Dataset, a semantic, instance-wise, dense pixel annotation of 20-30 classes
Construct a blob so that the input images have a resolution of 1024 x 512.
Set the blob as input to the network and perform a forward pass through the neural network.
Extract volume dimension information from the output of the neural network.
Generate classMap by finding the class label index with the largest probability for each and every pixel of the output image array
Compute color mask from the colors associated with each class label index in the classMap
Resize the mask to match the input image dimensions.
And finally, overlay the mask on the frame transparently to visualise the segmentation.

Semantic segmentation on videos

Load the serialized model from disk.
Open a video stream pointer to input video file on and initialize our video writer object.
Loop over the frames of the video and perform the same steps we performed for images.
Write the output frame to disk.

Output