Semantic Segmentation project

Goals:

  • Implement a Fully Convolutional Network (FCN).
  • Use it to perform semantic segmentation on images, detecting which pixels belong to the road.

Dependencies

Frameworks and Packages

Make sure you have the following installed:

Dataset

Download the KITTI Road dataset from here. Extract the dataset into the data folder. This creates the folder data_road with all the training and test images.

Rubric Points

Build the Neural Network

Does the project load the pretrained vgg model?

It does; see main.py lines 20-42. In particular, we call:

tf.saved_model.loader.load(sess, [vgg_tag], vgg_path)
graph = tf.get_default_graph()

And from the graph we obtain the pretrained tensors by name.
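As a rough sketch of that lookup, the tensors can be fetched with `graph.get_tensor_by_name`. The tensor names below are assumptions (this README does not list them, so check main.py for the ones actually used), and the helper name `load_vgg_tensors` is hypothetical. The sketch uses the `tf.compat.v1` aliases so that it also works under TensorFlow 2.x; under TensorFlow 1.x they correspond to the un-prefixed calls shown above.

```python
# ASSUMPTION: these tensor names are illustrative; main.py defines the real ones.
ASSUMED_TENSOR_NAMES = (
    "image_input:0",  # input image placeholder
    "keep_prob:0",    # dropout keep probability
    "layer3_out:0",   # encoder features used by the first skip connection
    "layer4_out:0",   # encoder features used by the second skip connection
    "layer7_out:0",   # deepest encoder features
)


def load_vgg_tensors(sess, vgg_tag, vgg_path, names=ASSUMED_TENSOR_NAMES):
    """Load the pretrained SavedModel into `sess` and return the named tensors."""
    # Imported lazily so this module can be loaded without TensorFlow installed.
    import tensorflow as tf

    tf.compat.v1.saved_model.loader.load(sess, [vgg_tag], vgg_path)
    graph = tf.compat.v1.get_default_graph()
    # Each name has the form "<op_name>:<output_index>".
    return tuple(graph.get_tensor_by_name(name) for name in names)
```

Returning a tuple keeps the call site short: the caller can unpack the input, keep-probability and feature tensors in one assignment.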

Does the project learn the correct features from the images?

It does; see main.py lines 46-63. The network reuses the pretrained VGG layers and adds upsampling layers on top, following the model described in the paper Fully Convolutional Networks for Semantic Segmentation.
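To make the upsampling path concrete, the sketch below traces feature-map sizes through an FCN-8s style decoder, one of the variants described in the paper. It is an assumption that main.py follows this variant, and the 160x576 input size and the function name are illustrative only. The sketch also assumes that the three VGG feature maps sit at 1/8, 1/16 and 1/32 of the input resolution, and that each transposed convolution uses "same" padding, so it multiplies height and width by its stride.

```python
def fcn8_decoder_shapes(height, width):
    """Trace (height, width) through an FCN-8s style upsampling path.

    Assumes encoder feature maps at 1/8, 1/16 and 1/32 of the input size and
    transposed convolutions with 'same' padding (output size = input * stride).
    """
    if height % 32 or width % 32:
        raise ValueError("this sketch assumes sizes divisible by 32")

    pool3 = (height // 8, width // 8)
    pool4 = (height // 16, width // 16)
    conv7 = (height // 32, width // 32)

    def upsample(shape, stride):
        return (shape[0] * stride, shape[1] * stride)

    up1 = upsample(conv7, 2)   # 1/32 -> 1/16
    assert up1 == pool4        # sizes must match for the first skip addition
    up2 = upsample(up1, 2)     # 1/16 -> 1/8
    assert up2 == pool3        # sizes must match for the second skip addition
    output = upsample(up2, 8)  # 1/8 -> full resolution
    return {"conv7": conv7, "up1": up1, "up2": up2, "output": output}


shapes = fcn8_decoder_shapes(160, 576)
print(shapes)
```

The two internal assertions are the point of the exercise: a skip connection is an element-wise addition, so the upsampled map and the encoder map it is added to must have the same spatial size.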

Does the project optimize the neural network?

It does; see main.py lines 67-81. We compute the softmax cross entropy between the logits and the labels, and use an Adam optimizer to minimize the cross-entropy loss.
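The loss can be written out by hand for a handful of pixels. The sketch below is plain Python rather than the TensorFlow call in main.py; the function names are illustrative, and averaging over pixels is an assumption about how the per-pixel losses are reduced. It uses the log-sum-exp form so that large logits do not overflow. The Adam update itself is not shown.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one pixel's class logits."""
    m = max(logits)
    log_total = math.log(sum(math.exp(x - m) for x in logits))
    return [x - m - log_total for x in logits]


def softmax_cross_entropy(logits, one_hot):
    """Cross entropy between softmax(logits) and a one-hot label."""
    return -sum(t * lp for t, lp in zip(one_hot, log_softmax(logits)))


def mean_pixel_loss(pixel_logits, pixel_labels):
    """Average the per-pixel cross entropy (the reduction is an assumption)."""
    losses = [softmax_cross_entropy(l, t)
              for l, t in zip(pixel_logits, pixel_labels)]
    return sum(losses) / len(losses)


# Two classes per pixel: [not road, road].
undecided = softmax_cross_entropy([0.0, 0.0], [0.0, 1.0])
confident = softmax_cross_entropy([0.0, 10.0], [0.0, 1.0])
print(undecided, confident)
```

An undecided pixel (equal logits) costs ln 2, while a confidently correct pixel costs close to zero, which is why the loss falls as the network becomes more certain about road pixels.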

Does the project train the neural network?

Yes; see main.py lines 85-109. Training runs for the specified number of epochs, using the batch_size parameter to draw batches of training data. The loss is printed after each batch is processed.
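The shape of that loop can be sketched without TensorFlow. Everything below is illustrative: `batches`, `train` and `step_fn` are hypothetical names, `step_fn` stands in for whatever main.py does to run one optimization step and return the batch loss, and shuffling the data each epoch is an assumption.

```python
import random


def batches(samples, batch_size, rng):
    """Yield the samples in shuffled batches; the last batch may be smaller."""
    order = list(range(len(samples)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        yield [samples[i] for i in order[start:start + batch_size]]


def train(samples, epochs, batch_size, step_fn, seed=0):
    """Run `step_fn` on every batch of every epoch and print each batch loss."""
    rng = random.Random(seed)
    history = []
    for epoch in range(epochs):
        for index, batch in enumerate(batches(samples, batch_size, rng)):
            loss = step_fn(batch)  # stand-in for one optimizer step
            print(f"epoch {epoch + 1}/{epochs} batch {index + 1} loss {loss:.4f}")
            history.append(loss)
    return history


# Dummy step function that reports the batch size as the "loss".
history = train(list(range(30)), epochs=2, batch_size=12,
                step_fn=lambda batch: float(len(batch)))
```

With 30 samples and a batch size of 12, each epoch yields two full batches and one partial batch of 6, so the loss is printed three times per epoch.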

Neural Network Training

Does the project train the model correctly?

On average, the loss decreases over time. The gains are large early in training and level off by around epoch 200.

Does the project use reasonable hyperparameters?

I found that the cross-entropy loss doesn't decrease much after 200 or so epochs, by which point it sits between 0.05 and 0.1. As seen in main.py lines 119-120, I left the code running for 400 epochs with a batch size of 12. This batch size worked well given the memory constraints of my GPU.
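One way to turn "doesn't decrease much" into a check is to compare the mean loss over consecutive windows of epochs. This is not part of main.py; the function name, window size and threshold below are all hypothetical, and the input is assumed to be a list with one mean loss value per epoch.

```python
def plateau_epoch(epoch_losses, window=10, min_improvement=0.01):
    """Return the first 1-based epoch at which the loss has plateaued.

    The loss counts as plateaued when the mean over the latest `window` epochs
    is less than `min_improvement` below the mean over the `window` epochs
    before that. Returns None if this never happens.
    """
    for end in range(2 * window, len(epoch_losses) + 1):
        previous = sum(epoch_losses[end - 2 * window:end - window]) / window
        recent = sum(epoch_losses[end - window:end]) / window
        if previous - recent < min_improvement:
            return end
    return None
```

Averaging over a window rather than comparing single epochs makes the check less sensitive to batch-to-batch noise in the printed loss.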

Does the project correctly label the road?

The network appears to correctly identify the road in all pictures, with very little bleeding into non-road areas, as seen in the following examples.

Example 1:

[Image: Example 1]

Example 2:

[Image: Example 2]

Example 3:

[Image: Example 3]