The objective of this project is to label the pixels of a road in images using a completely convolutional network (FCN)
For this, it is necessary to have installed the following frameworks and packages:
Download the Kitti Road dataset from here. Extract the dataset in the data folder. This will create the folder data_road with all the training a test images.
Run the following command to run the project:
python main.py
Note If running this in Jupyter Notebook system messages, such as those regarding test status, may appear in the terminal rather than the notebook.
- The link for the frozen
VGG16model is hardcoded intohelper.py. The model can be found here - The model is not vanilla
VGG16, but a fully convolutional version, which already contains the 1x1 convolutions to replace the fully connected layers. Please see this forum post for more information. A summary of additional points, follow. - The original FCN-8s was trained in stages. The authors later uploaded a version that was trained all at once to their GitHub repo. The version in the GitHub repo has one important difference: The outputs of pooling layers 3 and 4 are scaled before they are fed into the 1x1 convolutions. As a result, some students have found that the model learns much better with the scaling layers included. The model may not converge substantially faster, but may reach a higher IoU and accuracy.
- When adding l2-regularization, setting a regularizer in the arguments of the
tf.layersis not enough. Regularization loss terms must be manually added to your loss function. otherwise regularization is not implemented.