Convolutional Neural Networks Labs

The objective of this lab is to give you hands-on experience designing a convolutional neural network (CNN). These instructions will provide you with a basic CNN architecture, and provide you with default training settings. This sample model has a mediocre performance. Your job is to use your knowledge about model capacity, under/over fitting, and deep network architectures to improve the performance, i.e., reduce the testing error.

Setup your development environment

The code for this lab is in Python and uses the TensorFlow library. These instructions will teach you how to edit and run the code in VS-Code, a commonly used coding editor with debugger, on your computer. You may use any editor and debugger that you like (e.g., Juptyer notebooks, Google Colab, Kagle, etc). However, the instructions will assume you are using vs-code, and the instructor will be unable to provide help for different editors.

Install Python

Download Python and follow the installation instructions for your OS

Install VS-Code

Download VS-Code and follow the installation instructions for your OS.

Install the Python Extension in VS-Code

Open VS-Code
Click on the Extension Icon or use the command Ctrl+Shift+X
Type "python" in the search box
Click on the extension called Python published by Microsoft
Click Install

Additional Resources

If you need more details on how to install Python and the Python extension, please read these guide

Getting the code

After you have installed Python and the Python extension, download the code, save it to a new folder, and then open the folder in VS-Code.

Click on the green button with the Code label on the top right of this page.
Click on Download Zip

Alternatively, you can use git. You can also copy and paste manually from the individual files.

Open the folder in VS-Code

Open VS-code
Click on the Explorer icon or press Ctrl+Shift+E
Click on Open Folder
Select and open the cnn-lab folder.
You should see the lab files displayed on the file explorer column.

Code test drive

Before we start tinkering with the code, let's take a look at the various files, understand what they do, and make sure they work.

The `data-exploration.py` file

This file downloads the CIFAR10 dataset and saves a sample of 25 images. The CIFAR10 dataset is commonly used to measure the performance of a CNN. As with any data science project, we start by doing some data exploration. In this case, we are just going to take a look at a few images.

To run the file,

Open the data-exploration.py file.
Press Ctrl+Shift+D to open the debug icon and then click Run and debug.
Alternatively, you can press Ctrl+F5 to directly run the file.

To open the sample images,

Take a look at line 44.
Open the file with the name specified in line 44.

The `build-cnn-model.py` file

This file builds a CNN model and saves it to a file. This is an important file as all the design choices about the architecture of the model are coded here.

Let's run the file and see the model that it creates.

Run the file by pressing Ctrl+F5
Open the file with the name specified in line 70

Questions

Describe the architecture of the CNN. Provide details about the layers' size, type, and number. Also provide details about the activation functions used.

Answer: We can observe that the CNN is formed by 6 layers.

Layer 1: This is the input layer. It doesn't apply any processing but it tells our model the shape of the inputs. In this case, we have an input with shape None, 32, 32, 3. This means the images are 32 by 32 pixels and have three channels one for each primary color. None allows the model to take any number of images at a time. Recall that we can process multiple inputs simultaneously. We call multiple inputs a batch.

Layer 2: This is the convolutional layer. It takes an input of size None, 32, 32, 3, and outputs a tensor of size None, 30, 30, 32. The input is no longer an image that makes sense to the human eye.

Layer 3: This is the maxpooling layer. Recall that convolutional layers are usually paired with a pooling layer. In this case, we are using maxpooling. The layer reduces the size of the tensor to None, 15, 15, 32.

Layer 4: Next, we add a flatten layer. So far, we have been dealing with tensors (i.e., three-dimensional matrices). Since we are interested in classifying the images, we eventually want to have a vector of neurons of size 10, which is the number of classes. However, instead of jumping to a vector of size 10 we add an additional layer. This increases the model's capacity. The output of this layer is None, 7200. This layer doens't change the values, it simply "rolls out" the input. To see this we can multiply the dimensions of the input: 15x15x32=7200.

Layer 5: This is a dense layer. The dense layer is a traditional feedforward layer where all inputs are connected to all outputs. We set the size of the output to 64.

Layer 6: This layer does the final prediction. It outputs 10 probabilities. Each probability corresponds to one class. Since we are using the SparseCategoricalCrossEntropy, TensorFlow takes a softmax operation to find the largest probability, and the compares that to the label.

The convolution and dense layers are both using the rectilinear activation function.

Here's the architecture figure you should be seeing
Build a two-column table. The first row should have the lines of code that define the layers. The second column should have a screen shot of the corresponding layer in the architecture figure.

Answer: Layer 1 and Layer 2 are defined by model.add(tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3))) Layer 3 is defined by model.add(tf.keras.layers.MaxPooling2D((2, 2))) Layer 4 is defined by model.add(tf.keras.layers.Flatten()) Layer 5 is defined by model.add(tf.keras.layers.Dense(64, activation='relu')) Layer 6 is defined by model.add(tf.keras.layers.Dense(10))
1. Draw the input and output tensors and label its sizes for each layer. You should be able to this based on the answer to the first question.

The `training.py` file

Now that we have defined the architecture of the model, we are ready to train it. The training algorithm and parameter choices are coded in this file. The file also runs trains the model and then saves it to a file.

Lets run the file to train the model.

Press Ctrl+F5

Questions

Which variables in the code are used as the training dataset by the function fit?

Answer: train_images, train_labels

Which variables in the code are used as the testing dataset by the function fit?

Answer: test_images, test_labels 3. Is accuracy a good performance measure? Why?

Answer: Accuracy is a good performance measure as long as there is a roughly equal number of *training and testing sample for each category. The accuracy is calculates as correctly predicated sample divided by the total number of predictions. However, if one of the categories has many more samples, then even if we have a poor model that always predicts the dominant category, we would still have high accuracy.

Is categorical cross-entropy the appropriate loss function? Why? Yes. The reason is that we have multiple categories.

The `analysis.py` file

Now that we have a trained model, we can observe the training error, testing error, and accuracy.

Run the file to generate the plots

Press Ctrl+F5
Open the accuracy plot (file name specified in line 22)
Open the loss plot (file name in line 35)

Questions

What is difference between validation and testing in this code?

Answer. The difference between validation and testing is that the data that we are using for testing is not used for training, and vice versa.

What is the difference between training/testing loss and training/testing accuracy?

Answer. The training loss and accuracy are calcualted when the model predicts the label for samples in the training dataset. The testing accuracy and loss are calculated when the model predicts labels in the testing data set.

We are using the categorical cross-entropy (CCE) to measure the loss. Would a larger CCE or smaller CCE result in a lower testing error?

Answer. Our testing error is equal to the CCE itself. So, lower CCE means lower testing error.

What is the accuracy of the model?

Answer. After one epoch, the accuracy is about 0.55. You can see this value in the cnn-training-history.csv file.

Has the training/testing accuracy converged in the accuracy plot?

Answer. We cannot tell because we only had one epoch.

Has the training/testing loss converged in the loss plot?

Answer. We cannot tell because we only had one epoch.

Did the difference between the training and testing loss/accuracy decreased or
increased across epochs? If it increased and decreased in the plot specify in which epochs it increased/decreased?

Answer. We cannot tell because we only had one epoch.

Give the potential reasons that can explain the low accuracy of the model (anything below 90% is considered low)

Answer. Low accuracy can be due to low model capacity, over or under fitting, lack of enough data, too few epochs, incorrect learning rate, etc.

Updating the code to improve performance.

Now that you have a trained model, lets see if we can improve it. We can change the training parameters and the model itself.

Updating the Training Parameters

Based on your answers to Q.5 in the analysis.py file section, make changes to the training parameters to improve the testing accuracy and loss.
Describe your changes.

Answer. There are multiple possible changes to improve the accuracy. The first change that I made was to increase the epohcs parameter in the model.compile function from 1 to 10. After, changing the epochs to a larger value, we can observe the accuracy and loss plots and give a better answer to the questions above. These are the plots I obtained after running 10 epochs.

From these plots, we can observe that the testing error starts to increase around 8 epochs. If we choose, to stay with this model, we would stop the training at 8 epochs. We also see that the accuracy maxes out at around 60%.

Lets see if we can do better. Next, I added a convolutional layer to my network by adding the following lines of code after the original convolutional layer. model.add(tf.keras.layers.Conv2D(32, (3, 3), activation='relu')) model.add(tf.keras.layers.MaxPooling2D((2, 2)))

I also increased the epochs from 10 to 12 in case the larger model needs more training.

These are the plots I obtained with the new model.

We see that the we can obtained an accuracy of about 70%, which is a 10% increase from the previous model.

We can continue extending the network with more layers to improve the testing accuracy.

Run the training.py and analysis.py files with your new parameters.
Did the testing performance improve? Explain why or why not.

Answer The model performance improved with the model that had an extra convolutional layer. The reason is that the capacity of the model, i.e., the total number of functions as well as their complexity from which it can choose to model the training data, is larger. This allows the model to find a more appropriate function.

CNN Model updates

Based on your answers to Q.5 in the analysis.py file section, make changes to the CNN architecture.
Describe your changes.
Run the training.py and analysis.py files with your new parameters.
Did the testing performance improve? Explain why or why not.

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
README.md		README.md
accuracy-10.png		accuracy-10.png
accuracy-12.png		accuracy-12.png
analysis.py		analysis.py
build-cnn-model.py		build-cnn-model.py
data-exploration.py		data-exploration.py
license.txt		license.txt
loss-10.png		loss-10.png
loss-12.png		loss-12.png
training.py		training.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Convolutional Neural Networks Labs

Setup your development environment

Install Python

Install VS-Code

Install the Python Extension in VS-Code

Additional Resources

Getting the code

Open the folder in VS-Code

Code test drive

The `data-exploration.py` file

The `build-cnn-model.py` file

Questions

The `training.py` file

Questions

The `analysis.py` file

Questions

Updating the code to improve performance.

Updating the Training Parameters

CNN Model updates

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Convolutional Neural Networks Labs

Setup your development environment

Install Python

Install VS-Code

Install the Python Extension in VS-Code

Additional Resources

Getting the code

Open the folder in VS-Code

Code test drive

The data-exploration.py file

The build-cnn-model.py file

Questions

The training.py file

Questions

The analysis.py file

Questions

Updating the code to improve performance.

Updating the Training Parameters

CNN Model updates

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

The `data-exploration.py` file

The `build-cnn-model.py` file

The `training.py` file

The `analysis.py` file

Packages