The objective of this lab is to give you hands-on experience designing a convolutional neural network (CNN). These instructions provide a basic CNN architecture and default training settings. This sample model has mediocre performance. Your job is to use your knowledge about model capacity, underfitting and overfitting, and deep network architectures to improve the performance, i.e., reduce the testing error.
The code for this lab is in Python and uses the TensorFlow library. These instructions will teach you how to edit and run the code in VS-Code, a commonly used code editor with a debugger, on your computer. You may use any editor and debugger that you like (e.g., Jupyter notebooks, Google Colab, Kaggle, etc.). However, the instructions will assume you are using VS-Code, and the instructor will be unable to provide help for different editors.
Download Python and follow the installation instructions for your OS
Download VS-Code and follow the installation instructions for your OS.
- Open VS-Code
- Click on the Extensions icon or press `Ctrl+Shift+X`
- Type "python" in the search box
- Click on the extension called `Python` published by Microsoft
- Click Install
If you need more details on how to install Python and the Python extension, please read this guide.
After you have installed Python and the Python extension, download the code, save it to a new folder, and then open the folder in VS-Code.
- Click on the green button with the `Code` label on the top right of this page.
- Click on `Download ZIP`.
Alternatively, you can use git. You can also copy and paste manually from the individual files.
- Open VS-Code
- Click on the Explorer icon or press `Ctrl+Shift+E`
- Click on `Open Folder`
- Select and open the `cnn-lab` folder.
- You should see the lab files displayed in the file explorer column.
Before we start tinkering with the code, let's take a look at the various files, understand what they do, and make sure they work.
The `data-exploration.py` file downloads the CIFAR10 dataset and saves a sample of 25 images. The CIFAR10 dataset is commonly used to measure the performance of a CNN. As with any data science project, we start by doing some data exploration. In this case, we are just going to take a look at a few images.
To run the file,
- Open the `data-exploration.py` file.
- Press `Ctrl+Shift+D` to open the debug view and then click `Run and Debug`.
- Alternatively, you can press `Ctrl+F5` to directly run the file.
To open the sample images,
- Take a look at line 44.
- Open the file with the name specified in line 44.
This file builds a CNN model and saves it to a file. This is an important file as all the design choices about the architecture of the model are coded here.
Let's run the file and see the model that it creates.
- Run the file by pressing `Ctrl+F5`
- Open the file with the name specified in line 70
- Describe the architecture of the CNN. Provide details about the layers' size, type, and number. Also provide details about the activation functions used.
Answer: We can observe that the CNN is formed by 6 layers.
Layer 1: This is the input layer. It doesn't apply any processing, but it tells our model the shape of the inputs. In this case, we have an input with shape None, 32, 32, 3. This means the images are 32 by 32 pixels and have three channels, one for each primary color. None allows the model to take any number of images at a time. Recall that we can process multiple inputs simultaneously; we call such a group of inputs a batch.
Layer 2: This is the convolutional layer. It takes an input of size None, 32, 32, 3, and outputs a tensor of size None, 30, 30, 32. The 3x3 filters shrink each spatial dimension from 32 to 30, and the 32 filters produce 32 output channels. The output is no longer an image that makes sense to the human eye.
Layer 3: This is the maxpooling layer. Recall that convolutional layers are usually paired with a pooling layer. In this case, we are using maxpooling. The layer reduces the size of the tensor to None, 15, 15, 32.
Layer 4: Next, we add a flatten layer. So far, we have been dealing with multi-dimensional tensors. Since we are interested in classifying the images, we eventually want a vector of size 10, which is the number of classes, so we first need to turn the tensor into a vector. The output of this layer is None, 7200. This layer doesn't change the values, it simply "rolls out" the input. To see this, we can multiply the dimensions of the input: 15x15x32=7200.
Layer 5: This is a dense layer. The dense layer is a traditional feedforward layer where all inputs are connected to all outputs. Instead of jumping directly from 7200 values to a vector of size 10, we add this additional layer with an output of size 64, which increases the model's capacity.
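Layer 6: This layer makes the final prediction. It outputs 10 values, one per class. Because the layer is defined without an activation function, these values are raw scores (logits) rather than probabilities. When the `SparseCategoricalCrossentropy` loss is configured with from_logits=True, TensorFlow applies a softmax operation to convert the scores into probabilities before comparing them to the label. The class with the largest score is the predicted class.

The convolutional layer and the first dense layer both use the rectified linear unit (ReLU) activation function.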
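The layer sizes described in this answer can be checked by hand. The following is a minimal sketch in plain Python (no TensorFlow needed), assuming the Keras defaults of stride 1 and no padding for the convolution, and a pooling stride equal to the pool size. The helper names `conv_output_size` and `pool_output_size` are hypothetical and are not part of the lab code.

```python
def conv_output_size(size, kernel, stride=1):
    """Output width/height of a convolution with 'valid' (no) padding."""
    return (size - kernel) // stride + 1


def pool_output_size(size, pool):
    """Output width/height of a pooling layer whose stride equals its pool size."""
    return size // pool


# Layer 1: input image, 32x32 pixels with 3 color channels.
size, channels = 32, 3

# Layer 2: Conv2D(32, (3, 3)) -> 32 filters with a 3x3 kernel.
filters = 32
conv_size = conv_output_size(size, kernel=3)
conv_params = (3 * 3 * channels + 1) * filters  # kernel weights + 1 bias per filter

# Layer 3: MaxPooling2D((2, 2)) has no trainable parameters.
pool_size = pool_output_size(conv_size, pool=2)

# Layer 4: Flatten only reshapes the tensor into a vector.
flat_size = pool_size * pool_size * filters

# Layers 5 and 6: Dense(64) and Dense(10), weights + 1 bias per output.
dense64_params = (flat_size + 1) * 64
dense10_params = (64 + 1) * 10

print("conv output shape:", (conv_size, conv_size, filters))
print("pool output shape:", (pool_size, pool_size, filters))
print("flatten output size:", flat_size)
print("trainable parameters:", conv_params + dense64_params + dense10_params)
```

Under these assumptions, the printed shapes match the 30, 15, and 7200 values given in the answer, and almost all of the trainable parameters sit in the first dense layer.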
Here's the architecture figure you should be seeing
- Build a two-column table. The first column should have the lines of code that define the layers. The second column should have a screenshot of the corresponding layer in the architecture figure.
Answer:
Layer 1 and Layer 2 are defined by `model.add(tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))`
Layer 3 is defined by `model.add(tf.keras.layers.MaxPooling2D((2, 2)))`
Layer 4 is defined by `model.add(tf.keras.layers.Flatten())`
Layer 5 is defined by `model.add(tf.keras.layers.Dense(64, activation='relu'))`
Layer 6 is defined by `model.add(tf.keras.layers.Dense(10))`
- Draw the input and output tensors and label their sizes for each layer. You should be able to do this based on the answer to the first question.
Now that we have defined the architecture of the model, we are ready to train it. The training algorithm and parameter choices are coded in the `training.py` file. The file also trains the model and then saves it to a file.
Let's run the file to train the model.
- Press `Ctrl+F5`
- Which variables in the code are used as the training dataset by the function `fit`?
Answer: train_images, train_labels
- Which variables in the code are used as the testing dataset by the function `fit`?
Answer: test_images, test_labels
- Is accuracy a good performance measure? Why?
Answer: Accuracy is a good performance measure as long as there is a roughly equal number of training and testing samples for each category. The accuracy is calculated as the number of correctly predicted samples divided by the total number of predictions. However, if one of the categories has many more samples than the others, then even a poor model that always predicts the dominant category would still have high accuracy.
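The effect of class imbalance on accuracy can be illustrated with a small sketch in plain Python. The dataset below is hypothetical (90 samples of one class and 10 of another) and is not CIFAR10, which has the same number of images in every class. The `accuracy` helper is also an illustration and is not part of the lab code.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical imbalanced test set: 90 samples of class 0 and 10 of class 1.
labels = [0] * 90 + [1] * 10

# A "model" that ignores its input and always predicts the dominant class.
always_majority = [0] * len(labels)

overall = accuracy(always_majority, labels)

# Accuracy restricted to the samples of the minority class.
minority_pairs = [(p, y) for p, y in zip(always_majority, labels) if y == 1]
minority = accuracy([p for p, _ in minority_pairs], [y for _, y in minority_pairs])

print(f"overall accuracy: {overall:.2f}")
print(f"minority-class accuracy: {minority:.2f}")
```

The overall accuracy looks high even though the model never identifies a single sample of the minority class, which is why accuracy alone can be misleading on imbalanced data.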
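- Is categorical cross-entropy the appropriate loss function? Why?
Answer: Yes. The reason is that this is a classification problem with multiple mutually exclusive categories, and categorical cross-entropy measures how much probability the model assigns to the correct category.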
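To make the loss concrete, here is a minimal sketch in plain Python of how a cross-entropy loss with integer labels can be computed for a single sample: apply a softmax to the raw scores and take the negative log of the probability given to the correct class. The function names and the three-class scores are hypothetical, chosen only for illustration; this is not TensorFlow's implementation.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_cross_entropy(logits, label):
    """Loss for one sample whose label is an integer class index."""
    probabilities = softmax(logits)
    return -math.log(probabilities[label])


# Hypothetical scores for a 3-class problem; the true class is index 0.
examples = {
    "confident and right": [4.0, 0.5, 0.1],
    "unsure": [1.0, 1.0, 1.0],
    "confident and wrong": [0.1, 4.0, 0.5],
}

for name, logits in examples.items():
    loss = sparse_categorical_cross_entropy(logits, label=0)
    print(f"{name}: loss = {loss:.3f}")
```

The loss is small when the model puts most of the probability on the correct class and grows as that probability shrinks, so a lower loss corresponds to better predictions.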
Now that we have a trained model, we can observe the training error, testing error, and accuracy.
Run the file to generate the plots.
- Press `Ctrl+F5`
- Open the accuracy plot (file name specified in line 22)
- Open the loss plot (file name specified in line 35)
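- What is the difference between validation and testing in this code?
Answer. In this code, they refer to the same data: the testing dataset (test_images, test_labels) is the one passed to `fit` for evaluation after each epoch, and no separate validation dataset is held out. The data that we are using for testing is not used for training, and vice versa.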
- What is the difference between training/testing loss and training/testing accuracy?
Answer. The training loss and accuracy are calculated when the model predicts the labels for samples in the training dataset. The testing loss and accuracy are calculated when the model predicts the labels for samples in the testing dataset.
- We are using the categorical cross-entropy (CCE) to measure the loss. Would a larger CCE or smaller CCE result in a lower testing error?
Answer. Our testing error is equal to the CCE itself. So, lower CCE means lower testing error.
- What is the accuracy of the model?
Answer. After one epoch, the accuracy is about 0.55. You can see this value in the `cnn-training-history.csv` file.
- Has the training/testing accuracy converged in the accuracy plot?
Answer. We cannot tell because we only had one epoch.
- Has the training/testing loss converged in the loss plot?
Answer. We cannot tell because we only had one epoch.
- Did the difference between the training and testing loss/accuracy decrease or increase across epochs? If it both increased and decreased in the plot, specify in which epochs it increased or decreased.
Answer. We cannot tell because we only had one epoch.
- Give the potential reasons that can explain the low accuracy of the model (anything below 90% is considered low)
Answer. Low accuracy can be due to low model capacity, overfitting or underfitting, insufficient data, too few epochs, a poorly chosen learning rate, etc.
Now that you have a trained model, let's see if we can improve it. We can change the training parameters and the model itself.
- Based on your answers to Q.5 in the `analysis.py` file section, make changes to the training parameters to improve the testing accuracy and loss.
- Describe your changes.
Answer. There are multiple possible changes to improve the accuracy. The first change that I made was to increase the epochs parameter in the model.fit function from 1 to 10. After changing the epochs to a larger value, we can observe the accuracy and loss plots and give a better answer to the questions above. These are the plots I obtained after running 10 epochs.
From these plots, we can observe that the testing error starts to increase at around 8 epochs. If we choose to stay with this model, we would stop the training at 8 epochs. We also see that the accuracy maxes out at around 60%.
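Choosing where to stop can also be done numerically instead of by eye. The sketch below, in plain Python, picks the epoch with the lowest testing loss and reports the gap between testing and training loss at each epoch. The loss values are invented for illustration only (they are not the results of this lab), and the helper names are hypothetical.

```python
# Hypothetical per-epoch losses, invented for illustration only.
train_loss = [1.55, 1.20, 1.05, 0.95, 0.87, 0.80, 0.74, 0.69, 0.64, 0.60]
test_loss = [1.30, 1.15, 1.08, 1.02, 0.99, 0.97, 0.96, 0.95, 0.97, 1.00]


def best_epoch(losses):
    """Return the 1-based epoch with the lowest loss (first one on ties)."""
    return min(range(len(losses)), key=losses.__getitem__) + 1


def loss_gaps(train, test):
    """Testing loss minus training loss for each epoch."""
    return [te - tr for tr, te in zip(train, test)]


stop_at = best_epoch(test_loss)
gaps = loss_gaps(train_loss, test_loss)

print("epoch with lowest testing loss:", stop_at)
for epoch, gap in enumerate(gaps, start=1):
    print(f"epoch {epoch:2d}: testing - training loss = {gap:+.2f}")
```

A gap that keeps growing while the testing loss stops improving is a typical sign of overfitting. Keras also provides the `tf.keras.callbacks.EarlyStopping` callback, which can stop training automatically based on a monitored metric.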
Let's see if we can do better. Next, I added a convolutional layer to my network by adding the following lines of code after the original convolutional layer:
model.add(tf.keras.layers.Conv2D(32, (3, 3), activation='relu'))
model.add(tf.keras.layers.MaxPooling2D((2, 2)))
I also increased the epochs from 10 to 12 in case the larger model needs more training.
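The effect of the extra layers on the tensor sizes can be traced with a short sketch in plain Python. It assumes that the two new lines are placed after the original Conv2D and MaxPooling2D pair (the exact position is an assumption), and that the Keras defaults of stride 1 and no padding apply. The helper names are hypothetical.

```python
def conv_valid(size, kernel=3):
    """Output width/height of a stride-1 convolution without padding."""
    return size - kernel + 1


def max_pool(size, pool=2):
    """Output width/height of a pooling layer whose stride equals its pool size."""
    return size // pool


size, channels = 32, 3
trace = [("input", size, channels)]

# Two identical blocks: Conv2D(32, (3, 3)) followed by MaxPooling2D((2, 2)).
for block in (1, 2):
    size, channels = conv_valid(size), 32
    trace.append((f"conv {block}", size, channels))
    size = max_pool(size)
    trace.append((f"pool {block}", size, channels))

flat_size = size * size * channels

for name, side, depth in trace:
    print(f"{name:7s} -> {side} x {side} x {depth}")
print(f"flatten -> {flat_size}")
```

Under these assumptions, the second block shrinks the feature maps further, so the flatten layer passes far fewer values to the dense layer than the 7200 of the original model.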
These are the plots I obtained with the new model.

We see that we can obtain an accuracy of about 70%, which is an increase of about 10 percentage points over the previous model.
We can continue extending the network with more layers to improve the testing accuracy.
- Run the `training.py` and `analysis.py` files with your new parameters.
- Did the testing performance improve? Explain why or why not.
Answer. The testing performance improved with the model that had an extra convolutional layer. The reason is that the capacity of the model is larger: it can choose from a larger and more complex set of functions to model the training data. This allows the model to find a more appropriate function.
- Based on your answers to Q.5 in the `analysis.py` file section, make changes to the CNN architecture.
- Describe your changes.
- Run the `training.py` and `analysis.py` files with your new parameters.
- Did the testing performance improve? Explain why or why not.


