
CIFAR-10 Classification with Python – III

Let’s continue our experiments. From the previous ones, we learnt that dropouts are quite useful, batch normalisation helps improve performance somewhat, and L2 regularisation is not very useful on its own (i.e. without dropouts).

Let’s now conduct an experiment with all of these combined, and then another one to test whether adding a new convolutional layer improves performance.

Experiment-V: Dropouts after conv layer, L2 in FC, use BN after convolutional layer
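
The course does not reproduce the full model at this point, but a minimal sketch of this configuration in Keras is given below. The filter counts, dropout rates, dense-layer width and L2 strength are assumptions for illustration, not the course’s exact values.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, MaxPooling2D, BatchNormalization,
                                     Dropout, Flatten, Dense)
from tensorflow.keras import regularizers

model = Sequential()

# Convolutional unit: two Conv2D layers followed by max pooling,
# with batch normalisation and dropout applied after the unit
model.add(Conv2D(32, (3, 3), padding='same', activation='relu',
                 input_shape=(32, 32, 3)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())  # BN after the conv unit
model.add(Dropout(0.25))         # dropout after the conv unit

# (a second unit with 64 filters would follow the same pattern)

# Fully connected head with L2 regularisation on the dense layer
model.add(Flatten())
model.add(Dense(256, activation='relu',
                kernel_regularizer=regularizers.l2(0.01)))  # L2 in FC
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))  # 10 CIFAR-10 classes
```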

Experiment-VI: Add a new convolutional layer to the network. Note that by a ‘convolutional layer’, the professor is referring to a convolutional unit made up of two Conv2D layers with 128 filters each (we are abusing the terminology a bit here). The code for the additional conv layer is shown below.
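
The original code block does not appear here, so the following is a minimal sketch of such a unit, appended to the same Keras Sequential model (`model`) as in the sketch above. The kernel size, padding, pooling and dropout rate are illustrative assumptions, not the course’s exact values.

```python
# Hypothetical additional convolutional unit: two Conv2D layers with
# 128 filters each, inserted before the Flatten/FC head of the model
model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(0.25))
```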

Experiment-V: Dropouts after conv layer, L2 in FC, use BN after convolutional layer

  • Train accuracy = 86%, validation accuracy = 83%

Experiment-VI: Add a new convolutional layer to the network

  • Train accuracy = 89%, validation accuracy = 84%

The additional convolutional layer boosted the validation accuracy marginally, but the increased depth also increased the training time.

Adding Feature Maps

In the previous experiment, we tried to increase the capacity of the model by adding a convolutional layer. Let’s now try adding more feature maps to the same architecture.

Experiment-VII: Add more feature maps to the conv layers: from 32 to 64 and from 64 to 128.
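
As a rough sketch of this change (reusing the assumed Keras model from Experiment-V, with all other settings unchanged), only the filter counts of the two conv units are doubled:

```python
# First conv unit: filters increased from 32 to 64
model.add(Conv2D(64, (3, 3), padding='same', activation='relu',
                 input_shape=(32, 32, 3)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(0.25))

# Second conv unit: filters increased from 64 to 128
model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(0.25))
```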

The results of our final experiment are given below.

Experiment-VII: Add more feature maps to the convolutional layers

  • Train accuracy = 89%, validation accuracy = 84%

On adding more feature maps, the model tends to overfit more than it did when we added a new convolutional layer. This suggests that the task requires learning new, more abstract features (added depth), rather than extracting more of the same kind of features (added width).
