There are two main ways of using pre-trained nets for transfer learning:
- Freeze the weights of the initial few layers and train only the last few layers.
- Retrain the entire network (all the weights), initialising from the learned weights.
Let's look at these two techniques in detail. You have the following two ways of training a pre-trained network:
- ‘Freeze’ the initial layers, i.e. keep the same weights and biases that the network has learnt from some other task, remove the last few layers of the pre-trained model, add your own new layer(s) at the end, and train only the newly added layer(s) (see the sketch after this list).
- Retrain all the weights, starting (initialising) from the weights and biases that the net has already learnt. Since you don’t want to unlearn a large fraction of what the pre-trained layers have learnt, you typically choose a low learning rate for the initial layers.
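The following is a minimal Keras sketch of the first technique (freezing). It assumes a VGG16 base pre-trained on ImageNet, 224×224 RGB inputs and a hypothetical 10-class target task; all of these are illustrative choices, not part of the original lesson.

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

# Load a pre-trained base (weights learnt on ImageNet) without its final
# classification layers, i.e. "remove the last few layers".
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))

# Freeze the pre-trained layers so their weights and biases are not updated.
base.trainable = False

# Add your own new layer(s) at the end for the new task
# (a hypothetical 10-class problem here).
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

# Only the newly added Dense layers will be trained.
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=5, validation_split=0.2)
```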
When you implement transfer learning in practice, you will need to make some decisions, such as how many layers of the pre-trained network to throw away and train yourself. Let’s see how one can answer these questions.
To summarise:
- If your task is quite similar to the one the pre-trained model learnt from, you can reuse most of the layers and retrain only the last few.
- If the tasks are less similar, you can reuse only the initial few trained layers and retrain the rest yourself (a sketch follows this list).
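Continuing the earlier sketch (it reuses the `base` and `model` objects defined above), the second technique, fine-tuning with a low learning rate, could look like the following. The choice of keeping the first 10 layers frozen is purely illustrative; in practice you would pick this based on how similar the two tasks are.

```python
from tensorflow.keras.optimizers import Adam

# Decide how many of the pre-trained layers to reuse as-is. Here we
# (arbitrarily, for illustration) keep the first 10 layers frozen and
# fine-tune the remaining layers along with the new head.
base.trainable = True
for layer in base.layers[:10]:
    layer.trainable = False

# Re-compile with a low learning rate so the fine-tuned layers do not
# unlearn too much of what they have already learnt.
model.compile(optimizer=Adam(learning_rate=1e-5),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=5, validation_split=0.2)
```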
In the next segment, you will see a demonstration of transfer learning in Python + Keras.