There are two main ways of using pre-trained nets for transfer learning:
- Freeze the weights of the initial few layers and train only the last few layers.
- Retrain the entire network (all the weights), initialising from the learned weights.
Let's look at these two techniques in detail. You have the following two ways of training a pre-trained network:
- ‘Freeze’ the initial layers, i.e. keep the same weights and biases that the network has learnt from some other task, remove the last few layers of the pre-trained model, add your own new layer(s) at the end, and train only the newly added layer(s) (see the sketch after this list).
- Retrain all the weights, starting (initialising) from the weights and biases that the net has already learnt. Since you don’t want to unlearn a large fraction of what the pre-trained layers have learnt, you typically choose a low learning rate for the initial layers.
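The following is a minimal Keras sketch of the first technique (freezing). It assumes a VGG16 base pre-trained on ImageNet, 224×224 RGB inputs and a hypothetical 10-class target task; all of these are illustrative choices, not part of the original lesson.

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

# Load a pre-trained base (weights learnt on ImageNet) without its final
# classification layers, i.e. "remove the last few layers".
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))

# Freeze the pre-trained layers so their weights and biases are not updated.
base.trainable = False

# Add your own new layer(s) at the end for the new task
# (a hypothetical 10-class problem here).
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

# Only the newly added Dense layers will be trained.
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=5, validation_split=0.2)
```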
When you implement transfer learning in practice, you will need to make some decisions, such as how many layers of the pre-trained network to throw away and train yourself. Let’s see how one can answer these questions.
To summarise:
- If your task is quite similar to the one the pre-trained model learnt from, you can reuse most of the layers and retrain only the last few.
- If the tasks are less similar, you can reuse only the initial few trained layers and retrain the rest yourself (a sketch follows this list).
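Continuing the earlier sketch (it reuses the `base` and `model` objects defined above), the second technique, fine-tuning with a low learning rate, could look like the following. The choice of keeping the first 10 layers frozen is purely illustrative; in practice you would pick this based on how similar the two tasks are.

```python
from tensorflow.keras.optimizers import Adam

# Decide how many of the pre-trained layers to reuse as-is. Here we
# (arbitrarily, for illustration) keep the first 10 layers frozen and
# fine-tune the remaining layers along with the new head.
base.trainable = True
for layer in base.layers[:10]:
    layer.trainable = False

# Re-compile with a low learning rate so the fine-tuned layers do not
# unlearn too much of what they have already learnt.
model.compile(optimizer=Adam(learning_rate=1e-5),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=5, validation_split=0.2)
```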
In the next segment, you will see a demonstration of transfer learning in Python + Keras.