Now that our data is ready and prepared, let’s get into building the model.
The dataset is highly imbalanced. The ratio of 'nofinding' to 'effusion' is almost 10 to 1 (1000 vs 107). Since most of the data belongs to a single class, naive training will not work in this scenario: the model will learn to classify most of the data as 'nofinding', resulting in a deceptively high accuracy. Notice that around 90 per cent (1000/1107) of the data is 'nofinding', so a model that classified everything as 'nofinding' would score about 90 per cent accuracy, which is close to the 87 per cent accuracy we obtained. The objective of correctly classifying 'effusion' is therefore not fulfilled. The high accuracy clearly misleads us, so we will use AUC to validate the result.
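To see concretely why accuracy misleads here while AUC does not, here is a small NumPy sketch. The labels are made up to match the 1000/107 split from the text, and the "model" is a degenerate one that gives every image the same score, i.e. it effectively predicts 'nofinding' for everything:

```python
import numpy as np

# Hypothetical labels mimicking the class ratio in the text:
# 1000 'nofinding' (class 0) and 107 'effusion' (class 1).
y_true = np.array([0] * 1000 + [1] * 107)

# A degenerate "model" that assigns every image the same low score,
# so thresholding at 0.5 predicts 'nofinding' for every image.
scores = np.full(len(y_true), 0.1)

accuracy = np.mean((scores >= 0.5) == y_true)

def auc(y, s):
    """AUC via the rank-sum (Mann-Whitney U) formulation, with tied
    scores given their average rank."""
    order = np.argsort(s)
    ranks = np.empty(len(s), dtype=float)
    ranks[order] = np.arange(1, len(s) + 1)
    for v in np.unique(s):          # average ranks over ties
        mask = s == v
        ranks[mask] = ranks[mask].mean()
    n_pos = y.sum()
    n_neg = len(y) - n_pos
    return (ranks[y == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

print(accuracy)          # ~0.903: looks impressive, but...
print(auc(y_true, scores))  # 0.5: no better than random guessing
```

The constant classifier scores roughly 90 per cent accuracy yet an AUC of exactly 0.5, which is why AUC is the more honest metric for this problem.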
To recall, the basic steps to build the model remain the same as in the previous session:
- Import the ResNet code (the same one we used in the last session)
- Run the augmented data generator
- Perform an ablation run
- Fit the model
Refer to the notebook to answer the following questions.
Finally, let’s use validation AUC instead of accuracy as the evaluation metric and train the model keeping everything else the same, such as network layers, data augmentation, pre-processing, etc.
Let’s quickly recap the important concepts.
- The model is not performing very well on AUC, the measure we had chosen.
- The main reason for this is the prevalence problem: there are simply not many abnormal cases available in the dataset. This problem occurs in almost all medical imaging problems (and, for that matter, in most datasets that have a class imbalance).
- To tackle this problem, we introduced ‘weighted categorical cross-entropy’. This is a measure of loss, which applies weights to different forms of errors.
Weighted Cross-Entropy
A common solution to the low prevalence rate problem is using a weighted cross-entropy loss. The loss is modified such that misclassifications of the low-prevalence class are penalised more heavily than the other class.
Therefore, every time the model makes an error on the abnormal class (in this case, ‘effusion’), we penalise it heavily by multiplying the loss by a high weight. This increases the loss for such misclassifications, so the weight updates due to backpropagation are larger, and the network learns faster to correct its errors on the rare class.
Let’s say “no finding” is class 0 and “effusion” is class 1.
bin_weights[0,0] – Actual class is 0, Predicted class is 0, so no penalty, just the normal weight of 1.
bin_weights[1,1] – Actual class is 1, Predicted class is 1, so no penalty, just the normal weight of 1.
In case of abnormality:
bin_weights[1,0] – Actual class is 1, Predicted class is 0, penalise by weight of 5.
bin_weights[0,1] – Actual class is 0, Predicted class is 1, penalise by weight of 5.
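The weight-matrix idea above can be sketched in NumPy. This is one way to realise a weighted categorical cross-entropy (the function name and the hard argmax lookup are illustrative choices, and the off-diagonal penalty of 5 follows the text but is in practice a hyperparameter to tune):

```python
import numpy as np

# Penalty matrix indexed as bin_weights[actual, predicted]:
# correct predictions keep weight 1, misclassifications are weighted 5.
bin_weights = np.array([[1.0, 5.0],
                        [5.0, 1.0]])

def weighted_categorical_crossentropy(y_true, y_pred, weights=bin_weights,
                                      eps=1e-7):
    """y_true: one-hot labels, shape (N, 2); y_pred: probabilities, (N, 2)."""
    y_pred = np.clip(y_pred, eps, 1.0)           # avoid log(0)
    actual = y_true.argmax(axis=1)
    predicted = y_pred.argmax(axis=1)
    w = weights[actual, predicted]               # per-sample penalty
    ce = -np.sum(y_true * np.log(y_pred), axis=1)  # standard cross-entropy
    return np.mean(w * ce)

# A confident mistake on the rare class costs 5x the unweighted loss:
y_true = np.array([[0.0, 1.0]])   # actual: 'effusion' (class 1)
y_pred = np.array([[0.9, 0.1]])   # predicted: 'nofinding' (class 0)
print(weighted_categorical_crossentropy(y_true, y_pred))  # 5 * -log(0.1) ≈ 11.51
```

An unweighted cross-entropy would report only -log(0.1) ≈ 2.30 here; the weight matrix scales that to about 11.51, which is what drives the larger backpropagation updates described above.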
Additional Reading
- We turned to weighted cross-entropy when our neural network was not performing well. Here’s an article that lists several reasons why your neural network may not be performing well.