In the past few years, the performance of CNN-based architectures such as AlexNet, VGGNet and ResNet has been steadily improving. But when deploying deep learning models in practice, you usually have to consider several parameters apart from accuracy alone.
For example, say you've built a mobile app that uses a conv net for real-time face detection. Since it will be deployed on smartphones, some of which may have limited memory and compute, you might be more concerned about the model running 'fast enough' than about squeezing out the last bit of accuracy.
In the next two segments, we will discuss a recent paper that appeared in 2017, "An Analysis of Deep Neural Network Models for Practical Applications". This paper compares the popular architectures on accuracy and on several metrics related to resource utilisation: memory footprint, number of parameters, operations count, inference time and power consumption.
An important point to note here is that although VGGNet (VGG-16 and VGG-19) is widely used, it is by far the most expensive architecture, both in terms of the number of operations (and thus computation time) and the number of parameters (and thus memory requirement).
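To get a feel for the parameter-count metric, here is a minimal sketch, assuming PyTorch and torchvision are installed (the model constructors are torchvision's, not the paper's code), that compares the trainable-parameter counts of VGG-16 against two lighter architectures discussed in the paper:

```python
import torch
from torchvision import models

def count_params(model: torch.nn.Module) -> int:
    """Total number of trainable parameters."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# weights=None gives randomly initialised models; we only need the shapes
for name, ctor in [("VGG-16", models.vgg16),
                   ("ResNet-18", models.resnet18),
                   ("GoogLeNet", models.googlenet)]:
    model = ctor(weights=None)
    print(f"{name:10s} {count_params(model) / 1e6:6.1f}M parameters")
```

Running this makes the gap concrete: VGG-16 has roughly an order of magnitude more parameters than ResNet-18 or GoogLeNet, which is exactly the memory cost the paper highlights.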
In the next lecture, we will continue to discuss some other results. To summarise, the key points we have discussed so far are:
- Architectures in a particular cluster, such as GoogLeNet, ResNet-18 and ENet, are very attractive since they have small footprints (in both memory and time) as well as fairly good accuracies. Because of their low memory footprints, they can be used on mobile devices, and because their operation counts are small, they can also be used for real-time inference.
- In some ResNet variants (ResNet-34, 50, 101 and 152) and Inception models (Inception-v3 and v4), there is a trade-off between model accuracy and efficiency, i.e. the inference time and memory requirement.
- The (forward) inference time per image decreases marginally as the batch size increases, so batching inputs can help if you need higher throughput (see the sketch after this list).
- Up to a certain batch size, most architectures use a constant amount of memory, after which consumption increases linearly with the batch size.
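The batch-size effect on per-image inference time can be checked with a rough sketch like the one below, again assuming PyTorch and torchvision; the model choice, batch sizes and timing loop are illustrative, not the paper's benchmark setup:

```python
import time
import torch
from torchvision import models

model = models.resnet18(weights=None).eval()
REPEATS = 10

with torch.no_grad():
    for batch_size in [1, 2, 4, 8, 16]:
        x = torch.randn(batch_size, 3, 224, 224)
        model(x)  # warm-up run, excluded from timing
        start = time.perf_counter()
        for _ in range(REPEATS):
            model(x)
        elapsed = time.perf_counter() - start
        # average forward time per individual image, in milliseconds
        per_image_ms = elapsed / (REPEATS * batch_size) * 1e3
        print(f"batch size {batch_size:2d}: {per_image_ms:6.2f} ms/image")
```

On most hardware you should see the per-image time drop somewhat as the batch grows, which is the (modest) throughput gain the paper reports.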
In the next segment, we will continue our discussion of the paper.