
Overview of CNN Architectures

In this session, we will take an overview of some of the most popular CNN architectures that have set the benchmark for state-of-the-art results in computer vision tasks. The acid test for almost all CNN-based architectures has been the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), or simply ImageNet. The dataset contains roughly 1.2 million training images, 50,000 validation images, and 150,000 testing images belonging to about 1,000 classes.

We will discuss the following architectures in this session:

  • AlexNet
  • VGGNet
  • GoogLeNet
  • ResNet

In the video below, the professor provides a quick overview of some popular CNN-based architectures.

To summarise the important points:

  • The depth of the state-of-the-art neural networks has been steadily increasing (from AlexNet with 8 layers to ResNet with 152 layers).
  • The developments in neural net architectures were made possible by significant advancements in infrastructure. For example, many of these networks were trained on multiple GPUs in a distributed manner.
  • Since these networks have been trained on millions of images, they are good at extracting generic features from a large variety of images. Thus, they are now commonly reused as off-the-shelf feature extractors by deep learning practitioners around the world (see the sketch after this list).
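
The snippet below is a minimal sketch of this idea, assuming Keras/TensorFlow is available; the choice of ResNet-50 is only illustrative and not prescribed by the text. It loads a network pre-trained on ImageNet without its classification head, so that it outputs a generic feature vector for each input image.

```python
import numpy as np
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input

# Load ResNet-50 with ImageNet weights, dropping the final classification layer
# so the network acts as a generic feature extractor (one 2048-d vector per image).
model = ResNet50(weights="imagenet", include_top=False, pooling="avg")

# A dummy batch of one 224x224 RGB image stands in for real data here.
image_batch = np.random.randint(0, 256, size=(1, 224, 224, 3)).astype("float32")
features = model.predict(preprocess_input(image_batch))
print(features.shape)  # (1, 2048)
```

These extracted features can then be fed into a small classifier trained on your own dataset, which is the essence of the transfer learning workflow discussed in the next section.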

You will learn to use large pre-trained networks in the next section on transfer learning. In the next segment, we will study the architectures of AlexNet, VGGNet and GoogLeNet.
