
Introduction

Welcome to the module on ‘Principal Component Analysis’. 

Principal component analysis (PCA) is one of the most commonly used dimensionality reduction techniques in industry. By converting large data sets into smaller ones with fewer variables, it helps improve model performance and visualise complex data sets, among other uses.

In this module

Let’s hear from your SME, Mirza Rahim Baig, as he introduces the topic of PCA.

Now let’s look at what you’ll be studying in this module and the prerequisites for it.
As explained in the aforementioned video, the module is divided into the following main sections:

  • Fundamentals of PCA: Here, you will see why PCA is worth learning and get to know its essential building blocks before studying the process itself. This section is divided into two sub-sessions:
    • Fundamentals of PCA I 
    • Fundamentals of PCA II
  • PCA Using Python: Here, you will implement PCA using Python and get to know its various applications.

Prerequisites

This module requires prior knowledge of certain linear algebra concepts, such as matrices and vectors. You will be introduced to these prerequisites, with a brief overview of each, as you go through the sessions. You can also learn them from the additional module on ‘Maths for Data Analysis’, which contains further content and questions to improve your understanding of these concepts. Here is a checklist of the concepts that you need to know to understand this module:

  • Vectors and their properties
  • Vector operations (addition, scaling, linear combination and dot product)
  • Matrices 
  • Matrix operations (matrix multiplication and matrix inverses)
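
As a quick self-check, the operations in this checklist can be tried out in a few lines of Python. The following is a minimal sketch that assumes NumPy is installed; the vectors u and v and the matrix A are arbitrary values chosen only for illustration and are not taken from the module.

```python
import numpy as np

# Two arbitrary example vectors (illustrative values only)
u = np.array([1.0, 2.0])
v = np.array([3.0, -1.0])

vec_sum = u + v                # vector addition
scaled = 2.0 * u               # scaling a vector by a number
lin_comb = 2.0 * u + 3.0 * v   # a linear combination of u and v
dot = np.dot(u, v)             # dot product

# An arbitrary invertible 2x2 matrix (illustrative values only)
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

Av = A @ v                     # matrix-vector multiplication
A_inv = np.linalg.inv(A)       # matrix inverse
product = A @ A_inv            # multiplying A by its inverse gives the identity matrix

print(vec_sum, scaled, lin_comb, dot)
print(Av)
print(np.round(product, 6))
```

If each of these lines makes sense to you, you are likely comfortable with the checklist above.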

In this session

First, to fully appreciate PCA’s usefulness, you will look at a variety of situations where it applies, some of which you may have encountered in earlier modules, such as the multicollinearity problem and how PCA helps solve it. Then, you will learn the basic definition of PCA, followed by a brief introduction to the linear algebra topics that are crucial for understanding PCA and its building blocks. After this, you will look at two key ideas that underpin the workings of PCA: change of basis and variance as information.
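
To give a first feel for the two key ideas named above, here is a minimal sketch, assuming NumPy is installed. The data is synthetic, and the 45-degree rotated basis B is hand-picked for this illustration; PCA itself finds such directions from the data, which is covered later in the module.

```python
import numpy as np

# Synthetic, strongly correlated 2D data (an assumption for illustration)
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = x + 0.1 * rng.normal(size=500)
data = np.column_stack([x, y])
data = data - data.mean(axis=0)   # centre each column at zero

# Hand-picked new basis: the standard axes rotated by 45 degrees.
# The columns of B are the new basis vectors, and they are orthonormal.
s = 1.0 / np.sqrt(2.0)
B = np.array([[s, -s],
              [s,  s]])

# Change of basis: since B is orthonormal, its inverse is its transpose,
# so the new coordinates of each row of data are given by data @ B.
new_coords = data @ B

var_old = data.var(axis=0)        # variance along the original axes
var_new = new_coords.var(axis=0)  # variance along the new axes

print("share of variance per original axis:", var_old / var_old.sum())
print("share of variance per new axis:     ", var_new / var_new.sum())
```

In the original axes the variance is split roughly evenly between the two variables, whereas in the rotated basis almost all of it lies along the first direction. The total variance is the same in both bases, so the second direction could be dropped with little loss of information.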

Guidelines for in-module questions

The in-video and in-content questions for this module are not graded. The graded questions are given in a separate segment labelled ‘Graded Questions’ at the end of this session. The questions in that segment will adhere to the following guidelines:

People you’ll hear from in this module

Analytics Lead at Flipkart

Seasoned advanced analytics and data science professional with 10 years of experience in advanced analytics, machine learning and consulting in the e-commerce and healthcare domains. Proficient in machine learning and its applications; adept at solving complex problems through data.
