
Introduction to Variance

In the previous session, you learnt about the first fundamental building block of PCA: the idea of a basis and the change of basis. You saw how a simple change of basis led to dimensionality reduction in the roadmap example, and how the same data can be represented using different sets of basis vectors.

However, we did not yet know how to find those “ideal basis vectors” or exactly what properties they must satisfy. In this session, we will address that by understanding the idea of variance as information.

As mentioned previously, you have already learnt some methods for deciding which columns to delete: checking the number of null values, identifying unnecessary information and, during modelling, checking the p-values and VIF scores.

PCA gauges the importance of a column by another metric called ‘variance’, which measures how spread out a column’s values are.
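
As a rough illustration of what "how varied a column's values are" means, the sketch below computes the variance of each column in a small made-up dataset. The numbers and the variable names (`data`, `column_variances`) are invented for this example and do not come from the course material. NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical toy data (invented for illustration): 5 rows, 2 columns.
# The first column varies from row to row; the second is constant.
data = np.array([
    [170.0, 1.0],
    [160.0, 1.0],
    [180.0, 1.0],
    [150.0, 1.0],
    [190.0, 1.0],
])

# Sample variance of each column: the average squared deviation from the
# column mean, dividing by (n - 1) because ddof=1.
column_variances = data.var(axis=0, ddof=1)

# First column: 250.0, second column: 0.0
print(column_variances)
```

The constant column has zero variance, so it cannot help tell one row from another, while the first column has a large variance. This is the intuition behind treating variance as a measure of information.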
 

Let’s go ahead and look at some examples in the next segment and get an intuitive idea of what variance actually means.
