Statistics on Numerical Features

You have seen how to conduct univariate analysis on categorical variables. Now, let’s look at quantitative or numeric variables.

Numeric variables can be continuous, such as height, temperature and weight. They can also be discrete, such as the number of items a customer buys in a store, the number of people in a city, or the number of ‘heads’ you get when flipping three coins.

In this segment, our expert Anand will take you through various statistical metrics such as mean, median, mode and standard deviation.

Let’s now learn how to analyse quantitative variables.

Mean and median are single values that broadly give a representation of the entire data. As Anand states clearly, it is very important to understand when to use these metrics to avoid inaccurate analysis.

While the ‘mean’ gives the average of all the values, the ‘median’ gives the middle value of the sorted data, which is often a more typical value for the group. As a simple rule of thumb, question any use of the ‘mean’ on its own: when the data is skewed or contains outliers, the ‘median’ is usually a better measure of ‘representativeness’.
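To make the rule of thumb concrete, here is a minimal sketch using Python's standard `statistics` module. The income figures are hypothetical and chosen only to show how a single extreme value pulls the mean away from the bulk of the data while the median stays put.

```python
import statistics

# Hypothetical monthly incomes (in thousands); the last value is an extreme outlier.
incomes = [30, 32, 35, 38, 40, 42, 45, 500]

mean_income = statistics.mean(incomes)      # sum of all values / number of values
median_income = statistics.median(incomes)  # middle of the sorted values

print(f"mean   = {mean_income}")    # 95.25, higher than seven of the eight values
print(f"median = {median_income}")  # 39.0, close to what most people earn
```

Here the mean suggests a typical income of about 95, even though every value except one lies between 30 and 45. The median of 39 describes the group far better.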

Let’s now look at some other descriptive statistics such as mode, interquartile range, standard deviation, etc.

Both the standard deviation and the interquartile range (the difference between the third and first quartiles, also called the interquartile difference) are used to represent the spread of the data.

The interquartile range is a much better metric than the standard deviation if there are outliers in the data. The standard deviation uses every value and is therefore influenced by outliers, while the interquartile range depends only on the middle 50% of the data and is largely unaffected by them.
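The effect of an outlier on the two measures of spread can be sketched as follows. The data set is hypothetical, and the helper `iqr` is an illustrative name, not something from the lecture. Quartiles are computed with `statistics.quantiles` using the `"inclusive"` method, which is one of several common quartile conventions; other tools may give slightly different values.

```python
import statistics

def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    # With n=4, statistics.quantiles returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Hypothetical data: the second list replaces the largest value with an outlier.
clean = [10, 12, 13, 14, 15, 16, 17, 18, 20]
with_outlier = clean[:-1] + [200]

sd_clean = statistics.stdev(clean)
sd_outlier = statistics.stdev(with_outlier)
iqr_clean = iqr(clean)
iqr_outlier = iqr(with_outlier)

print(f"standard deviation: {sd_clean:.2f} -> {sd_outlier:.2f}")
print(f"interquartile range: {iqr_clean} -> {iqr_outlier}")
```

In this sketch the standard deviation grows many times over once the outlier is introduced, while the interquartile range does not change, because the quartiles are determined by the middle of the sorted data.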

You also saw how box plots are used to understand the spread of data.
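A box plot is essentially a drawing of a few summary numbers, and those can be computed directly. The sketch below uses hypothetical data. It assumes the widely used convention that points more than 1.5 times the interquartile range beyond the quartiles are flagged as outliers; this convention is an assumption here and may differ from the one used in the lecture or by a given plotting library.

```python
import statistics

# Hypothetical data with one extreme value.
data = [10, 12, 13, 14, 15, 16, 17, 18, 200]

# The box spans the first to the third quartile, with a line at the median.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
box_height = q3 - q1  # the interquartile range

# Assumed convention: points beyond 1.5 * IQR from the box are drawn as outliers.
lower_fence = q1 - 1.5 * box_height
upper_fence = q3 + 1.5 * box_height
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(f"Q1 = {q1}, median = {median}, Q3 = {q3}")
print(f"fences = ({lower_fence}, {upper_fence})")
print(f"outliers = {outliers}")
```

For this data the box runs from 13 to 17 and the value 200 falls well outside the upper fence, so it would appear as an isolated point on the plot.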

