There is only one more thing to consider now, which is the sample size. Here, we will observe that as the sample size increases, the underlying sampling distribution will approximate a normal distribution even more closely.
As you saw, a sample size of 30 or above is generally considered sufficient to conclude that the sampling distribution is nearly normal, so that further inferences can be drawn from it. The following image portrays the nature of the sampling distribution for different sample sizes.
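The effect of sample size can also be checked numerically. The following is a minimal sketch, not the demonstration code from this lecture: it assumes a hypothetical right-skewed population (exponential with mean 10) and uses skewness as a rough measure of how far the sampling distribution is from a symmetric, normal-like shape. The helper names `skewness` and `sample_means` are illustrative.

```python
import random
import statistics


def skewness(values):
    """Third standardised moment: 0 for a perfectly symmetric distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def sample_means(population, n, num_samples, rng):
    """Draw num_samples samples of size n (with replacement) and return their means."""
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(num_samples)]


rng = random.Random(42)

# Hypothetical right-skewed population (assumption, not the course data set).
population = [rng.expovariate(1 / 10) for _ in range(20_000)]

skew_by_n = {}
for n in (2, 5, 30, 100):
    means = sample_means(population, n, num_samples=2_000, rng=rng)
    skew_by_n[n] = skewness(means)
    print(f"n = {n:>3}: skewness of sample means = {skew_by_n[n]:.2f}")
```

Under these assumptions, the skewness of the sample means should shrink towards 0 as n grows, which is the numerical counterpart of the histograms looking more and more bell-shaped.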
Now, let’s summarise the learnings of this demonstration.
To reiterate, here are the properties of the sampling distribution as given by the Central Limit Theorem. They hold for any kind of data, provided a large number of samples has been taken:
- Sampling distribution’s mean (μ_X̄) = Population mean (μ),
- Sampling distribution’s standard deviation (standard error) = σ/√n, where σ is the population standard deviation and n is the sample size, and
- For n > 30, the sampling distribution is approximately a normal distribution.
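The first two properties can be verified with a short simulation. This is a sketch under stated assumptions rather than the lecture's own code: the population is a hypothetical set of 30,000 uniformly distributed values, samples are drawn with replacement, and the values of `n` and `num_samples` are arbitrary choices.

```python
import math
import random
import statistics

rng = random.Random(0)

# Hypothetical population (assumption): 30,000 values uniform between 0 and 100.
population = [rng.uniform(0, 100) for _ in range(30_000)]
mu = statistics.fmean(population)       # population mean
sigma = statistics.pstdev(population)   # population standard deviation

n = 50                # size of each sample
num_samples = 5_000   # number of samples drawn

# Each entry is the mean of one sample of size n, drawn with replacement.
sample_means = [
    statistics.fmean(rng.choices(population, k=n)) for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
observed_se = statistics.pstdev(sample_means)
theoretical_se = sigma / math.sqrt(n)

print(f"population mean          = {mu:.2f}")
print(f"mean of sample means     = {mean_of_means:.2f}")
print(f"sigma / sqrt(n)          = {theoretical_se:.2f}")
print(f"std dev of sample means  = {observed_se:.2f}")
```

The two means should agree closely, and so should the two standard deviations. Note that the population here is uniform, not normal, which illustrates that the properties do not depend on the shape of the underlying data.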
Now that you have understood the properties given by the Central Limit Theorem, you are well equipped to infer the population mean from the sample mean.
Recall that in the first lecture on samples, we estimated the mean commute time of the 30,000 employees of an office by taking a small sample of 100 employees and finding their mean commute time. This sample’s mean was X̄ = 36.6 minutes and its standard deviation was S = 10 minutes.
We then said that this sample mean cannot be taken as the population mean, as there might be some errors in the sampling process. However, we can say that the population mean, i.e., the mean daily commute time of all 30,000 employees = 36.6 (sample mean) ± some margin of error.
Now, you may be thinking that you can use the standard error for the margin of error. However, keep in mind that although the standard error provides a good estimate of this margin of error, you cannot use it in place of the margin of error. To understand why and how you would find the margin of error in that case, let’s move on to the next lecture, where we will use the CLT (central limit theorem) to find the aforementioned margin of error.
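For reference, the standard error in the commute example can be computed directly from the numbers quoted above. This small sketch assumes the sample standard deviation S is used in place of the unknown population σ; the variable names are illustrative. It computes only the standard error, not the margin of error.

```python
import math

sample_mean = 36.6   # minutes, mean commute time of the sampled employees
sample_sd = 10.0     # minutes, sample standard deviation S
n = 100              # sample size

# sigma is unknown, so S stands in for it here (assumption).
standard_error = sample_sd / math.sqrt(n)

print(f"sample mean    = {sample_mean} minutes")
print(f"standard error = {standard_error:.1f} minute(s)")
```

With S = 10 and n = 100, the standard error works out to 10/√100 = 1 minute.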
Additional Notes
- As mentioned earlier, this code is for demonstration purposes only. However, if you want, you can change the target variable from ‘Weight’ to ‘Height’ and verify that the results still hold.