
Estimating Mean Using CLT

Now that you have gone through the mid-session summary, let’s get back to the rest of the session. Earlier, we tried to estimate the mean commute time of 30,000 employees of an office by taking a small sample of 100 employees and finding their mean commute time. This sample’s mean was \bar{X} = 36.6 minutes and its standard deviation was S = 10 minutes.

Recall that we also said that the population mean, i.e., the mean daily commute time of all 30,000 employees, is μ = 36.6 (sample mean) ± some margin of error.

You can find this margin of error using the central limit theorem (CLT). Now that you know the CLT, let’s see how.

To summarise, let’s say that you have a sample with sample size n, mean \bar{X} and standard deviation S. Now, the y% confidence interval (i.e., the confidence interval corresponding to a y% confidence level) for \mu would be given by the range:

Confidence interval = (\bar{X}-\frac{Z^{*}S}{\sqrt{n}}, \bar{X}+\frac{Z^{*}S}{\sqrt{n}}),

  • where Z* is the Z-score associated with a y% confidence level. In other words, the population mean and the sample mean differ by a margin of error given by \frac{Z^{*}S}{\sqrt{n}}.
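
As a worked illustration, the formula can be applied to the commute-time sample from earlier (\bar{X} = 36.6, S = 10, n = 100). The 95% confidence level, with Z* ≈ 1.96, is chosen here only as an example; the lesson itself does not fix a confidence level for this sample.

```latex
\text{Margin of error} = \frac{Z^{*}S}{\sqrt{n}} = \frac{1.96 \times 10}{\sqrt{100}} = 1.96

\text{Confidence interval} = (36.6 - 1.96,\ 36.6 + 1.96) = (34.64,\ 38.56)
```

Under that assumption, you could say with 95% confidence that the mean commute time of all 30,000 employees lies between roughly 34.64 and 38.56 minutes.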

Some commonly used Z* values are given below:

Figure 6 - Z* Values for Commonly Used Confidence Levels
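
If you want to reproduce such Z* values yourself, the short Python sketch below is one way to do it. It assumes a two-sided interval under the standard normal distribution; the helper name `z_star` is hypothetical and is not part of the course material.

```python
from statistics import NormalDist


def z_star(confidence_level: float) -> float:
    """Two-sided critical value Z* for a confidence level given as a fraction (e.g. 0.95)."""
    # Half of the excluded probability sits in each tail of the normal curve.
    tail = (1 - confidence_level) / 2
    # inv_cdf returns the Z-score below which the given probability lies.
    return NormalDist().inv_cdf(1 - tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence level: Z* = {z_star(level):.3f}")
```

The values come out close to 1.645, 1.960 and 2.576 for the 90%, 95% and 99% confidence levels respectively; tables often round these to two decimal places.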


At this point, it is important to address a common misconception. Sampling distributions are just a theoretical exercise; you’re not actually expected to make one in real life. If you want to estimate the population mean, you will just take a sample. You will not create an entire sampling distribution.

You must be wondering why you studied sampling distributions if this is the case. To understand the reason, let’s go through the actual process of sampling. Recall that you are sampling because you want to find the population mean, albeit in the form of an interval. The process has three steps:

  • First, take a sample of size n.
  • Then, find the mean \bar{X} and standard deviation S of this sample.
  • Now, you can say that for a y% confidence level, the confidence interval for the population mean \mu is given by (\bar{X}-\frac{Z^{*}S}{\sqrt{n}}, \bar{X}+\frac{Z^{*}S}{\sqrt{n}}).
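
The three steps above can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the function name `confidence_interval` is hypothetical, and the sample is synthetic data generated only so that the example runs (it is not the actual commute-time data).

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, confidence_level=0.95):
    """Return (lower, upper) bounds of the confidence interval for the population mean."""
    n = len(sample)
    x_bar = mean(sample)   # step 2: sample mean
    s = stdev(sample)      # step 2: sample standard deviation (n - 1 denominator)
    # Z* for a two-sided interval at the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * s / sqrt(n)   # step 3: margin of error Z* S / sqrt(n)
    return x_bar - margin, x_bar + margin


# Step 1: take a sample of size n = 100 (synthetic, assumed values for illustration).
random.seed(0)
sample = [random.gauss(36.6, 10) for _ in range(100)]

low, high = confidence_interval(sample, confidence_level=0.95)
print(f"95% confidence interval: ({low:.2f}, {high:.2f})")
```

Raising the confidence level (for example, to 0.99) increases Z* and therefore widens the interval around the same sample mean.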

However, as you may have seen in the video above, you cannot finish step 3 without the CLT. The CLT lets you assume that the sample mean would be normally distributed, with mean \mu and standard deviation \frac{\sigma}{\sqrt{n}} (approximately \frac{S}{\sqrt{n}}). Using this assumption, it is possible to find the margin of error, confidence interval, etc.
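
To see this CLT assumption at work, you could run a small simulation such as the sketch below. Everything in it is an assumption made for illustration: the population is synthetic and deliberately skewed (exponentially distributed with a mean near 36.6 minutes), and the number of repeated samples is arbitrary. In practice you would take only one sample; the repetition here exists purely to check the theory.

```python
import random
from math import sqrt
from statistics import mean, pstdev, stdev

random.seed(42)

# Hypothetical, skewed population of 30,000 commute times in minutes.
population = [random.expovariate(1 / 36.6) for _ in range(30_000)]
mu = mean(population)        # population mean
sigma = pstdev(population)   # population standard deviation

n = 100               # size of each sample
num_samples = 2_000   # number of repeated samples (illustration only)

# Build an empirical sampling distribution of the sample mean.
sample_means = [mean(random.sample(population, n)) for _ in range(num_samples)]

print(f"Population mean:              {mu:.2f}")
print(f"Mean of sample means:         {mean(sample_means):.2f}")
print(f"sigma / sqrt(n):              {sigma / sqrt(n):.2f}")
print(f"Std. dev. of sample means:    {stdev(sample_means):.2f}")
```

Even though the population itself is far from normal, the sample means should cluster around \mu with a spread close to \frac{\sigma}{\sqrt{n}}, which is the property the confidence interval formula relies on.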

Thus, you learnt about sampling distributions so that you could learn more about the CLT and be able to make all the assumptions as stated above.
