First, you learnt how you can use your knowledge of the CLT to infer the population mean from the sample mean.
You estimated the mean commute time of the 30,000 employees of an office by taking a sample of 100 employees and finding their mean commute time. Specifically, you were given a sample with a sample mean X̄ = 36.6 minutes and a sample standard deviation S = 10 minutes.
Using the CLT, you concluded that the sampling distribution for the mean commute time would have the following:
- Mean = μ {unknown}
- Standard error = σ/√n ≈ S/√n = 10/√100 = 1
- Since n (= 100) > 30, the sampling distribution is approximately a normal distribution.
Using these properties, you were able to claim that the probability that the population mean μ lies between 34.6 (36.6 – 2) and 38.6 (36.6 + 2) is 95.4%.
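The arithmetic behind this claim can be checked with a short Python sketch. It uses only the numbers quoted above; the variable names are illustrative, and the 2-standard-error margin is taken directly from the claim rather than derived.

```python
from math import sqrt
from statistics import NormalDist

# Values quoted in the commute-time example
n = 100              # sample size
sample_mean = 36.6   # sample mean, in minutes
sample_sd = 10.0     # sample standard deviation S, in minutes

# Standard error of the mean, using S as a stand-in for the unknown sigma
standard_error = sample_sd / sqrt(n)

# The claim uses a margin of 2 standard errors on either side of the sample mean
margin = 2 * standard_error
lower = sample_mean - margin
upper = sample_mean + margin

# Probability that a normal variable falls within 2 standard deviations of its mean
std_normal = NormalDist()  # mean 0, standard deviation 1
confidence = std_normal.cdf(2) - std_normal.cdf(-2)

print(f"Standard error: {standard_error:.1f} minute(s)")
print(f"Interval: ({lower:.1f}, {upper:.1f})")
print(f"Confidence level: {confidence:.1%}")
```

The computed confidence level is about 0.9545, which matches the 95.4% quoted in the claim.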
Then, you learnt the following terminology related to the claim:
- The probability associated with the claim is called the confidence level. (Here, it is 95.4%.)
- The maximum error made in a sample mean is called the margin of error. (Here, it is 2 minutes.)
- The final interval of values is called the confidence interval. [Here, it is the range (34.6, 38.6).]
You then generalised the whole process. Let’s say you have a sample with a sample size n, mean X̄ and standard deviation S. You learnt that a y% confidence interval (i.e., a confidence interval corresponding to a y% confidence level) for the population mean μ will be given by the range:
Confidence interval = (X̄ − Z*·S/√n, X̄ + Z*·S/√n),
where Z* is the Z-score associated with a y% confidence level (for example, Z* ≈ 1.96 for a 95% confidence level).
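The general procedure can be sketched in Python as follows. This is a minimal illustration, assuming the interval has the usual form X̄ ± Z*·S/√n and that the sample is large enough for the normal approximation to hold. The function name `confidence_interval` and its parameters are hypothetical, introduced only for this example.

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(sample_mean, sample_sd, n, confidence_level):
    """Return (lower, upper) for the population mean.

    Assumes a normal sampling distribution (large n) and uses the
    sample standard deviation S in place of the unknown sigma.
    confidence_level is a fraction strictly between 0 and 1, e.g. 0.95.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be between 0 and 1")

    # Z*: the Z-score that leaves (1 - confidence_level) / 2 in each tail
    z_star = NormalDist().inv_cdf((1 + confidence_level) / 2)

    margin_of_error = z_star * sample_sd / sqrt(n)
    return sample_mean - margin_of_error, sample_mean + margin_of_error


# Usage with the commute-time sample at a 95% confidence level
low, high = confidence_interval(36.6, 10, 100, 0.95)
print(f"95% confidence interval: ({low:.2f}, {high:.2f})")
```

For the commute-time sample, a 95% confidence level gives a slightly narrower interval than the 95.4% interval (34.6, 38.6), because Z* is about 1.96 rather than 2.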