Estimating a Population Mean: Confidence Intervals and the t-Distribution


Estimating a Population Mean

Introduction to Confidence Intervals for the Mean

Estimating the population mean (μ) is a fundamental task in inferential statistics. Confidence intervals provide a range of plausible values for the population mean based on sample data. The method used depends on whether the population standard deviation (σ) is known or unknown.

Estimating a Mean When σ is Known

Confidence Interval Formula (σ Known)

When the population standard deviation is known (a rare scenario in practice), the confidence interval for the mean is constructed using the normal (z) distribution. The formula for the margin of error (E) is:

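Margin of Error (E): $E = z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$
Confidence Interval: $\bar{x} - E < \mu < \bar{x} + E$

Where $z_{\alpha/2}$ is the critical value from the standard normal distribution for the desired confidence level, $\sigma$ is the population standard deviation, $n$ is the sample size, and $\bar{x}$ is the sample mean.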
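As a rough illustration, the z-based interval (margin of error equal to the critical z value times σ/√n) might be computed as in the sketch below. The function name `z_interval` and the sample mean of 120 mmHg are assumptions made for illustration; only σ = 10 and n = 14 echo the blood pressure example in these notes.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence=0.95):
    """Return (lower, upper) for a z-based confidence interval for the mean."""
    alpha = 1 - confidence
    # Two-sided critical value from the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_crit * sigma / sqrt(n)
    return xbar - margin, xbar + margin


# Hypothetical sample mean (120.0); sigma and n follow the notes' example.
lower, upper = z_interval(xbar=120.0, sigma=10.0, n=14)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

The critical value is computed rather than hard-coded so that other confidence levels (for example 0.90 or 0.99) can be passed in directly.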

Sample Size for Estimating Mean (σ Known)

To achieve a desired margin of error, the required sample size can be calculated as:

$n = \left( \dfrac{z_{\alpha/2} \, \sigma}{E} \right)^2$

Where $E$ is the desired margin of error.
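A minimal sketch of the sample size calculation, n = (zσ/E)², follows. The function name `required_sample_size` and the target margin of 2 mmHg are hypothetical; the result is rounded up because a fractional observation is not possible and rounding down would miss the target margin.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest whole n whose z-based margin of error is at most `margin`."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Always round up to the next whole observation.
    return ceil((z_crit * sigma / margin) ** 2)


# Hypothetical target: margin of 2 mmHg with sigma = 10 mmHg at 95% confidence.
n_needed = required_sample_size(sigma=10.0, margin=2.0)
print(n_needed)
```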


Example: Blood Pressure Levels

Suppose 14 medical students measured the blood pressure of the same person, and the population standard deviation is known to be 10 mmHg. The sample data and summary statistics are as follows:

[Figures: blood pressure data; sample statistics for the blood pressure data; histogram of the blood pressure data]

Estimating a Mean When σ is Unknown (Most Common Case)

Using the t-Distribution

In practice, the population standard deviation (σ) is rarely known. Instead, the sample standard deviation (s) is used, introducing additional uncertainty. The confidence interval is then constructed using the t-distribution:

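Margin of Error (E): $E = t_{\alpha/2} \cdot \dfrac{s}{\sqrt{n}}$
Confidence Interval: $\bar{x} - E < \mu < \bar{x} + E$

Where $t_{\alpha/2}$ is the critical value from the t-distribution with $n - 1$ degrees of freedom.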

Properties of the t-Distribution

Symmetric about mean = 0
Wider than the z-distribution, with heavier tails (greater variability, especially for small n)
Shape depends on sample size (degrees of freedom, df = n - 1)
As n increases, the t-distribution approaches the normal distribution

Historical Note: The Student's t-Distribution

The t-distribution was introduced by William Sealy Gosset, who published under the pseudonym "Student" while working at the Guinness Brewery. His work highlighted the need for a larger critical value to account for the extra uncertainty when σ is unknown.


Finding the t Critical Value

The t critical value depends on the desired confidence level and the degrees of freedom. For example, for a 95% confidence interval with 23 degrees of freedom, the critical value is approximately 2.069.
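In practice the critical value is read from a t table or obtained from statistical software (SciPy, for instance, offers `scipy.stats.t.ppf`). The sketch below is one possible standard-library-only approximation, not a production routine: it integrates the t density with Simpson's rule and inverts the CDF by bisection. The helper names `t_pdf`, `t_cdf`, and `t_critical` are made up for this example.

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = lgamma((df + 1) / 2) - lgamma(df / 2)
    return exp(log_coef) / sqrt(df * pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df):
    """CDF via Simpson's rule on [0, |x|], using symmetry about 0."""
    if x < 0:
        return 1 - t_cdf(-x, df)
    steps = 2 * max(1000, int(100 * x))  # even number of subintervals
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(confidence, df):
    """Two-sided critical value: the point with upper-tail area alpha/2."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:  # expand until the target is bracketed
        hi *= 2
    for _ in range(60):  # bisection
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(t_critical(0.95, 23), 3))  # compare with the table value 2.069
```

Numerical integration is slow and loses accuracy for extreme confidence levels with very few degrees of freedom, so a library routine is the better choice outside of a teaching context.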


Requirements for t-Intervals

To use the t-interval for estimating a mean, the following requirements must be met:

The sample is a simple random sample (all samples of the same size have an equal chance of being selected).
The value of the population standard deviation is unknown.
Either the population is normally distributed, or the sample size is large (n > 30), or the sample data do not show strong departures from normality.


Choosing Between z and t

The following table summarizes when to use the z or t distribution, or a nonparametric method:

Method	Conditions
Use normal (z) distribution	σ known and normally distributed population or σ known and n > 30
Use t distribution	σ not known and normally distributed population or σ not known and n > 30
Use a nonparametric method or bootstrapping	Population is not normally distributed and n ≤ 30
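The decision table above could be encoded as a small helper like the one below. This is only a sketch of the table's logic; the function name `choose_method` and its return strings are illustrative choices, and the n > 30 cutoff is the rule of thumb used in these notes.

```python
def choose_method(sigma_known, population_normal, n):
    """Pick an interval method following the z / t / nonparametric table."""
    if population_normal or n > 30:
        # Normal population or large sample: z if sigma is known, else t.
        return "z" if sigma_known else "t"
    # Non-normal population with a small sample (n <= 30).
    return "nonparametric or bootstrap"


print(choose_method(sigma_known=False, population_normal=True, n=12))
print(choose_method(sigma_known=True, population_normal=False, n=45))
print(choose_method(sigma_known=False, population_normal=False, n=20))
```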


Worked Example: Three-Minute Hourglass Timer

Constructing a 95% Confidence Interval

Suppose you repeatedly time a three-minute hourglass and want to construct a 95% confidence interval for its mean running time. The process involves:

Calculating the sample mean and standard deviation from repeated measurements.
Determining the appropriate t critical value for the sample size.
Computing the margin of error and constructing the interval.
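The three steps above can be sketched end to end as follows. The ten timings are hypothetical values invented for illustration (the original measurements are not reproduced in these notes), and the critical value 2.262 is the standard table entry for a 95% interval with 9 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical timings in seconds (illustrative only, not the source's data).
times = [178.2, 181.5, 179.9, 183.1, 180.4, 177.8, 182.0, 180.9, 179.3, 181.7]

# Step 1: sample mean and sample standard deviation (n - 1 denominator).
n = len(times)
xbar = mean(times)
s = stdev(times)

# Step 2: two-sided 95% t critical value for df = n - 1 = 9, from a t table.
T_CRIT_95_DF9 = 2.262

# Step 3: margin of error and interval.
margin = T_CRIT_95_DF9 * s / sqrt(n)
lower, upper = xbar - margin, xbar + margin

print(f"mean = {xbar:.2f} s, s = {s:.2f} s, E = {margin:.2f} s")
print(f"95% CI: ({lower:.2f}, {upper:.2f}) seconds")
```

If the sample size changes, the hard-coded critical value must be replaced with the table entry for the new degrees of freedom.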


Summary Table: Descriptive Statistics Example

Descriptive statistics such as mean, standard deviation, skewness, and kurtosis are often calculated as part of the estimation process. These values help assess the distribution and suitability of the data for confidence interval estimation.
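One way these summary values might be computed is shown below. The sketch uses simple moment-based definitions of skewness and excess kurtosis, which is an assumption: statistical software often reports bias-adjusted versions that differ slightly for small samples. The function name `describe` and the data are hypothetical.

```python
from statistics import mean, stdev


def describe(data):
    """Mean, sample SD, and moment-based skewness and excess kurtosis."""
    n = len(data)
    xbar = mean(data)
    # Central moments with an n denominator (not bias-adjusted).
    m2 = sum((x - xbar) ** 2 for x in data) / n
    m3 = sum((x - xbar) ** 3 for x in data) / n
    m4 = sum((x - xbar) ** 4 for x in data) / n
    return {
        "n": n,
        "mean": xbar,
        "sd": stdev(data),  # sample SD, n - 1 denominator
        "skewness": m3 / m2 ** 1.5,  # 0 for symmetric data
        "excess_kurtosis": m4 / m2 ** 2 - 3,  # 0 for a normal distribution
    }


# Hypothetical measurements, for illustration only.
summary = describe([178.2, 181.5, 179.9, 183.1, 180.4, 177.8, 182.0])
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Skewness and excess kurtosis near 0 are consistent with approximate normality, though for small samples these estimates are noisy and are best read alongside a histogram.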


Visualizing Data: Histogram

Histograms are useful for visualizing the distribution of sample data, which is important for checking the normality assumption required for t-intervals.


Key Takeaways

Use the z-interval when σ is known and the population is normal or n is large.
Use the t-interval when σ is unknown and the population is normal or n is large.
Check normality before applying t-intervals, especially for small samples.
Sample size calculations are essential for achieving desired precision in estimates.