Confidence Intervals for the Mean (Sigma Unknown): Study Notes

Confidence Intervals

Introduction to Confidence Intervals

Confidence intervals are a fundamental concept in inferential statistics, providing a range of values within which a population parameter is likely to lie. They are used to estimate population means, proportions, variances, and standard deviations based on sample data.

Confidence Level: The probability that the interval estimate contains the true population parameter, assuming the estimation process is repeated many times (commonly 90%, 95%, or 99%).
Margin of Error: The greatest likely distance between a sample estimate and the true population parameter at a given confidence level.

Confidence Intervals for the Mean (Sigma Unknown)

The t-Distribution

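When the population standard deviation (σ) is unknown, the sample standard deviation (s) is used to estimate it. In this case, if the population is normally distributed, the standardized sample mean \( t = (\overline{x} - \mu) / (s / \sqrt{n}) \) follows a t-distribution with n - 1 degrees of freedom rather than the standard normal distribution.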

t-distribution: A probability distribution used to estimate the mean of a normally distributed population when the population standard deviation is unknown. It matters most when the sample size is small.
Critical values of t: Denoted as tc, these values are used to construct confidence intervals.

Properties of the t-Distribution

The mean, median, and mode are all equal to 0.
It is bell-shaped and symmetric about the mean.
The total area under the curve is 1.
The t-distribution has thicker tails than the standard normal distribution, making it more prone to producing values that fall far from its mean.
The standard deviation varies with the sample size (through the degrees of freedom) and is greater than 1.
It is a family of curves, each determined by the degrees of freedom (d.f.), which is n - 1 for a sample of size n.
As the degrees of freedom increase, the t-distribution approaches the standard normal distribution. For d.f. ≥ 30, the two are very close.
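
The shrinking spread can be checked numerically. The sketch below relies on the standard closed form for the standard deviation of a t-distribution, sqrt(d.f. / (d.f. - 2)), which is not stated in these notes and holds only for d.f. > 2. The function name t_std_dev is an illustrative choice.

```python
import math


def t_std_dev(df: int) -> float:
    """Standard deviation of a t-distribution (finite only for df > 2)."""
    if df <= 2:
        raise ValueError("standard deviation is finite only for df > 2")
    return math.sqrt(df / (df - 2))


# The value stays above 1 and moves toward 1 as d.f. grows.
for df in (3, 5, 14, 30, 100):
    print(df, round(t_std_dev(df), 4))
```

Every printed value is above 1, and the values decrease toward 1, which matches the claim that the t-distribution approaches the standard normal distribution.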

Finding Critical Values of t

To find the critical value tc for a given confidence level and sample size:

Determine the degrees of freedom: d.f. = n - 1.
Use a t-distribution table or technology to find the value corresponding to the desired confidence level and degrees of freedom.

Example: For a 95% confidence interval with a sample size of 15, d.f. = 14. Use the t-table to find tc for c = 0.95 and d.f. = 14.
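
When no table is at hand, the critical value can be approximated with the standard library alone. The following is a rough sketch, not a reference implementation: it integrates the t density with Simpson's rule and finds tc by bisection. The helper names (t_pdf, central_area, t_critical) are made up for illustration. A statistics library would normally be used instead.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of the t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def central_area(t: float, df: int, steps: int = 1000) -> float:
    """Area under the density between -t and t (composite Simpson's rule)."""
    h = t / steps  # integrate over [0, t], then double by symmetry
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3


def t_critical(c: float, n: int) -> float:
    """Critical value tc for confidence level c and sample size n."""
    df = n - 1
    lo, hi = 0.0, 1.0
    while central_area(hi, df) < c:  # widen the bracket until it contains tc
        hi *= 2
    for _ in range(50):  # bisection on the central area
        mid = (lo + hi) / 2
        if central_area(mid, df) < c:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(t_critical(0.95, 15), 3))  # 2.145, matching the t-table entry for d.f. = 14
```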

Constructing a Confidence Interval for a Population Mean (Sigma Unknown)

When the population standard deviation is unknown, the confidence interval for the population mean is constructed as follows:

Sample mean: \( \overline{x} \)
Sample standard deviation: s
Sample size: n
Degrees of freedom: n - 1
Critical value from t-distribution: tc

The confidence interval is given by:

\( \overline{x} - E < \mu < \overline{x} + E \)

Margin of Error:

\( E = t_c \dfrac{s}{\sqrt{n}} \)
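
A minimal sketch of the computation, using the standard form E = tc · s / √n with endpoints x̄ - E and x̄ + E. The function name t_interval and the numbers in the usage line are hypothetical and do not come from these notes. The critical value is passed in, as if read from a t-table.

```python
import math


def t_interval(x_bar: float, s: float, n: int, tc: float):
    """Return (E, lower, upper) for a t-based confidence interval."""
    if n < 2:
        raise ValueError("need at least two observations")
    margin = tc * s / math.sqrt(n)  # margin of error E
    return margin, x_bar - margin, x_bar + margin


# Hypothetical data: x_bar = 50.0, s = 4.0, n = 16, tc = 2.131 (d.f. = 15, c = 0.95)
E, lower, upper = t_interval(50.0, 4.0, 16, 2.131)
print(E, lower, upper)
```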

Example 1: Constructing a 95% Confidence Interval

You randomly select 16 coffee shops and measure the temperature of the coffee sold at each. The sample mean temperature is \( \overline{x} \) with a sample standard deviation of s = 10.0. Construct a 95% confidence interval for the population mean temperature, assuming normality.

n = 16, s = 10.0, c = 0.95, d.f. = 15
Find tc from the t-table for d.f. = 15 and c = 0.95
Calculate the margin of error and endpoints using the formula above.
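
As a worked sketch of these steps: the standard t-table value for d.f. = 15 and c = 0.95 is tc ≈ 2.131 (a value not printed in these notes). Because the numerical sample mean is not given above, the endpoints are left in terms of \( \overline{x} \).

```latex
\[
E = t_c \frac{s}{\sqrt{n}} = 2.131 \cdot \frac{10.0}{\sqrt{16}} = 2.131 \cdot 2.5 \approx 5.3
\]
\[
\overline{x} - 5.3 < \mu < \overline{x} + 5.3
\]
```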

Example 2: Constructing a 99% Confidence Interval

You randomly select 36 cars of the same model and determine the number of days each car was on the dealership’s lot. The sample mean is 9.75 days, with a sample standard deviation of 2.39 days. Construct a 99% confidence interval for the population mean number of days.

n = 36, s = 2.39, c = 0.99, d.f. = 35
Find tc from the t-table for d.f. = 35 and c = 0.99
Calculate the margin of error and endpoints using the formula above.
Result: With 99% confidence, the interval between 8.66 and 10.84 days contains the population mean.
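
The arithmetic behind this result can be reproduced in a few lines. The sketch assumes the standard t-table value tc = 2.724 for d.f. = 35 and c = 0.99, which the notes leave to the reader to look up.

```python
import math

n, x_bar, s = 36, 9.75, 2.39
tc = 2.724  # assumed t-table value for c = 0.99, d.f. = 35

E = tc * s / math.sqrt(n)            # margin of error
lower, upper = x_bar - E, x_bar + E  # interval endpoints

print(round(E, 2))                       # 1.09
print(round(lower, 2), round(upper, 2))  # 8.66 10.84
```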

Choosing Between the Normal and t-Distribution

When constructing confidence intervals for the mean, the choice between the normal (z) and t-distribution depends on the following:

If the population standard deviation (σ) is known and the population is normally distributed (or n ≥ 30), use the standard normal (z) distribution.
If σ is unknown and the population is normally distributed (or n ≥ 30), use the t-distribution.

Example: For a sample of 25 houses with a known population standard deviation, use the standard normal distribution, even though n < 30, because the population is normally distributed and σ is known.

Summary Table: When to Use the Normal vs. t-Distribution

Condition	Distribution to Use
Population standard deviation known, population normal or n ≥ 30	Standard normal (z)
Population standard deviation unknown, population normal or n ≥ 30	t-distribution
Population not normal, n < 30, σ unknown	Neither (the z- and t-methods above do not apply)
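
The decision rule in the table can be sketched as a small function. The name choose_distribution and its return labels are illustrative. The table does not cover a non-normal population with n < 30 and σ known. Returning "neither" for that case is an assumption of this sketch.

```python
def choose_distribution(sigma_known: bool, population_normal: bool, n: int) -> str:
    """Return "z", "t", or "neither" following the summary table."""
    # Both methods need a normal population or a sample of at least 30.
    if not (population_normal or n >= 30):
        return "neither"  # assumption: applied whether or not sigma is known
    return "z" if sigma_known else "t"


print(choose_distribution(sigma_known=True, population_normal=True, n=25))    # z
print(choose_distribution(sigma_known=False, population_normal=True, n=16))   # t
print(choose_distribution(sigma_known=False, population_normal=False, n=20))  # neither
```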

Applications and Interpretation

Confidence intervals are widely used in scientific research, business, and quality control to estimate population parameters and assess the reliability of sample statistics.
Interpretation: A 95% confidence interval means that if samples of the same size are drawn repeatedly from the same population and an interval is constructed from each, approximately 95% of those intervals will contain the true population mean.
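
The repeated-sampling interpretation can be illustrated with a small simulation. Everything in this sketch is an assumption chosen for illustration: the normal population (mean 100, standard deviation 10), the sample size of 15, the number of trials, the seed, and the function name coverage. The critical value 2.145 is the t-table entry for d.f. = 14 and c = 0.95.

```python
import math
import random
import statistics


def coverage(trials: int = 2000, n: int = 15, mu: float = 100.0,
             sigma: float = 10.0, tc: float = 2.145, seed: int = 0) -> float:
    """Fraction of simulated t-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = statistics.mean(sample)
        s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
        margin = tc * s / math.sqrt(n)
        if x_bar - margin < mu < x_bar + margin:
            hits += 1
    return hits / trials


print(coverage())  # expected to land near 0.95
```

The printed fraction varies with the seed and the number of trials, but it should stay close to the stated confidence level.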

Additional info: The t-distribution is especially important for small sample sizes and when the population standard deviation is not available, which is common in practical applications.