Lecture 27: Confidence Intervals for Mean, Variance, and Standard Deviation (Student's t and Chi-Square Distributions)

Confidence Intervals for Population Mean

Introduction

Confidence intervals provide a range of plausible values for population parameters, such as the mean, based on sample data. When the population standard deviation (σ) is unknown and the sample size (n) is small, the Student's t-distribution is used instead of the normal distribution.

Estimating the Mean When σ is Unknown

  • Point Estimator: The sample standard deviation (s) is used to estimate the unknown population standard deviation (σ).

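  • Formula for Sample Standard Deviation: $s = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \overline{x})^2}{n - 1}}$

  • When σ is unknown and replaced by s, the standardized sample mean $t = (\overline{X} - \mu)/(s/\sqrt{n})$ does not follow the standard normal distribution. For a sample from a normally distributed population it follows the Student's t-distribution with n - 1 degrees of freedom, and the difference from the normal matters most when n < 30.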
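As a minimal sketch of the point estimator, the following computes s with the n - 1 divisor and compares it with the standard library's `statistics.stdev`. The data values and the helper name `sample_std` are hypothetical and do not come from the lecture.

```python
import math
import statistics


def sample_std(data):
    """Sample standard deviation: sqrt of summed squared deviations over n - 1."""
    n = len(data)
    mean = sum(data) / n
    squared_deviations = sum((x - mean) ** 2 for x in data)
    return math.sqrt(squared_deviations / (n - 1))


# Hypothetical measurements, chosen only for illustration.
sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6]
s = sample_std(sample)
print(round(s, 4))

# statistics.stdev also divides by n - 1, so the two results should agree.
print(math.isclose(s, statistics.stdev(sample)))
```

Dividing by n - 1 rather than n is what ties this estimator to the df = n - 1 used by the t and chi-square procedures.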

Student's t-Distribution

Definition and Properties

The Student's t-distribution is used when estimating the mean of a normally distributed population in situations where the sample size is small and the population standard deviation is unknown.

  • Degrees of Freedom (df): The t-distribution depends on the degrees of freedom, calculated as df = n - 1.

  • Symmetry: The t-distribution is centered at 0 and is symmetric about 0.

  • Area Under the Curve: The total area is 1; the area to the right of 0 equals the area to the left of 0, each being 0.5.

  • Tails: The t-distribution has fatter tails than the standard normal distribution, reflecting greater variability due to estimating σ with s.

  • Convergence: As n increases, the t-distribution approaches the standard normal distribution.
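The fatter tails and the convergence to the normal can be seen in a small simulation. This is only a sketch: the seed, the number of trials, and the helper names are arbitrary choices, and the samples are drawn from a standard normal population.

```python
import math
import random


def simulated_t(n, rng, mu=0.0, sigma=1.0):
    """Draw n normal values and return t = (xbar - mu) / (s / sqrt(n))."""
    data = [rng.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(data) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in data) / (n - 1))
    return (xbar - mu) / (s / math.sqrt(n))


def fraction_beyond(cutoff, n, trials=10_000, seed=27):
    """Fraction of simulated t-statistics with |t| > cutoff."""
    rng = random.Random(seed)
    hits = sum(abs(simulated_t(n, rng)) > cutoff for _ in range(trials))
    return hits / trials


# 1.96 leaves 5% in the two tails of the standard normal distribution.
small = fraction_beyond(1.96, n=5)   # df = 4
large = fraction_beyond(1.96, n=50)  # df = 49
print(small, large)
```

With n = 5 the statistic lands beyond ±1.96 noticeably more often than 5% of the time, while with n = 50 the fraction is much closer to 5%.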

Formula for t-Statistic

  • $t = \dfrac{\overline{x} - \mu}{s / \sqrt{n}}$, with df = n - 1.

Critical Values of t-Distribution

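  • Notation: $t_{\alpha}$ is the t-value with area α to its right under the t-distribution with df = n - 1. A (1-α)100% confidence interval uses $t_{\alpha/2}$, which leaves area α/2 in each tail.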

  • Critical values are found using t-tables for the desired confidence level and degrees of freedom.
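Tables are the method used in these notes. As an alternative sketch, a critical value can be approximated numerically by integrating the t density with Simpson's rule and bisecting on the upper-tail area. The function names, step count, and search range are assumptions of this sketch, which also assumes the critical value lies between 0 and 100.

```python
import math


def t_pdf(x, df):
    """Density of Student's t; lgamma keeps large df from overflowing."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(x, df, steps=1000):
    """P(T > x) for x >= 0: 0.5 minus Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(alpha, df):
    """Bisect for the t-value with area alpha to its right."""
    lo, hi = 0.0, 100.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > alpha:
            lo = mid  # too much area on the right: move right
        else:
            hi = mid
    return (lo + hi) / 2


print(round(t_critical(0.025, 19), 3))  # compare with the df = 19 row of a t-table
```

For large df the result approaches the standard normal value of about 1.96 for α = 0.025.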

Confidence Interval Formula (t-Distribution)

  • $\overline{x} \pm t_{\alpha/2} \cdot \dfrac{s}{\sqrt{n}}$, where $t_{\alpha/2}$ has df = n - 1.

  • This gives a (1-α)100% confidence interval for the population mean μ when σ is unknown and n < 30.

Example: Confidence Interval Calculation

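  • Sample: n = 20, with sample mean $\overline{x}$ and sample standard deviation s, so df = n - 1 = 19.

  • 95% CI: α/2 = 0.025 and $t_{0.025} = 2.093$, so the interval is $\overline{x} \pm 2.093 \cdot s/\sqrt{20}$.

  • 98% CI: α/2 = 0.01 and $t_{0.01} = 2.539$, so the interval is $\overline{x} \pm 2.539 \cdot s/\sqrt{20}$.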
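A short sketch of the calculation for n = 20 follows. The sample mean and standard deviation used here are hypothetical placeholders, not the lecture's values. The critical values 2.093 and 2.539 are the standard t-table entries for df = 19 at α/2 = 0.025 and α/2 = 0.01.

```python
import math

# Upper-tail critical values t_{alpha/2} for df = 19, read from a standard t-table.
T_CRITICAL_DF19 = {0.95: 2.093, 0.98: 2.539}


def t_interval(xbar, s, n, t_crit):
    """Return (lower, upper) for xbar +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# Hypothetical summary statistics; only n = 20 is taken from the example.
n, xbar, s = 20, 50.0, 4.0
for level, t_crit in T_CRITICAL_DF19.items():
    lower, upper = t_interval(xbar, s, n, t_crit)
    print(f"{level:.0%} CI: ({lower:.3f}, {upper:.3f})")
```

The 98% interval is wider than the 95% interval because a higher confidence level requires a larger critical value.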

Summary Table: Confidence Intervals for Mean

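| Sample Size | σ Known | σ Unknown |
| --- | --- | --- |
| Large (n ≥ 30) | $\overline{x} \pm z_{\alpha/2} \cdot \sigma/\sqrt{n}$ | $\overline{x} \pm t_{\alpha/2} \cdot s/\sqrt{n}$ (t is close to z for large n) |
| Small (n < 30) | $\overline{x} \pm z_{\alpha/2} \cdot \sigma/\sqrt{n}$ (normal population) | $\overline{x} \pm t_{\alpha/2} \cdot s/\sqrt{n}$ (normal population) |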

Margin of Error

Definition

  • The Margin of Error (ME) quantifies the degree of precision of a confidence interval.

  • Formula (when σ is known): $E = z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$, where E denotes the margin of error.
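As an illustration of how the margin of error responds to the sample size, the sketch below evaluates $z_{\alpha/2} \cdot \sigma/\sqrt{n}$ using `statistics.NormalDist` for the critical value. The value σ = 12 and the function name are hypothetical.

```python
import math
from statistics import NormalDist


def margin_of_error(sigma, n, confidence=0.95):
    """E = z_{alpha/2} * sigma / sqrt(n); applies when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # area alpha/2 to the right of z
    return z * sigma / math.sqrt(n)


# Hypothetical sigma. Quadrupling n halves the margin of error.
e_25 = margin_of_error(sigma=12.0, n=25)
e_100 = margin_of_error(sigma=12.0, n=100)
print(round(e_25, 3), round(e_100, 3))
```

Because n enters through a square root, halving the margin of error requires four times as many observations.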

Sample Size Determination

Introduction

Determining the minimum sample size required to achieve a desired margin of error is crucial in designing experiments.

  • Depends on population standard deviation (σ), desired margin of error (E), and confidence level (1-α).

  • Formula: $n = \left( \dfrac{z_{\alpha/2} \cdot \sigma}{E} \right)^2$

  • Round up to the next integer.

Example

  • Given: , ,
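A sketch of the sample size calculation with hypothetical inputs (σ = 15, E = 2, 95% confidence; none of these come from the lecture) shows why the result is rounded up rather than to the nearest integer.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, E, confidence=0.95):
    """Smallest integer n with z_{alpha/2} * sigma / sqrt(n) <= E."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((z * sigma / E) ** 2)


# Hypothetical design values.
n_needed = required_sample_size(sigma=15.0, E=2.0, confidence=0.95)
print(n_needed)
```

Rounding down would leave the margin of error slightly above the target E, so the next integer up is the smallest sample size that meets the requirement.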

Estimating Standard Deviation

Point Estimator

  • When σ is unknown, estimate it using the sample standard deviation (s): $s = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \overline{x})^2}{n - 1}}$

Chi-Square Distribution

Definition

  • If a simple random sample of size n is obtained from a normally distributed population with mean μ and standard deviation σ, then the statistic $\chi^2 = \dfrac{(n-1)s^2}{\sigma^2}$ follows a chi-square distribution.

  • This chi-square distribution has n - 1 degrees of freedom.
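A simulation can make this concrete. The sketch below repeatedly draws normal samples and computes (n - 1)s²/σ². It relies on the standard fact that a chi-square distribution's mean equals its degrees of freedom, so the average should be close to n - 1. The values n = 10 and σ = 3, the seed, and the function name are arbitrary choices.

```python
import random


def chi_square_statistic(n, sigma, rng, mu=0.0):
    """Compute (n - 1) * s^2 / sigma^2 for one simulated normal sample."""
    data = [rng.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(data) / n
    s_squared = sum((x - xbar) ** 2 for x in data) / (n - 1)
    return (n - 1) * s_squared / sigma ** 2


rng = random.Random(27)
n, sigma = 10, 3.0  # hypothetical settings, df = 9
values = [chi_square_statistic(n, sigma, rng) for _ in range(20_000)]
mean_value = sum(values) / len(values)
print(round(mean_value, 2))  # should be close to df = n - 1
print(min(values) >= 0)      # the statistic is never negative
```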

Characteristics

  • Not symmetric.

  • Shape depends on degrees of freedom; becomes more symmetric as degrees of freedom increase.

  • Values are nonnegative (≥ 0).

Critical Values

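  • The critical value $\chi^2_{\alpha}$ is the value with area α to its right under the chi-square curve with n - 1 degrees of freedom. Because the curve is not symmetric, a confidence interval needs two separate critical values: $\chi^2_{1-\alpha/2}$ on the left and $\chi^2_{\alpha/2}$ on the right.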

  • Found using chi-square tables for the desired confidence level and degrees of freedom.
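Tables remain the method used in these notes. As a sketch of how such values can be checked, the code below uses a closed form for the upper-tail area that holds only when the degrees of freedom are even, then bisects to find the critical value. The function names and search range are assumptions of this sketch, and odd df are deliberately rejected.

```python
import math


def chi2_upper_tail(x, df):
    """P(X > x) for chi-square with even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!."""
    if df < 2 or df % 2 != 0:
        raise ValueError("this closed form only covers even df >= 2")
    half = x / 2
    term, total = 1.0, 1.0  # j = 0 term
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def chi2_critical(alpha, df):
    """Bisect for the value with area alpha to its right."""
    lo, hi = 0.0, 10.0 * df + 100.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_upper_tail(mid, df) > alpha:
            lo = mid  # too much area on the right: move right
        else:
            hi = mid
    return (lo + hi) / 2


# Two critical values bounding the middle 95% for 18 degrees of freedom.
left = chi2_critical(0.975, 18)
right = chi2_critical(0.025, 18)
print(round(left, 3), round(right, 3))
```

The two values are not mirror images of each other around the center, which reflects the asymmetry of the chi-square curve.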

Confidence Intervals for Variance and Standard Deviation

Formula for Variance

  • For a (1-α)100% confidence interval for σ²:

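| Bound | Formula |
| --- | --- |
| Lower | $\dfrac{(n-1)s^2}{\chi^2_{\alpha/2}}$ |
| Upper | $\dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$ |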

  • To find the confidence interval for σ, take the square root of the lower and upper bounds.

Example: Confidence Interval for Variance and Standard Deviation

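  • Sample: n = 12, with sample standard deviation s.

  • Degrees of freedom: df = n - 1 = 11

  • Critical values: $\chi^2_{1-\alpha/2}$ and $\chi^2_{\alpha/2}$, both with df = 11

  • Lower bound for variance: $\dfrac{11 s^2}{\chi^2_{\alpha/2}}$

  • Upper bound for variance: $\dfrac{11 s^2}{\chi^2_{1-\alpha/2}}$

  • Lower bound for standard deviation: $\sqrt{\dfrac{11 s^2}{\chi^2_{\alpha/2}}}$

  • Upper bound for standard deviation: $\sqrt{\dfrac{11 s^2}{\chi^2_{1-\alpha/2}}}$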
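To make the steps concrete, here is a sketch for n = 12 at the 95% level. The sample standard deviation s = 2.5 is hypothetical. The critical values 21.920 and 3.816 are the standard chi-square table entries for df = 11 with areas 0.025 and 0.975 to the right.

```python
import math


def variance_interval(n, s, chi2_right, chi2_left):
    """CI for sigma^2: ((n-1)s^2 / chi2_right, (n-1)s^2 / chi2_left).

    chi2_right has area alpha/2 to its right (the larger critical value);
    chi2_left has area 1 - alpha/2 to its right (the smaller one).
    """
    scaled = (n - 1) * s ** 2
    return scaled / chi2_right, scaled / chi2_left


# Hypothetical s; table critical values for df = 11, 95% confidence.
n, s = 12, 2.5
var_low, var_high = variance_interval(n, s, chi2_right=21.920, chi2_left=3.816)
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)
print(f"variance: ({var_low:.3f}, {var_high:.3f})")
print(f"std dev:  ({sd_low:.3f}, {sd_high:.3f})")
```

The larger critical value produces the lower bound because it sits in the denominator, and the resulting interval is not centered on s².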

Assumptions and Cautions

  • Confidence intervals for variance and standard deviation are not of the form "point estimate ± margin of error" due to the non-symmetric sampling distribution.

  • Methods require data from a normal distribution. If normality is not satisfied, do not use these methods.

Normality Condition and Robustness

Checking Normality

  • For n < 30: Use normal probability plots and boxplots to check for normality and outliers.

  • If data are approximately normal and have no outliers, use the t-distribution for confidence intervals.

Robustness

  • For n < 15: Use t-distribution if data are symmetric and have no outliers.

  • For 15 ≤ n < 30: Use t-distribution if data do not have extreme skewness and no outliers.

  • For n ≥ 30: Use t-distribution even for skewed distributions, relying on the Central Limit Theorem and Law of Large Numbers.
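The robustness guidelines can be written as a small decision rule. This is only a sketch of the three cases listed above. The function and argument names are hypothetical, and judging symmetry, skewness, and outliers from plots is left to the analyst.

```python
def t_interval_is_reasonable(n, has_outliers, symmetric, extremely_skewed):
    """Apply the sample-size guidelines for using a t-interval."""
    if n >= 30:
        # Large samples: usable even for skewed data.
        return True
    if has_outliers:
        # Both small-sample cases require no outliers.
        return False
    if n < 15:
        return symmetric
    # 15 <= n < 30: acceptable unless the skewness is extreme.
    return not extremely_skewed


print(t_interval_is_reasonable(10, has_outliers=False, symmetric=True, extremely_skewed=False))
print(t_interval_is_reasonable(20, has_outliers=True, symmetric=True, extremely_skewed=False))
```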

Trivia: Origin of Student's t-Distribution

The Student's t-distribution was discovered by William Sealy Gosset, who published under the pen name "Student" while working at the Guinness brewery. The letter t for the statistic was popularized later by R. A. Fisher.

Summary Table: Critical Values of Chi-Square Distribution

| Degrees of Freedom | Area to the Right (0.05) | Area to the Right (0.025) | Area to the Right (0.01) |
| --- | --- | --- | --- |
| 11 | 19.675 | 21.920 | 24.725 |
| 13 | 22.362 | 24.736 | 27.688 |
| 18 | 28.869 | 31.526 | 34.805 |

Additional info: Table values inferred from provided images and context.

Example: Finding Chi-Square Critical Values

  • For 18 degrees of freedom, the values that separate the middle 95% of the distribution from the remaining 2.5% in each tail are $\chi^2_{0.975} = 8.231$ (left) and $\chi^2_{0.025} = 31.526$ (right).

Options for Using t-Distribution

  • Option 1 (Preferred): For n < 30, check normality and outliers using plots. If data are approximately normal and have no outliers, use t-distribution.

  • Option 2 (Robustness): For n < 15, use t-distribution if data are symmetric and have no outliers. For 15 ≤ n < 30, use t-distribution if data do not have extreme skewness and no outliers. For n ≥ 30, use t-distribution even for skewed distributions due to the Central Limit Theorem.

Key Formulas Summary

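  • Sample Standard Deviation: $s = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \overline{x})^2}{n - 1}}$

  • t-Statistic: $t = \dfrac{\overline{x} - \mu}{s / \sqrt{n}}$, with df = n - 1

  • Confidence Interval for Mean (σ unknown, n < 30): $\overline{x} \pm t_{\alpha/2} \cdot \dfrac{s}{\sqrt{n}}$

  • Margin of Error (σ known): $E = z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$

  • Sample Size: $n = \left( \dfrac{z_{\alpha/2} \cdot \sigma}{E} \right)^2$, rounded up to the next integer

  • Chi-Square Statistic: $\chi^2 = \dfrac{(n-1)s^2}{\sigma^2}$, with df = n - 1

  • Confidence Interval for Variance: Lower: $\dfrac{(n-1)s^2}{\chi^2_{\alpha/2}}$, Upper: $\dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$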
