Lecture 27: Confidence Intervals for Mean, Variance, and Standard Deviation (Student's t and Chi-Square Distributions)
Confidence Intervals for Population Mean
Introduction
Confidence intervals provide a range of plausible values for population parameters, such as the mean, based on sample data. When the population standard deviation (σ) is unknown and the sample size (n) is small, the Student's t-distribution is used instead of the normal distribution.
Estimating the Mean When σ is Unknown
Point Estimator: The sample standard deviation (s) is used to estimate the unknown population standard deviation (σ).
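Formula for Sample Standard Deviation: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \overline{x})^2}{n - 1}}$
When σ is unknown and the population is normally distributed, the standardized statistic $t = \frac{\overline{x} - \mu}{s / \sqrt{n}}$ does not follow the standard normal distribution; it follows the Student's t-distribution with n - 1 degrees of freedom. The difference matters most when n < 30.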
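As a minimal sketch, the sample standard deviation s (with the n - 1 divisor) can be computed as follows. The function name `sample_std` and the data values are hypothetical, chosen only for illustration.

```python
import math
import statistics


def sample_std(values):
    """Sample standard deviation using the n - 1 divisor."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


# Hypothetical measurements, not taken from the lecture.
data = [4.1, 5.0, 4.7, 5.3, 4.9]
s = sample_std(data)

# The standard library's statistics.stdev uses the same n - 1 divisor.
print(s, statistics.stdev(data))
```

Dividing by n - 1 rather than n is what ties s to the n - 1 degrees of freedom used by the t-distribution.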
Student's t-Distribution
Definition and Properties
The Student's t-distribution is used when estimating the mean of a normally distributed population in situations where the sample size is small and the population standard deviation is unknown.
Degrees of Freedom (df): The t-distribution depends on the degrees of freedom, calculated as df = n - 1.
Symmetry: The t-distribution is centered at 0 and is symmetric about 0.
Area Under the Curve: The total area is 1; the area to the right of 0 equals the area to the left of 0, each being 0.5.
Tails: The t-distribution has fatter tails than the standard normal distribution, reflecting greater variability due to estimating σ with s.
Convergence: As n increases, the t-distribution approaches the standard normal distribution.
Formula for t-Statistic
$t = \frac{\overline{x} - \mu}{s / \sqrt{n}}$, with df = n - 1
Critical Values of t-Distribution
Notation: $t_{\alpha}$ is the t-value with area α to its right under the t-distribution with df = n - 1.
Critical values are found using t-tables for the desired confidence level and degrees of freedom.
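Confidence Interval Formula (t-Distribution)
$\overline{x} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
This gives a (1-α)100% confidence interval for the population mean μ when σ is unknown and n < 30, where $t_{\alpha/2}$ is the critical value with df = n - 1.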
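A short sketch of the t-interval, assuming the usual form x̄ ± t·s/√n with the critical value read from a t-table. The function name `t_interval` and the sample summary (x̄ = 50.0, s = 4.0) are hypothetical; only the critical value 2.093 (right-tail area 0.025, df = 19) comes from a standard t-table.

```python
import math


def t_interval(xbar, s, n, t_crit):
    """Return (lower, upper) for xbar +/- t_crit * s / sqrt(n).

    t_crit must be looked up for df = n - 1 and the chosen confidence level.
    """
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# Hypothetical summary statistics; t_crit = 2.093 is the table value
# for a 95% interval with df = 19.
low, high = t_interval(xbar=50.0, s=4.0, n=20, t_crit=2.093)
print(round(low, 3), round(high, 3))
```

Passing the critical value in as an argument mirrors the table-lookup workflow in these notes; a statistics library could supply it instead.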
Example: Confidence Interval Calculation
Sample: n = 20, ,
95% CI:
98% CI:
Summary Table: Confidence Intervals for Mean
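| Sample Size | σ Known | σ Unknown |
|---|---|---|
| Large (n ≥ 30) | $\overline{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$ | $\overline{x} \pm z_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$ (using $t_{\alpha/2}$ gives nearly the same interval) |
| Small (n < 30) | $\overline{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$ (normal population) | $\overline{x} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$, df = n - 1 (normal population) |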
Margin of Error
Definition
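The Margin of Error (ME) is the half-width of a confidence interval: the largest distance between the point estimate and the true parameter that is expected at the stated confidence level. A smaller ME means a more precise estimate.
Formula (when σ is known): $ME = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$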
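The following sketch computes the margin of error, assuming the usual known-σ form ME = z·σ/√n. The function name `margin_of_error` and the inputs (σ = 10, n = 100) are hypothetical; the z critical value comes from Python's standard-library `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma, n, confidence=0.95):
    """Margin of error for a mean when sigma is known."""
    alpha = 1 - confidence
    # z critical value with area alpha / 2 to its right
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sigma / sqrt(n)


# Hypothetical inputs: quadrupling n halves the margin of error.
me_100 = margin_of_error(sigma=10.0, n=100)
me_400 = margin_of_error(sigma=10.0, n=400)
print(round(me_100, 3), round(me_400, 3))
```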
Sample Size Determination
Introduction
Determining the minimum sample size required to achieve a desired margin of error is crucial in designing experiments.
Depends on population standard deviation (σ), desired margin of error (E), and confidence level (1-α).
Formula: $n = \left( \frac{z_{\alpha/2} \cdot \sigma}{E} \right)^2$
Round up to the next integer.
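A minimal sketch of the sample-size calculation, assuming the standard formula n = (z·σ/E)² followed by rounding up. The function name `required_sample_size` and the inputs (σ = 15, E = 3) are hypothetical and are not the values from the lecture's example.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n whose margin of error is at most `margin` (sigma known)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Always round up: rounding down would give a margin larger than requested.
    return ceil((z * sigma / margin) ** 2)


# Hypothetical planning values.
n_95 = required_sample_size(sigma=15.0, margin=3.0, confidence=0.95)
n_99 = required_sample_size(sigma=15.0, margin=3.0, confidence=0.99)
print(n_95, n_99)
```

Raising the confidence level with the same margin requires a larger sample, because the z critical value grows.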
Example
Given: , ,
Estimating Standard Deviation
Point Estimator
When σ is unknown, estimate it using the sample standard deviation (s): $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \overline{x})^2}{n - 1}}$
Chi-Square Distribution
Definition
If a simple random sample of size n is obtained from a normally distributed population with mean μ and standard deviation σ, then:
$\chi^2 = \frac{(n - 1) s^2}{\sigma^2}$
This statistic follows a chi-square distribution with n - 1 degrees of freedom.
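To make the definition concrete, here is a small simulation sketch. It assumes the statistic (n - 1)s²/σ² and checks two known properties of a chi-square distribution: values are nonnegative and the mean equals the degrees of freedom. The function name and the population parameters (μ = 50, σ = 4, n = 10) are hypothetical.

```python
import random
import statistics


def chi_square_statistics(n, mu, sigma, replications, seed=0):
    """Simulate (n - 1) * s^2 / sigma^2 for repeated normal samples."""
    rng = random.Random(seed)
    df = n - 1
    values = []
    for _ in range(replications):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        s_squared = statistics.variance(sample)  # uses the n - 1 divisor
        values.append(df * s_squared / sigma ** 2)
    return values


stats = chi_square_statistics(n=10, mu=50.0, sigma=4.0, replications=5000)
mean_stat = statistics.fmean(stats)  # should be close to df = 9
min_stat = min(stats)                # never negative
print(round(mean_stat, 2), round(min_stat, 2))
```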
Characteristics
Not symmetric.
Shape depends on degrees of freedom; becomes more symmetric as degrees of freedom increase.
Values are nonnegative (≥ 0).
Critical Values
Critical value $\chi^2_{\alpha}$ is the value with area α to its right under the chi-square curve.
Found using chi-square tables for the desired confidence level and degrees of freedom.
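When a table is not at hand, critical values can be approximated. The sketch below uses the Wilson-Hilferty approximation, which is a substitute technique and not the table lookup these notes rely on. The function name `chi2_critical_approx` is hypothetical, and the result is only approximate, especially for small degrees of freedom.

```python
from math import sqrt
from statistics import NormalDist


def chi2_critical_approx(right_tail_area, df):
    """Approximate chi-square critical value via Wilson-Hilferty.

    right_tail_area is the area to the right of the returned value.
    """
    z = NormalDist().inv_cdf(1 - right_tail_area)
    c = 2 / (9 * df)
    return df * (1 - c + z * sqrt(c)) ** 3


# Compare with standard table entries for df = 18: 31.526 and 8.231.
upper = chi2_critical_approx(0.025, 18)
lower = chi2_critical_approx(0.975, 18)
print(round(upper, 2), round(lower, 2))
```

For exact work, the tabulated values or a statistics library remain the better choice.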
Confidence Intervals for Variance and Standard Deviation
Formula for Variance
For a (1-α)100% confidence interval for σ²:
| Bound | Formula |
|---|---|
| Lower | $\frac{(n - 1) s^2}{\chi^2_{\alpha/2}}$ |
| Upper | $\frac{(n - 1) s^2}{\chi^2_{1 - \alpha/2}}$ |
To find the confidence interval for σ, take the square root of the lower and upper bounds.
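A sketch of the variance and standard-deviation intervals, assuming the usual bounds (n - 1)s²/χ²_{α/2} and (n - 1)s²/χ²_{1-α/2}. The function name `variance_interval` and the sample summary (n = 19, s = 2.0) are hypothetical; the critical values 31.526 and 8.231 are standard table entries for df = 18 at 95% confidence.

```python
import math


def variance_interval(s, n, chi2_right, chi2_left):
    """Confidence interval for the population variance.

    chi2_right: critical value with area alpha / 2 to its right (the larger one)
    chi2_left:  critical value with area 1 - alpha / 2 to its right (the smaller one)
    """
    df = n - 1
    lower = df * s ** 2 / chi2_right  # dividing by the larger value gives the lower bound
    upper = df * s ** 2 / chi2_left
    return lower, upper


# Hypothetical sample: n = 19 (df = 18), s = 2.0, 95% confidence.
var_low, var_high = variance_interval(s=2.0, n=19, chi2_right=31.526, chi2_left=8.231)
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)
print(round(var_low, 3), round(var_high, 3))
print(round(sd_low, 3), round(sd_high, 3))
```

The interval is not symmetric about s² = 4.0, which is why it cannot be written as "point estimate ± margin of error".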
Example: Confidence Interval for Variance and Standard Deviation
Sample: n = 12,
Degrees of freedom: df = n - 1 = 12 - 1 = 11
Critical values: ,
Lower bound for variance:
Upper bound for variance:
Lower bound for standard deviation:
Upper bound for standard deviation:
Assumptions and Cautions
Confidence intervals for variance and standard deviation are not of the form "point estimate ± margin of error" due to the non-symmetric sampling distribution.
Methods require data from a normal distribution. If normality is not satisfied, do not use these methods.
Normality Condition and Robustness
Checking Normality
For n < 30: Use normal probability plots and boxplots to check for normality and outliers.
If data are approximately normal and have no outliers, use the t-distribution for confidence intervals.
Robustness
For n < 15: Use t-distribution if data are symmetric and have no outliers.
For 15 ≤ n < 30: Use t-distribution if data do not have extreme skewness and no outliers.
For n ≥ 30: Use t-distribution even for skewed distributions, relying on the Central Limit Theorem and Law of Large Numbers.
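The robustness guidelines above can be written as a small decision helper. This is only an illustrative encoding of the three rules; the function name and its boolean inputs are hypothetical, and judging symmetry, skewness, and outliers still requires looking at plots.

```python
def t_procedure_reasonable(n, symmetric, extreme_skew, has_outliers):
    """Apply the sample-size guidelines for using the t-distribution."""
    if n >= 30:
        # Large samples: the notes allow t even for skewed data.
        return True
    if has_outliers:
        # Both small-sample rules require no outliers.
        return False
    if n < 15:
        return symmetric
    # 15 <= n < 30
    return not extreme_skew


print(t_procedure_reasonable(n=12, symmetric=True, extreme_skew=False, has_outliers=False))
print(t_procedure_reasonable(n=20, symmetric=False, extreme_skew=True, has_outliers=False))
```

The notes do not mention outliers for n ≥ 30, so the helper does not check them in that branch.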
Trivia: Origin of Student's t-Distribution
The Student's t-distribution was discovered by William Sealy Gosset, who published under the pen name "Student" while working at the Guinness brewery. The letter t for the statistic was popularized later by R. A. Fisher.
Summary Table: Critical Values of Chi-Square Distribution
| Degrees of Freedom | Area to the Right (0.05) | Area to the Right (0.025) | Area to the Right (0.01) |
|---|---|---|---|
| 11 | 19.675 | 21.920 | 24.725 |
| 13 | 22.362 | 24.736 | 27.688 |
| 18 | 28.869 | 31.526 | 34.805 |
Additional info: Values are standard chi-square table entries, rounded to three decimals.
Example: Finding Chi-Square Critical Values
For 18 degrees of freedom, the values that separate the middle 95% of the distribution from the remaining 2.5% in each tail are: $\chi^2_{0.975} = 8.231$ (left) and $\chi^2_{0.025} = 31.526$ (right).
Options for Using t-Distribution
Option 1 (Preferred): For n < 30, check normality and outliers using plots. If data are approximately normal and have no outliers, use t-distribution.
Option 2 (Robustness): For n < 15, use t-distribution if data are symmetric and have no outliers. For 15 ≤ n < 30, use t-distribution if data do not have extreme skewness and no outliers. For n ≥ 30, use t-distribution even for skewed distributions due to the Central Limit Theorem.
Key Formulas Summary
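Sample Standard Deviation: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \overline{x})^2}{n - 1}}$
t-Statistic: $t = \frac{\overline{x} - \mu}{s / \sqrt{n}}$, df = n - 1
Confidence Interval for Mean (σ unknown, n < 30): $\overline{x} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
Margin of Error: $ME = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$ (σ known)
Sample Size: $n = \left( \frac{z_{\alpha/2} \cdot \sigma}{E} \right)^2$, rounded up to the next integer
Chi-Square Statistic: $\chi^2 = \frac{(n - 1) s^2}{\sigma^2}$, df = n - 1
Confidence Interval for Variance: Lower: $\frac{(n - 1) s^2}{\chi^2_{\alpha/2}}$, Upper: $\frac{(n - 1) s^2}{\chi^2_{1 - \alpha/2}}$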