Estimating Parameters and Determining Sample Sizes (Chapter 7.2 Study Notes)
Estimating Parameters and Determining Sample Sizes
Introduction
This section covers the statistical methods used to estimate population parameters, specifically the population mean, and how to determine the appropriate sample size for such estimations. The focus is on constructing and interpreting confidence intervals and understanding the use of the Student t distribution when the population standard deviation is unknown.
Estimating a Population Mean
Point Estimate and Margin of Error
Point Estimate: The best single-value estimate of a population parameter. For the mean, the sample mean (\( \bar{x} \)) is used as the point estimate of the population mean (\( \mu \)).
Margin of Error (E): The maximum likely difference between the sample mean and the true population mean.
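Formulas (given the lower and upper limits of a confidence interval):
Point estimate of mean: \( \bar{x} = \dfrac{\text{upper limit} + \text{lower limit}}{2} \)
Margin of error: \( E = \dfrac{\text{upper limit} - \text{lower limit}}{2} \)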
Example: If the 95% confidence interval for the mean age of STLCC students is (24.6, 25.0), then the point estimate is 24.8 and the margin of error is 0.2.
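The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch; the function name `interval_to_estimate` is illustrative and not part of the notes.

```python
def interval_to_estimate(lower, upper):
    """Recover the point estimate and margin of error from interval limits."""
    point_estimate = (upper + lower) / 2   # midpoint of the interval
    margin_of_error = (upper - lower) / 2  # half the interval width
    return point_estimate, margin_of_error

point, margin = interval_to_estimate(24.6, 25.0)
print(round(point, 1), round(margin, 1))  # 24.8 0.2
```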
Interpreting Confidence Intervals
A confidence interval provides a range of values within which the true population mean is likely to fall, with a specified level of confidence (e.g., 95%).
Interpretation: "We are 95% confident that the sample mean of 24.8 years is within 0.2 years of the true mean of the population."
Constructing Confidence Intervals for the Mean (\( \sigma \) Not Known)
Requirements
The sample must be a simple random sample.
Either the population is normally distributed, or the sample size n is greater than 30.
Notation
\( \mu \): population mean
\( \bar{x} \): sample mean
\( s \): sample standard deviation
\( n \): number of sample values
\( E \): margin of error
Confidence Interval Formula
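The confidence interval for the mean when \( \sigma \) is not known is: \( \bar{x} - E < \mu < \bar{x} + E \) or \( \bar{x} \pm E \), where \( E = t_{\alpha/2} \cdot \dfrac{s}{\sqrt{n}} \) and \( t_{\alpha/2} \) is the critical t value with \( n - 1 \) degrees of freedom.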
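A short sketch of how such an interval might be computed from summary statistics. It assumes the margin of error is \( E = t_{\alpha/2} \cdot s / \sqrt{n} \) and that the critical value is looked up in a t table or software and passed in by hand. The sample below (n = 25, mean 24.8, standard deviation 9.8) is hypothetical, and the function name is illustrative.

```python
import math

def t_confidence_interval(x_bar, s, n, t_crit):
    """Return (lower, upper, E) for the mean when sigma is unknown.

    t_crit is the critical value t_{alpha/2} for n - 1 degrees of freedom,
    taken from a t table or statistical software.
    """
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin, margin

# Hypothetical sample: n = 25, x_bar = 24.8, s = 9.8.
# For 95% confidence and 24 degrees of freedom, a standard t table gives 2.064.
lower, upper, margin = t_confidence_interval(24.8, 9.8, 25, 2.064)
print(f"E = {margin:.2f}, interval = ({lower:.2f}, {upper:.2f})")
# E = 4.05, interval = (20.75, 28.85)
```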
Key Concepts
Point Estimate: The sample mean is the best estimate of the population mean.
Confidence Interval: Use sample data to construct and interpret a confidence interval estimate of the true value of a population mean.
Sample Size: Find the sample size necessary to estimate a population mean with a specified margin of error and confidence level.
The Student t Distribution
Key Points
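The Student t distribution is used when the population standard deviation (\( \sigma \)) is unknown and is replaced by the sample standard deviation \( s \). Its difference from the normal distribution matters most when the sample size is small.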
It has a symmetric, bell-shaped curve like the standard normal distribution but with heavier tails (more variability).
The mean of the t distribution is 0.
The standard deviation of the t distribution is greater than 1 and depends on the sample size.
As the sample size increases, the t distribution approaches the standard normal (z) distribution.
Degrees of Freedom
The degrees of freedom for a sample is n - 1, where n is the sample size.
Degrees of freedom reflect the number of independent values that can vary in the calculation of a statistic.
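The claims that the t distribution has a standard deviation greater than 1 and approaches the z distribution as n grows can be illustrated numerically. This sketch relies on a standard result not stated in the notes: for more than 2 degrees of freedom, the standard deviation of the t distribution is \( \sqrt{df / (df - 2)} \). The function name is illustrative.

```python
import math

def t_std_dev(df):
    """Standard deviation of the Student t distribution (defined for df > 2)."""
    if df <= 2:
        raise ValueError("standard deviation is only finite for df > 2")
    return math.sqrt(df / (df - 2))

# Degrees of freedom are n - 1 for each sample size n.
for n in (5, 15, 30, 100, 1000):
    print(n, round(t_std_dev(n - 1), 4))
```

Each printed value is above 1 and shrinks toward 1, the standard deviation of the standard normal distribution, as n increases.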
Choosing the Correct Distribution
The choice of distribution depends on what is known about the population and the sample size.
| Conditions | Method |
|---|---|
| \( \sigma \) not known and normally distributed population, or \( \sigma \) not known and \( n > 30 \) | Use Student t distribution |
| \( \sigma \) known and normally distributed population, or \( \sigma \) known and \( n > 30 \) | Use normal (z) distribution |
| Population is not normally distributed and \( n \leq 30 \) | Use the bootstrapping method or a nonparametric method |
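
The decision rules in this table can be written as a small function. This is only a sketch of the table's logic; the function and parameter names are illustrative.

```python
def choose_method(sigma_known, population_normal, n):
    """Pick a method for estimating a mean, following the table's three rows."""
    # Neither normality nor a large sample: t and z are not justified.
    if not population_normal and n <= 30:
        return "bootstrap or nonparametric method"
    # Otherwise the choice depends only on whether sigma is known.
    if sigma_known:
        return "normal (z) distribution"
    return "Student t distribution"

print(choose_method(sigma_known=False, population_normal=True, n=12))
# Student t distribution
print(choose_method(sigma_known=True, population_normal=False, n=50))
# normal (z) distribution
print(choose_method(sigma_known=False, population_normal=False, n=20))
# bootstrap or nonparametric method
```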
Finding the Sample Size Required to Estimate a Population Mean
Requirement
The sample must be a simple random sample.
Methods for Dealing with Unknown \( \sigma \) (Standard Deviation)
Range Rule of Thumb: Estimate \( \sigma \) as range/4, where the range is the difference between the maximum and minimum sample values.
Start and Improve: Begin collecting data without knowing \( \sigma \), use the sample standard deviation from initial data, and refine as more data are collected.
Use Prior Results: Use the standard deviation from previous studies or related data. When in doubt, use a larger value for \( \sigma \) to ensure the sample size is not underestimated.
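As a quick illustration of the range rule of thumb, the sketch below estimates \( \sigma \) from a minimum and maximum. The extremes of 16 and 70 are hypothetical values chosen for illustration, not data from the notes.

```python
def range_rule_sigma(minimum, maximum):
    """Estimate sigma as range / 4 (range rule of thumb)."""
    return (maximum - minimum) / 4

# Hypothetical smallest and largest values: 16 and 70.
print(range_rule_sigma(16, 70))  # 13.5
```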
Sample Size Calculation Example
Suppose you want to estimate a population mean at a 95% confidence level with a margin of error of 3, and the range rule of thumb gives an estimated standard deviation of 13.5.
Using statistical software (e.g., StatCrunch), you can input these values to compute the required sample size.
Result: A sample size of 81 is needed for the given parameters.
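The notes do not show the calculation behind this result. The sketch below compares two approaches. The first is the common textbook formula \( n = (z_{\alpha/2} \, \sigma / E)^2 \), rounded up. The second is an assumption about how software might arrive at 81: it searches for the smallest n whose t-based margin of error \( t_{\alpha/2} \cdot s / \sqrt{n} \) is at most E. The numerical t routine and all function names are illustrative, not StatCrunch's actual implementation.

```python
import math
from statistics import NormalDist

def z_sample_size(sigma, margin, confidence=0.95):
    """Textbook formula n = (z * sigma / E)^2, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=400):
    """P(T <= x) for x >= 0: Simpson's rule on [0, x] plus symmetry about 0."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(df, confidence=0.95):
    """Two-sided critical value t_{alpha/2}, found by bisection on [0, 100]."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 100.0  # assumed bracket; wide enough for common levels
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_sample_size(s, margin, confidence=0.95):
    """Smallest n with t_{alpha/2, n-1} * s / sqrt(n) <= margin."""
    # The t critical value exceeds z, so the answer is at least the z-based n.
    n = max(z_sample_size(s, margin, confidence), 2)
    while t_critical(n - 1, confidence) * s / math.sqrt(n) > margin:
        n += 1
    return n

print(z_sample_size(13.5, 3))  # 78
print(t_sample_size(13.5, 3))  # 81
```

Under these assumptions the z-based formula gives 78, while the t-based search reproduces the 81 reported above. The larger value is the more cautious choice.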
Application Example: STLCC Student Ages
Data Summary
Mean age: 24.8 years
Standard deviation: 9.8 years
Sample size: 15,649 students
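As a rough check, the summary values above can be plugged into the margin of error formula. This sketch assumes the 15,649 ages are treated as a simple random sample and uses the z critical value as a stand-in for t, an approximation that is reasonable only because n is so large.

```python
import math
from statistics import NormalDist

x_bar, s, n = 24.8, 9.8, 15649

# With n this large, t_{alpha/2} is nearly identical to z_{alpha/2}.
z = NormalDist().inv_cdf(0.975)  # 95% confidence
margin = z * s / math.sqrt(n)

print(f"E = {margin:.3f}")                  # E = 0.154
print(f"{x_bar} ± {round(margin, 1)}")      # 24.8 ± 0.2
```

Rounded to one decimal place, the margin of error is 0.2, which agrees with the interval (24.6, 25.0) used earlier in these notes.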
Tabular Data: STLCC Student Age Distribution (Fall 2024)
| Age Group | Total | Men | Women |
|---|---|---|---|
| All Students | 15,649 | 5,068 | 10,581 |
| Under 18 | 2,613 | 931 | 1,682 |
| 18-19 | 3,614 | 1,447 | 2,169 |
| 20-21 | 2,273 | 867 | 1,406 |
| 22-24 | 1,096 | 450 | 646 |
| 25-29 | 1,495 | 540 | 1,419 |
| 30-34 | 1,065 | 405 | 660 |
| 35-39 | 754 | 170 | 584 |
| 40-64 | 849 | 216 | 617 |
| 65 and over | 67 | 27 | 40 |
| Age Unknown/unreported | 0 | 0 | 0 |
Summary Table: Confidence Interval Options
| Interval Format | Example |
|---|---|
| Parentheses | (24.6, 25.0) |
| Inequality | \( 24.6 < \mu < 25.0 \) |
| Point Estimate ± Margin of Error | 24.8 ± 0.2 |
Additional Info
Wider confidence intervals correspond to higher confidence levels (e.g., 99% interval is wider than 95%).
When comparing confidence intervals for means or proportions based on the same sample size, the interval with the higher confidence level or greater variability will be wider.
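The effect of the confidence level on interval width can be seen by holding the sample fixed and changing only the level. This sketch uses z critical values for simplicity and a hypothetical sample with s = 9.8 and n = 100. The function name is illustrative.

```python
import math
from statistics import NormalDist

def z_margin(s, n, confidence):
    """Margin of error z_{alpha/2} * s / sqrt(n) at the given confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * s / math.sqrt(n)

# Hypothetical sample: s = 9.8, n = 100. Width is twice the margin of error.
widths = {level: 2 * z_margin(9.8, 100, level) for level in (0.90, 0.95, 0.99)}
for level, width in widths.items():
    print(f"{level:.0%} interval width: {width:.2f}")
# 90% interval width: 3.22
# 95% interval width: 3.84
# 99% interval width: 5.05
```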