Estimating Parameters and Determining Sample Sizes (Chapter 7.2 Study Notes)

Introduction

This section covers the statistical methods used to estimate population parameters, specifically the population mean, and how to determine the appropriate sample size for such estimations. The focus is on constructing and interpreting confidence intervals and understanding the use of the Student t distribution when the population standard deviation is unknown.

Estimating a Population Mean

Point Estimate and Margin of Error

Point Estimate: The best single-value estimate of a population parameter. For the mean, the sample mean (\( \bar{x} \)) is used as the point estimate of the population mean (\( \mu \)).
Margin of Error (E): The maximum likely difference between the sample mean and the true population mean.

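Formulas (computed from the confidence interval limits):

Point estimate of mean: \( \bar{x} = \dfrac{(\text{upper limit}) + (\text{lower limit})}{2} \)
Margin of error: \( E = \dfrac{(\text{upper limit}) - (\text{lower limit})}{2} \)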

Example: If the 95% confidence interval for the mean age of STLCC students is (24.6, 25.0), then the point estimate is 24.8 and the margin of error is 0.2.
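The arithmetic in this example can be checked with a short Python sketch. The function name `interval_to_estimate` is hypothetical, and the sketch assumes the interval is symmetric about the sample mean.

```python
def interval_to_estimate(lower, upper):
    """Recover the point estimate and margin of error from interval limits."""
    point_estimate = (upper + lower) / 2   # midpoint of the interval
    margin_of_error = (upper - lower) / 2  # half the interval width
    return point_estimate, margin_of_error

x_bar, E = interval_to_estimate(24.6, 25.0)
print(round(x_bar, 1), round(E, 1))  # prints: 24.8 0.2
```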

Interpreting Confidence Intervals

A confidence interval provides a range of values within which the true population mean is likely to fall, with a specified level of confidence (e.g., 95%).
Interpretation: "We are 95% confident that the sample mean of 24.8 years is within 0.2 years of the true mean of the population."

Constructing Confidence Intervals for the Mean (\( \sigma \) Not Known)

Requirements

The sample must be a simple random sample.
Either the population is normally distributed, or the sample size n is greater than 30.

Notation

\( \mu \): population mean
\( \bar{x} \): sample mean
\( s \): sample standard deviation
\( n \): number of sample values
\( E \): margin of error

Confidence Interval Formula

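The confidence interval for the mean when \( \sigma \) is not known is: \( \bar{x} - E < \mu < \bar{x} + E \) or \( \bar{x} \pm E \), where \( E = t_{\alpha/2} \cdot \dfrac{s}{\sqrt{n}} \) and \( t_{\alpha/2} \) is the critical t value with \( n - 1 \) degrees of freedom.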
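As a minimal sketch of this calculation, the Python function below builds the interval from summary statistics. It assumes the usual t-based margin of error, \( E = t_{\alpha/2} \cdot s / \sqrt{n} \), with the critical value looked up in a t table for \( n - 1 \) degrees of freedom. The function name and the sample values are hypothetical and are not taken from these notes.

```python
import math

def t_confidence_interval(x_bar, s, n, t_crit):
    """Return (lower, upper, E) for a mean when sigma is not known.

    t_crit is the critical value t_{alpha/2} for n - 1 degrees of freedom,
    looked up in a t table or obtained from software.
    """
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin, margin

# Hypothetical sample: n = 25, so df = 24; the 95% table value is about 2.064.
lower, upper, E = t_confidence_interval(x_bar=50.0, s=10.0, n=25, t_crit=2.064)
print(f"E = {E:.3f}, interval = ({lower:.3f}, {upper:.3f})")
```

Passing the critical value in as an argument keeps the sketch free of third-party libraries; statistical software would normally compute it directly.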

Key Concepts

Point Estimate: The sample mean is the best estimate of the population mean.
Confidence Interval: Use sample data to construct and interpret a confidence interval estimate of the true value of a population mean.
Sample Size: Find the sample size necessary to estimate a population mean with a specified margin of error and confidence level.

The Student t Distribution

Key Points

The Student t distribution is used when the population standard deviation (\( \sigma \)) is unknown. Its difference from the standard normal distribution matters most when the sample size is small.
It has a symmetric, bell-shaped curve like the standard normal distribution but with heavier tails (more variability).
The mean of the t distribution is 0.
The standard deviation of the t distribution is greater than 1 and depends on the sample size.
As the sample size increases, the t distribution approaches the standard normal (z) distribution.

Degrees of Freedom

The degrees of freedom for a sample is n - 1, where n is the sample size.
Degrees of freedom reflect the number of independent values that can vary in the calculation of a statistic.
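To make the variability claim concrete: for more than 2 degrees of freedom, the standard deviation of the t distribution is \( \sqrt{df / (df - 2)} \). This is a standard result that is not stated in these notes. The sketch below (function name hypothetical) evaluates it for several sample sizes and shows the value staying above 1 while approaching 1 as n grows.

```python
import math

def t_std_dev(n):
    """Standard deviation of the t distribution for a sample of size n."""
    df = n - 1  # degrees of freedom
    if df <= 2:
        raise ValueError("the formula applies only for df > 2")
    return math.sqrt(df / (df - 2))

for n in (5, 10, 30, 100, 1000):
    print(n, round(t_std_dev(n), 4))
```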

Choosing the Correct Distribution

The choice of distribution depends on what is known about the population and the sample size.

Conditions	Method
\( \sigma \) not known and normally distributed population or \( \sigma \) not known and \( n > 30 \)	Use Student t distribution
\( \sigma \) known and normally distributed population or \( \sigma \) known and \( n > 30 \)	Use normal (z) distribution
Population is not normally distributed and \( n \leq 30 \)	Use the bootstrapping method or a nonparametric method
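The decision table can be sketched as a small Python function. The function name, argument names, and returned strings are hypothetical; the logic only restates the three rows above.

```python
def choose_method(sigma_known, population_normal, n):
    """Pick an estimation method following the table above."""
    if population_normal or n > 30:
        # Requirements are met; the distribution depends on whether sigma is known.
        if sigma_known:
            return "normal (z) distribution"
        return "Student t distribution"
    # Population not normally distributed and n <= 30.
    return "bootstrapping or a nonparametric method"

print(choose_method(sigma_known=False, population_normal=False, n=50))
print(choose_method(sigma_known=True, population_normal=True, n=12))
print(choose_method(sigma_known=False, population_normal=False, n=12))
```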

Finding the Sample Size Required to Estimate a Population Mean

Requirement

The sample must be a simple random sample.

Methods for Dealing with Unknown \( \sigma \) (Standard Deviation)

Range Rule of Thumb: Estimate \( \sigma \) as range/4, where the range is the difference between the maximum and minimum sample values.
Start and Improve: Begin collecting data without knowing \( \sigma \), use the sample standard deviation from initial data, and refine as more data are collected.
Use Prior Results: Use the standard deviation from previous studies or related data. When in doubt, use a larger value for \( \sigma \) to ensure the sample size is not underestimated.
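The range rule of thumb is simple enough to sketch in a few lines of Python. The function name and the sample values are hypothetical and were made up for illustration only.

```python
def range_rule_sigma(values):
    """Estimate sigma as (max - min) / 4 using the range rule of thumb."""
    return (max(values) - min(values)) / 4

# Hypothetical sample values, not data from these notes.
sample = [18, 22, 35, 47, 72]
print(range_rule_sigma(sample))  # (72 - 18) / 4 = 13.5
```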

Sample Size Calculation Example

Suppose you want a 95% confidence level and a margin of error of 3, with \( \sigma \) estimated as 13.5 by the range rule of thumb.
Using statistical software (e.g., StatCrunch), you can input these values to compute the required sample size.
Result: A sample size of 81 is needed for the given parameters.
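For comparison, the hand formula commonly taught for this problem is \( n = \left( z_{\alpha/2} \, \sigma / E \right)^2 \), rounded up to the next whole number. The sketch below applies it with Python's standard library; the function name is hypothetical. With the inputs above it gives 78, not 81. A larger result such as 81 would be consistent with software that solves for n using t critical values instead of z, but that is an assumption about the software, not something stated in these notes.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(confidence, sigma, margin):
    """Smallest whole n with n >= (z * sigma / E)^2 (z-based formula)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    return math.ceil((z * sigma / margin) ** 2)

n = sample_size_for_mean(confidence=0.95, sigma=13.5, margin=3)
print(n)  # prints: 78
```

Rounding up, not to the nearest integer, ensures the margin of error does not exceed the target.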

Application Example: STLCC Student Ages

Data Summary

Mean age: 24.8 years
Standard deviation: 9.8 years
Sample size: 15,649 students
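These summary statistics can be turned into a 95% interval with a short Python sketch. Because n is very large, the t critical value is almost identical to the z critical value, so the sketch uses the standard library's normal distribution as an approximation; that substitution and the function name are assumptions of this example. The computed margin of error is about 0.154, and the limits round to (24.6, 25.0).

```python
import math
from statistics import NormalDist

def large_sample_interval(x_bar, s, n, confidence=0.95):
    """Approximate confidence interval for a mean when n is large.

    Uses a z critical value in place of t, which is reasonable only
    for large n.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin, margin

lower, upper, E = large_sample_interval(24.8, 9.8, 15649)
print(f"E = {E:.3f}; interval = ({lower:.1f}, {upper:.1f})")
```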

Tabular Data: STLCC Student Age Distribution (Fall 2024)

Age Group	Total	Men	Women
All Students	15,649	5,068	10,581
Under 18	2,613	931	1,682
18-19	3,614	1,447	2,169
20-21	2,273	867	1,406
22-24	1,096	450	646
25-29	1,495	540	1,419
30-34	1,065	405	660
35-39	754	170	584
40-64	849	216	617
65 and over	67	27	40
Age Unknown/unreported	0	0	0

Summary Table: Confidence Interval Options

Interval Format	Example
Parentheses	(24.6, 25.0)
Inequality	\( 24.6 < \mu < 25.0 \)
Point Estimate ± Margin of Error	24.8 ± 0.2

Additional Info

For the same sample, a higher confidence level produces a wider interval (e.g., a 99% interval is wider than a 95% interval).
When comparing confidence intervals for means or for proportions with the same sample size, the interval with the higher confidence level or the greater variability will be wider.
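The effect of the confidence level on interval width can be illustrated with a brief Python sketch. It reuses the STLCC summary statistics (s = 9.8, n = 15,649) and, as an assumption suited to large samples, uses z critical values from the standard library in place of t. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def margin_of_error(confidence, s, n):
    """Large-sample margin of error using a z critical value (approximation)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * s / math.sqrt(n)

# Summary statistics from the STLCC example in these notes.
s, n = 9.8, 15649
for level in (0.90, 0.95, 0.99):
    E = margin_of_error(level, s, n)
    print(f"{level:.0%} level: interval width = {2 * E:.3f}")
```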