Estimating Population Proportions and Means: Confidence Intervals and Sample Size Determination

Estimating Population Proportions

Point Estimates and Proportions

When estimating a population proportion, we use the proportion observed in a sample as our best estimate. This section introduces the concepts and methods for estimating population proportions using sample data.

  • Population Proportion (p): The true proportion of individuals in a population with a certain characteristic (unknown).

  • Sample Proportion (\( \hat{p} \)): The proportion observed in the sample, calculated as \( \hat{p} = \frac{x}{n} \), where x is the number of successes and n is the sample size.

  • Point Estimate: A single value used to approximate a population parameter. For proportions, \( \hat{p} \) is the best point estimate of p.

  • Example: In a survey of 1007 adults, 85% knew what Twitter is. Here, \( \hat{p} = 0.85 \).

Confidence Intervals for Population Proportions

A confidence interval provides a range of values within which the population proportion is likely to lie, with a specified level of confidence.

  • Confidence Interval (CI): An interval estimate for a population parameter, expressed as \( \hat{p} \pm E \), where E is the margin of error.

  • Margin of Error (E): The maximum likely difference between the sample proportion and the true population proportion.

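  • Formula for CI: \( \hat{p} - E < p < \hat{p} + E \)

  • Margin of Error (E): \( E = z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}} \), where \( \hat{q} = 1 - \hat{p} \).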

  • Critical Value (\( z_{\alpha/2} \)): The z-score corresponding to the desired confidence level (e.g., 1.96 for 95% confidence).

  • Requirements for CI:

    • The sample is a simple random sample.

    • The binomial distribution conditions are satisfied (fixed number of independent trials, two outcomes, constant probability).

    • At least 5 successes and 5 failures in the sample.
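As a rough illustration, the interval \( \hat{p} \pm E \) with \( E = z_{\alpha/2} \sqrt{\hat{p}(1 - \hat{p})/n} \) can be sketched in Python. The function name `proportion_ci` is hypothetical, and the success count of 856 is an assumption obtained by rounding 85% of 1007.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Normal-approximation CI for a proportion (z defaults to the 95% value)."""
    # Requirement from the list above: at least 5 successes and 5 failures.
    if successes < 5 or n - successes < 5:
        raise ValueError("need at least 5 successes and 5 failures")
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Twitter survey: 85% of 1007 adults, assumed here to be 856 successes.
low, high = proportion_ci(856, 1007)
print(f"{low:.3f} < p < {high:.3f}")
```

The check on successes and failures mirrors the stated requirement; it does not verify that the sample was random.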

Finding Critical Values

  • For a confidence level C, \( \alpha = 1 - C \), and \( z_{\alpha/2} \) is found using the standard normal distribution.

  • Examples:

    • For 95% confidence, \( z_{\alpha/2} = 1.96 \).

    • For 99% confidence, \( z_{\alpha/2} = 2.575 \).
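One way to look up these critical values with software is sketched below, using `NormalDist` from Python's standard library. The helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z_{alpha/2} for a confidence level such as 0.95."""
    alpha = 1 - confidence
    # z_{alpha/2} leaves an area of alpha/2 in the upper tail.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))
```

Software returns roughly 2.576 for 99% confidence; the value 2.575 is the figure commonly read from a printed z table, and the difference rarely matters in practice.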

Interpreting Confidence Intervals

  • A 95% confidence interval means that if we were to take many samples and build a CI from each, about 95% of those intervals would contain the true population proportion.

  • Example: If the 95% CI for the proportion of adults who know Twitter is (0.83, 0.87), we are 95% confident that the true proportion lies within this interval.
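The "many samples" interpretation can be checked with a small simulation. The sketch below assumes a true proportion of 0.85 (a made-up value chosen to resemble the Twitter example) and counts how often the normal-approximation interval captures it. The function name `coverage_rate` and all parameter values are illustrative.

```python
import math
import random

def coverage_rate(p_true, n, reps, z=1.96, seed=0):
    """Fraction of simulated CIs that contain the true proportion."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Draw one sample of size n and count the successes.
        successes = sum(rng.random() < p_true for _ in range(n))
        p_hat = successes / n
        margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= p_true <= p_hat + margin:
            hits += 1
    return hits / reps

print(coverage_rate(p_true=0.85, n=1007, reps=500))
```

The printed rate should land near 0.95, though it varies with the seed and the number of repetitions.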

Calculating Point Estimate and Margin of Error from CI

  • Given CI limits \( L \) and \( U \): \( \hat{p} = \frac{U + L}{2} \) and \( E = \frac{U - L}{2} \).
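Because the interval is symmetric about the point estimate, the estimate is the midpoint of the limits and the margin of error is half the width. A minimal sketch, with the hypothetical helper `estimate_from_ci` applied to the interval (0.83, 0.87):

```python
def estimate_from_ci(lower, upper):
    """Recover the point estimate and margin of error from CI limits."""
    p_hat = (upper + lower) / 2   # midpoint of the interval
    margin = (upper - lower) / 2  # half the interval width
    return p_hat, margin

p_hat, margin = estimate_from_ci(0.83, 0.87)
print(round(p_hat, 3), round(margin, 3))
```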

Sample Size for Estimating Proportion

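  • To determine the required sample size for a desired margin of error \( E \) and confidence level: \( n = \frac{(z_{\alpha/2})^2 \, \hat{p}\hat{q}}{E^2} \), where \( \hat{q} = 1 - \hat{p} \). Always round up to the next whole number.

  • If there is no prior estimate for \( \hat{p} \), use \( \hat{p} = 0.5 \), which maximizes \( \hat{p}\hat{q} \) and gives \( n = \frac{(z_{\alpha/2})^2 \cdot 0.25}{E^2} \).

  • Example: To estimate the proportion of adults buying clothing online within 3 percentage points (E = 0.03) at 95% confidence, with \( \hat{p} = 0.66 \): \( n = \frac{(1.96)^2 (0.66)(0.34)}{(0.03)^2} \approx 957.8 \), so 958 adults are needed.

  • If no estimate for \( \hat{p} \): \( n = \frac{(1.96)^2 (0.25)}{(0.03)^2} \approx 1067.1 \), so 1068 adults are needed.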
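The sample-size calculation \( n = z_{\alpha/2}^2 \, \hat{p}(1 - \hat{p}) / E^2 \), rounded up, can be sketched as follows. The function name `sample_size_proportion` is hypothetical, and the default of 0.5 reflects the no-prior-estimate case.

```python
import math

def sample_size_proportion(margin, z=1.96, p_hat=0.5):
    """Smallest n giving the requested margin of error for a proportion."""
    n = z**2 * p_hat * (1 - p_hat) / margin**2
    # Always round up: rounding down would miss the target margin.
    return math.ceil(n)

print(sample_size_proportion(0.03, p_hat=0.66))  # prior estimate of 0.66
print(sample_size_proportion(0.03))              # no prior estimate
```

Using 0.5 when nothing is known is the conservative choice, since it yields the largest required sample.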

Considerations When Analyzing Polls

  • Sample should be random, not voluntary response.

  • Confidence level and sample size should be reported.

  • Population size is usually not a factor in reliability; sample size and method are more important.

Estimating Population Means

Point Estimates and Means

We use the sample mean to estimate the population mean. The methods differ depending on whether the population standard deviation is known.

  • Population Mean (\( \mu \)): The true mean of the population.

  • Sample Mean (\( \bar{x} \)): The mean of the sample data.

  • Sample Standard Deviation (s): The standard deviation of the sample.

  • Population Standard Deviation (\( \sigma \)): The standard deviation of the population (may be known or unknown).

Confidence Interval for Mean (\( \sigma \) Unknown)

When the population standard deviation is unknown, use the t-distribution.

  • Formula: \( \bar{x} - E < \mu < \bar{x} + E \), where \( E = t_{\alpha/2} \frac{s}{\sqrt{n}} \).

  • Requirements:

    • Population is normally distributed or sample size \( n > 30 \).

    • Sample is a simple random sample.

  • t Critical Value (\( t_{\alpha/2} \)): Based on degrees of freedom \( df = n - 1 \).

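  • Example: In a study of garlic's effect on cholesterol, 49 subjects had a mean change of 0.4 mg/dL (s = 21.0). With \( df = 48 \), \( t_{\alpha/2} \approx 2.011 \), so \( E \approx 2.011 \cdot \frac{21.0}{\sqrt{49}} \approx 6.03 \), and the 95% CI is \( -5.6 < \mu < 6.4 \) mg/dL.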
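A sketch of the t-based interval \( \bar{x} \pm t_{\alpha/2} \, s / \sqrt{n} \) for the garlic example follows. Python's standard library has no t-distribution, so the critical value is passed in as an argument. The value 2.011 for df = 48 is assumed to come from a t table or calculator, and the function name `mean_ci_t` is hypothetical.

```python
import math

def mean_ci_t(x_bar, s, n, t_crit):
    """CI for a mean when sigma is unknown; t_crit is supplied by the caller."""
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Garlic study: n = 49, mean change 0.4 mg/dL, s = 21.0, df = 48.
low, high = mean_ci_t(0.4, 21.0, 49, t_crit=2.011)  # 2.011 taken from a t table
print(f"{low:.1f} < mu < {high:.1f}")
```

Because this interval contains 0, the data are consistent with garlic having no effect on mean cholesterol.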

Confidence Interval for Mean (\( \sigma \) Known)

When the population standard deviation is known, use the z-distribution.

  • Formula: \( \bar{x} - E < \mu < \bar{x} + E \), where \( E = z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \).

  • Requirements:

    • Sample is a simple random sample.

    • Population is normally distributed or \( n > 30 \).

    • \( \sigma \) is known.

  • Example: For a sample of 40 men with mean weight 172.55 lb and \( \sigma = 26 \) lb, \( E = 1.96 \cdot \frac{26}{\sqrt{40}} \approx 8.06 \), so the 95% CI is \( 164.49 < \mu < 180.61 \) lb.
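The z-based interval \( \bar{x} \pm z_{\alpha/2} \, \sigma / \sqrt{n} \) can be sketched with the standard library alone, since the normal critical value is available from `NormalDist`. The function name `mean_ci_z` is hypothetical.

```python
import math
from statistics import NormalDist

def mean_ci_z(x_bar, sigma, n, confidence=0.95):
    """CI for a mean when the population standard deviation is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Weights of 40 men: mean 172.55 lb, sigma = 26 lb.
low, high = mean_ci_z(172.55, 26, 40)
print(f"{low:.2f} < mu < {high:.2f}")
```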

Choosing the Appropriate Distribution

  • Use t-distribution (TInterval) when \( \sigma \) is unknown.

  • Use z-distribution (ZInterval) when \( \sigma \) is known.

  • Mnemonic: "T" in "NOT known" means use TInterval.

Sample Size for Estimating a Mean

  • When \( \sigma \) is known, the required sample size for margin of error \( E \) is: \( n = \left( \frac{z_{\alpha/2} \, \sigma}{E} \right)^2 \), rounded up to the next whole number.

  • If \( \sigma \) is unknown, estimate it using:

    • Range rule of thumb: \( \sigma \approx \frac{\text{range}}{4} \)

    • Sample standard deviation from preliminary data

    • Results from previous studies

  • Example: To estimate the mean IQ of statistics students within 3 points (\( \sigma = 15 \)), 95% confidence: \( n = \left( \frac{1.96 \cdot 15}{3} \right)^2 = 96.04 \), so 97 students are needed.
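A short sketch of the sample-size formula \( n = (z_{\alpha/2} \, \sigma / E)^2 \), rounded up, together with the range rule of thumb for cases where \( \sigma \) must be estimated. Both function names are hypothetical, and the range of 40 to 100 in the usage lines is invented for illustration.

```python
import math

def sample_size_mean(sigma, margin, z=1.96):
    """Smallest n giving the requested margin of error for a mean."""
    return math.ceil((z * sigma / margin) ** 2)

def sigma_from_range(low, high):
    """Range rule of thumb: sigma is roughly range / 4."""
    return (high - low) / 4

print(sample_size_mean(sigma=15, margin=3))  # IQ example
# Illustrative only: values assumed to span 40 to 100.
print(sample_size_mean(sigma_from_range(40, 100), margin=3))
```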

Summary Table: Choosing the Correct Confidence Interval

| Situation | Distribution | Calculator Function |
|---|---|---|
| Proportion (p) | z-distribution | 1-PropZInt |
| Mean, \( \sigma \) known | z-distribution | ZInterval |
| Mean, \( \sigma \) unknown | t-distribution | TInterval |

Practice Examples

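  • Estimating Proportion: In a clinical trial, 574 babies were born, of whom 525 were girls. 95% CI for proportion of girls: \( \hat{p} = \frac{525}{574} \approx 0.915 \) and \( E = 1.96 \sqrt{\frac{(0.915)(0.085)}{574}} \approx 0.023 \), so \( 0.892 < p < 0.937 \).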

  • Estimating Mean (\( \sigma \) unknown): 15 cookies, calories listed. Use TInterval for 99% CI.

  • Estimating Mean (\( \sigma \) known): 40 students, mean time 58.3 sec, \( \sigma = 9.5 \) sec. Use ZInterval for 95% CI.

  • Estimating Mean (\( \sigma \) unknown): 106 body temperatures, \( \bar{x} = 98.2 \), \( s = 0.62 \), 99% CI.

  • Estimating Mean (\( \sigma \) unknown): Mercury in tuna sushi, 7 samples, 98% CI. Compare to FDA guideline of 1 ppm.

Key Formulas Summary

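  • CI for Proportion: \( \hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}} \), where \( \hat{q} = 1 - \hat{p} \)

  • CI for Mean (\( \sigma \) known): \( \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \)

  • CI for Mean (\( \sigma \) unknown): \( \bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} \), with \( df = n - 1 \)

  • Sample Size for Proportion: \( n = \frac{(z_{\alpha/2})^2 \, \hat{p}\hat{q}}{E^2} \), or \( n = \frac{(z_{\alpha/2})^2 \cdot 0.25}{E^2} \) when no estimate of \( \hat{p} \) is available

  • Sample Size for Mean: \( n = \left( \frac{z_{\alpha/2} \, \sigma}{E} \right)^2 \)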

Additional info: In practice, always check assumptions (random sampling, normality, known/unknown \( \sigma \)) before applying these formulas. Use technology (e.g., calculators or statistical software) for computation and critical values.
