Estimating Population Proportions and Determining Sample Sizes (Chapter 7 Study Notes)
Estimating Parameters and Determining Sample Sizes
Overview
This chapter introduces statistical methods for estimating population parameters, specifically focusing on proportions, and determining the appropriate sample sizes for reliable inference. The main topics include point estimation, confidence intervals, margin of error, and sample size calculations.
Estimating a Population Proportion
Key Concepts
Point Estimate: The sample proportion (\( \hat{p} \)) is the best point estimate of the population proportion p.
Confidence Interval: A range of values used to estimate the true value of a population proportion.
Sample Size: The number of observations required to estimate a population proportion with a specified margin of error and confidence level.
Point Estimate
A point estimate is a single value used to estimate a population parameter. For proportions, the sample proportion (\( \hat{p} \)) is used because it is unbiased and consistent.
Unbiased Estimator: A statistic whose sampling distribution has a mean equal to the population parameter.
Example: In a survey of 950 students, 53% take online courses. The best point estimate of the proportion of all students who take online courses is 0.53 (or 53%).
Confidence Intervals for Population Proportion
Definition
A confidence interval (CI) is a range of values used to estimate the true value of a population parameter. It is expressed as \( \hat{p} \pm E \) or \( (\hat{p} - E, \hat{p} + E) \), where E is the margin of error.
Confidence Level
The confidence level is the probability \( 1 - \alpha \) (e.g., 0.95 for 95%) that the estimation procedure produces an interval containing the population parameter, assuming the procedure is repeated on many samples.
Also called degree of confidence or confidence coefficient.
Relationship Between Confidence Level and \( \alpha \)
| Most Common Confidence Levels | Corresponding Values of \( \alpha \) |
|---|---|
| 90% (or 0.90) | \( \alpha = 0.10 \) |
| 95% (or 0.95) | \( \alpha = 0.05 \) |
| 99% (or 0.99) | \( \alpha = 0.01 \) |
Critical Values
For the standard normal distribution, a critical value is a z-score on the borderline separating sample statistics that are significantly high or low from those that are not significant. The value \( z_{\alpha/2} \) is the z-score with an area of \( \alpha/2 \) in the right tail.
For a 95% confidence level, use a cumulative left area of 0.9750 (not 0.95).
| Confidence Level | \( \alpha \) | Critical Value, \( z_{\alpha/2} \) |
|---|---|---|
| 90% | 0.10 | 1.645 |
| 95% | 0.05 | 1.96 |
| 99% | 0.01 | 2.575 |
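The critical values in the table can be reproduced with a short sketch. It assumes Python 3.8 or later, whose standard library provides `statistics.NormalDist`; the helper name `critical_value` is made up for illustration. Note that software gives about 2.576 for the 99% level, while 2.575 is the value commonly read from a printed z table.

```python
from statistics import NormalDist


def critical_value(confidence_level: float) -> float:
    """Return z_{alpha/2} for a two-sided interval (hypothetical helper)."""
    alpha = 1.0 - confidence_level
    # z_{alpha/2} has area alpha/2 to its right, so the cumulative
    # left area is 1 - alpha/2 (0.9750 for a 95% confidence level).
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {critical_value(level):.3f}")
```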
Margin of Error
The margin of error (E) is the maximum likely amount by which the sample statistic differs from the population parameter. For proportions:
Formula: \( E = z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}} \)
Where \( \hat{q} = 1 - \hat{p} \) and n is the sample size.
Interpreting Confidence Intervals
Correct: "We are 95% confident that the interval from 0.405 to 0.455 actually does contain the true value of the population proportion p."
Incorrect: "There is a 95% chance that the true value of p will fall between 0.405 and 0.455."
Incorrect: "95% of sample proportions will fall between 0.405 and 0.455."
The Process Success Rate
A 95% confidence level means that, in the long run, 95% of confidence intervals constructed from repeated samples will contain the true population proportion.
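The long-run reading can be illustrated with a small simulation. This is only a sketch: the true proportion 0.43, the sample size 500, the number of trials, and the function names are all assumptions chosen for illustration, not values from the notes. It draws many samples, builds a 95% interval from each with the \( \hat{p} \pm E \) formula, and reports the fraction of intervals that contain the true proportion.

```python
import math
import random


def wald_interval(successes: int, n: int, z: float = 1.96):
    """Interval p_hat +/- E with E = z * sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def simulate_coverage(p_true: float, n: int, trials: int, seed: int = 0) -> float:
    """Fraction of simulated intervals that contain p_true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # One simulated sample: n independent yes/no responses.
        successes = sum(rng.random() < p_true for _ in range(n))
        low, high = wald_interval(successes, n)
        if low <= p_true <= high:
            hits += 1
    return hits / trials


# Assumed settings, purely illustrative.
rate = simulate_coverage(p_true=0.43, n=500, trials=2000)
print(f"Share of intervals containing p: {rate:.3f}")
```

With settings like these the printed share should land near 0.95, though it varies with the seed and the number of trials.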
Requirements for Constructing a Confidence Interval for p
The sample is a simple random sample.
The binomial distribution conditions are satisfied: fixed number of trials, independence, two outcome categories, constant probabilities.
At least 5 successes and 5 failures in the sample.
Procedure for Constructing a Confidence Interval for p
Verify requirements are satisfied.
Find the critical value \( z_{\alpha/2} \) for the desired confidence level.
Calculate the margin of error: \( E = z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}} \)
Compute the confidence interval limits: \( \hat{p} - E \) and \( \hat{p} + E \).
Round the limits to three significant digits.
Example: Constructing a Confidence Interval
Given: Survey of 950 students, 53% take online courses (\( n = 950 \), \( \hat{p} = 0.53 \)).
Find: Margin of error for 95% confidence interval.
Solution:
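Critical value: \( z_{\alpha/2} = 1.96 \)
Calculate \( \hat{q} = 1 - 0.53 = 0.47 \)
Margin of error: \( E = 1.96 \sqrt{\frac{(0.53)(0.47)}{950}} \approx 0.0317 \)
Confidence interval: \( 0.53 - 0.0317 < p < 0.53 + 0.0317 \) → \( 0.498 < p < 0.562 \)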
Interpretation: We cannot safely conclude that more than 50% of all students take online courses, since the interval includes values below 0.50.
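The arithmetic in this example can be checked with a minimal sketch. The function name `wald_confidence_interval` is an assumption for illustration; the inputs are the values given in the example.

```python
import math


def wald_confidence_interval(p_hat: float, n: int, z: float):
    """Return (E, lower, upper) for the interval p_hat +/- E."""
    q_hat = 1.0 - p_hat
    margin = z * math.sqrt(p_hat * q_hat / n)
    return margin, p_hat - margin, p_hat + margin


margin, lower, upper = wald_confidence_interval(p_hat=0.53, n=950, z=1.96)
print(f"E = {margin:.4f}")                 # E = 0.0317
print(f"{lower:.3f} < p < {upper:.3f}")    # 0.498 < p < 0.562
```

Because the lower limit falls just under 0.50, the interval does not rule out a population proportion of one half or less.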
Determining Sample Size for Estimating a Population Proportion
Objective
Determine the required sample size n to estimate a population proportion p with a specified margin of error E and confidence level.
Sample Size Formulas
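If an estimate of p is known: \( n = \frac{(z_{\alpha/2})^2 \, \hat{p}\hat{q}}{E^2} \)
If no estimate of p is known: \( n = \frac{(z_{\alpha/2})^2 \cdot 0.25}{E^2} \) (this uses \( \hat{p} = \hat{q} = 0.5 \), the values that maximize \( \hat{p}\hat{q} \))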
Round-Off Rule: Always round up to the next whole number to ensure adequacy.
Example: Determining Sample Size
Given: Prior survey: 79% shop online. Want 95% confidence, margin of error 0.03.
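With prior estimate (\( \hat{p} = 0.79 \), \( \hat{q} = 0.21 \)):
\( n = \frac{(1.96)^2 (0.79)(0.21)}{(0.03)^2} \approx 708.13 \) → 709 adults
No prior estimate:
\( n = \frac{(1.96)^2 (0.25)}{(0.03)^2} \approx 1067.11 \) → 1068 adults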
Interpretation: Without prior knowledge, a larger sample is required to achieve the same margin of error.
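Both sample-size calculations can be reproduced with a short sketch. The function name and its optional `p_estimate` parameter are assumptions for illustration; when no estimate is supplied, the sketch falls back to 0.25 for the product \( \hat{p}\hat{q} \), and `math.ceil` applies the round-up rule.

```python
import math
from typing import Optional


def sample_size_for_proportion(margin: float, z: float,
                               p_estimate: Optional[float] = None) -> int:
    """Smallest n meeting the desired margin of error (hypothetical helper)."""
    if p_estimate is None:
        pq = 0.25  # p_hat * q_hat is largest when both equal 0.5
    else:
        pq = p_estimate * (1.0 - p_estimate)
    # Always round up so the margin of error is not exceeded.
    return math.ceil(z ** 2 * pq / margin ** 2)


print(sample_size_for_proportion(0.03, 1.96, p_estimate=0.79))  # 709
print(sample_size_for_proportion(0.03, 1.96))                   # 1068
```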
Coverage Probability and Confidence Interval Methods
Coverage Probability
The coverage probability of a confidence interval is the proportion of intervals that contain the true population parameter when repeated samples are taken.
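For a fixed true proportion and sample size, the coverage probability of the \( \hat{p} \pm E \) interval can be computed exactly by summing binomial probabilities over every possible number of successes whose interval contains p. The sketch below does this; the function name and the inputs (n = 20, p = 0.1) are assumptions chosen only to show a small-sample case.

```python
import math


def wald_exact_coverage(p_true: float, n: int, z: float = 1.96) -> float:
    """Exact coverage of p_hat +/- E for a binomial sample of size n."""
    coverage = 0.0
    for x in range(n + 1):
        p_hat = x / n
        margin = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
        if p_hat - margin <= p_true <= p_hat + margin:
            # Add the binomial probability of observing exactly x successes.
            coverage += math.comb(n, x) * p_true ** x * (1.0 - p_true) ** (n - x)
    return coverage


# Illustrative small-sample case (assumed values, not from the notes).
print(f"{wald_exact_coverage(p_true=0.1, n=20):.3f}")
```

For this small sample the result comes out noticeably below the nominal 0.95, which is the kind of shortfall that motivates the alternative methods.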
Alternative Confidence Interval Methods
Wald Confidence Interval: Standard method, best for teaching but may not always achieve the nominal coverage probability.
Plus Four Method: Add 2 successes and 2 failures to the sample, then use the Wald formula. Improves coverage probability.
Wilson Score Interval: More accurate, especially for small samples, but more complex to calculate.
Clopper-Pearson Method: "Exact" method based on the binomial distribution; tends to be conservative (actual coverage probability ≥ nominal level).
Note: There is no universal agreement on the best method; the Wald interval is commonly used for introductory courses, while plus four and Wilson score intervals offer improved accuracy.
Summary Table: Confidence Interval Methods for Proportions
| Method | Procedure | Coverage Probability | Complexity |
|---|---|---|---|
| Wald | Standard formula | May be less than nominal | Simple |
| Plus Four | Add 2 successes and 2 failures | Closer to nominal | Simple |
| Wilson Score | Special formula | Very close to nominal | Moderate |
| Clopper-Pearson | Exact binomial | Conservative (≥ nominal) | Complex |
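Three of these methods are short enough to sketch side by side. The function names are assumptions for illustration, and the Wilson score formula used here is the standard textbook form, with center \( (\hat{p} + z^2/2n)/(1 + z^2/n) \). Clopper-Pearson is left out because it needs binomial or beta quantiles that the Python standard library does not provide. The sample of 0 successes in 10 trials is a made-up case that shows how the methods differ.

```python
import math


def wald(x: int, n: int, z: float = 1.96):
    """Standard interval: p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    p_hat = x / n
    margin = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def plus_four(x: int, n: int, z: float = 1.96):
    """Add 2 successes and 2 failures, then apply the Wald formula."""
    return wald(x + 2, n + 4, z)


def wilson(x: int, n: int, z: float = 1.96):
    """Wilson score interval (standard textbook form)."""
    p_hat = x / n
    denom = 1.0 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half


# Made-up sample: 0 successes in 10 trials.
for name, method in (("Wald", wald), ("Plus four", plus_four), ("Wilson", wilson)):
    low, high = method(0, 10)
    print(f"{name}: ({low:.3f}, {high:.3f})")
```

With zero successes the Wald interval collapses to a single point at 0, while the other two still give a usable range. The plus four lower limit can fall below 0, in which case it is normally truncated at 0.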
Best Practices for Poll Analysis
Ensure the sample is a simple random sample.
Report the confidence level and sample size.
Reliability depends on sampling method and sample size, not population size.
Do not dismiss poll results solely because the sample is a small percentage of the population.