(Lecture 18) Statistical Inference: Point and Interval Estimation of Population Parameters
Study Guide - Smart Notes
Statistical Inference Methods
Overview of Statistical Inference
Statistical inference methods are essential tools in statistics for drawing conclusions about population parameters based on sample data. These methods rely on probability calculations and the assumption that data are collected via random sampling or randomized experiments.
Probability Calculations: Inference methods use probability to quantify uncertainty, typically referring to the sampling distribution of a statistic.
Sampling Distribution: The distribution of a statistic (such as the sample mean or proportion) over repeated samples from the population. For large samples, this is often approximately normal.
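The idea of a sampling distribution can be illustrated with a small simulation. The sketch below assumes a hypothetical population proportion of 0.3 and a sample size of 200 (neither value comes from the lecture) and draws many random samples to see how the sample proportion varies from sample to sample.

```python
import random
import statistics

def simulate_sample_proportions(p, n, reps, seed=0):
    """Draw `reps` random samples of size n; return each sample's proportion."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

# Hypothetical values chosen only for illustration.
p, n = 0.3, 200
props = simulate_sample_proportions(p, n, reps=5000)

sim_mean = statistics.mean(props)          # should be close to p
sim_sd = statistics.stdev(props)           # spread of the sampling distribution
theory_sd = (p * (1 - p) / n) ** 0.5       # sqrt(p(1-p)/n)

print("mean of sample proportions:", round(sim_mean, 3))
print("std. dev. of sample proportions:", round(sim_sd, 4))
print("theoretical std. dev.:", round(theory_sd, 4))
```

The simulated mean should land near the assumed p, and the simulated standard deviation near the theoretical value, which is what "approximately normal, centered at the parameter" means in practice.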
Types of Statistical Inference
Estimation of Population Parameters: Determining plausible values for unknown population parameters.
Testing Hypotheses: Assessing claims about population parameters using sample data.
The most informative estimation method constructs a confidence interval, an interval believed to contain the true parameter value.
Point Estimate and Interval Estimate
Definitions
Point Estimate: A single value that serves as the best guess for a population parameter (e.g., sample mean x̄ or sample proportion p̂).
Interval Estimate: A range of values believed to contain the actual value of the parameter, providing more information about uncertainty.
Point estimates alone are not sufficiently informative because they do not indicate how close the estimate is likely to be to the true parameter. Interval estimates incorporate a margin of error to gauge accuracy.
Properties of Point Estimators
Unbiasedness
Unbiased Estimator: An estimator whose sampling distribution is centered at the parameter it estimates.
The sample mean x̄ is an unbiased estimator of the population mean μ.
The sample proportion p̂ is an unbiased estimator of the population proportion p.
Standard Deviation
A good estimator has a small standard deviation compared to other estimators, meaning it tends to be closer to the parameter.
The sample mean has a smaller standard deviation than the sample median when estimating the center of a normal distribution.
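The claim about the mean and the median can be checked by simulation. This is a minimal sketch assuming a standard normal population and samples of size 25 (both are illustrative choices, not values from the lecture); it estimates the standard deviation of each estimator across repeated samples.

```python
import random
import statistics

def estimator_spread(n, reps, seed=0):
    """Return (sd of sample means, sd of sample medians) over repeated samples."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        sample = [rng.gauss(0, 1) for _ in range(n)]   # normal population
        means.append(statistics.mean(sample))
        medians.append(statistics.median(sample))
    return statistics.stdev(means), statistics.stdev(medians)

sd_mean, sd_median = estimator_spread(n=25, reps=4000)
print("sd of sample mean:  ", round(sd_mean, 3))
print("sd of sample median:", round(sd_median, 3))
```

Under these assumptions the sample mean should show the smaller spread, so it tends to fall closer to the center of the distribution than the sample median does.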
Confidence Interval
Definition and Confidence Level
A confidence interval is an interval containing the most plausible values for a parameter. The confidence level is the probability that the method produces an interval containing the parameter, commonly set at 0.95 (95%).
Logic Behind Constructing a Confidence Interval
Sampling Distribution of the Sample Proportion
For estimating a proportion, the sampling distribution of the sample proportion p̂ is approximately normal for large random samples, provided both np ≥ 15 and n(1-p) ≥ 15.
Mean: Equal to the population proportion p.
Standard deviation: \( \sqrt{p(1-p)/n} \)
Margin of Error
Approximately 95% of a normal distribution falls within 1.96 standard deviations of the mean.
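For a sample proportion, the margin of error is \( 1.96 \times \) (standard deviation of \( \hat{p} \)), that is, \( 1.96\sqrt{p(1-p)/n} \).
The confidence interval is: \( \hat{p} \pm 1.96 \times \) (standard deviation)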
Example: Confidence Interval for a Proportion
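Suppose 31% of 1285 respondents agreed with a statement, and the estimated standard deviation of the sample proportion is 0.01.
Margin of error: \( 1.96 \times 0.01 = 0.0196 \approx 0.02 \)
Confidence interval: \( 0.31 \pm 0.02 \), or (0.29, 0.33)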
Interpretation: The population proportion is predicted to be between 0.29 and 0.33.
Constructing a Confidence Interval to Estimate a Population Proportion
General Formula
Point estimate: Sample proportion p̂
Standard error: \( se = \sqrt{\hat{p}(1-\hat{p})/n} \)
Confidence interval: \( \hat{p} \pm z(se) \), where z is the z-score for the desired confidence level (e.g., 1.96 for 95%)
Example: Willingness to Pay Higher Prices
Sample size: n = 1361, number willing: 637
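Sample proportion: \( \hat{p} = 637/1361 \approx 0.468 \)
Standard error: \( se = \sqrt{0.468(0.532)/1361} \approx 0.0135 \)
Confidence interval: \( 0.468 \pm 1.96(0.0135) = 0.468 \pm 0.026 \), or (0.442, 0.494)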
Interpretation: With 95% confidence, the proportion is between 44% and 49%.
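The calculation in this example can be reproduced in a few lines. The function below is a minimal sketch of the large-sample interval for a proportion; the name `proportion_ci` is a hypothetical helper, not something from the lecture.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n                        # point estimate
    se = math.sqrt(p_hat * (1 - p_hat) / n)      # standard error
    margin = z * se                              # margin of error
    return p_hat, se, (p_hat - margin, p_hat + margin)

# Counts from the willingness-to-pay example: 637 of 1361 respondents.
p_hat, se, (lower, upper) = proportion_ci(637, 1361)
print(f"p_hat = {p_hat:.3f}, se = {se:.4f}, CI = ({lower:.3f}, {upper:.3f})")
```

Because the code carries full precision instead of rounding at each step, the endpoints can differ from a hand calculation in the third decimal place.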
Sample Size Requirements
For the interval to be valid, the sample should contain at least 15 successes and 15 failures: np̂ ≥ 15 and n(1 - p̂) ≥ 15.
Confidence Levels Other Than 95%
Higher confidence levels (e.g., 99%) increase the chance of correct inference but result in wider intervals.
There is a trade-off between margin of error and confidence level.
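The trade-off can be made concrete by holding the data fixed and changing only the confidence level. This sketch reuses the willingness-to-pay counts (637 of 1361) with the common z-scores 1.645, 1.96, and 2.58; the helper name `margin_of_error` is an assumption made for illustration.

```python
import math

def margin_of_error(p_hat, n, z):
    """Margin of error z * sqrt(p_hat(1 - p_hat)/n) for a proportion."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

n = 1361
p_hat = 637 / n

# Same sample, three confidence levels: only z changes.
margins = {}
for level, z in [("90%", 1.645), ("95%", 1.96), ("99%", 2.58)]:
    margins[level] = margin_of_error(p_hat, n, z)
    print(level, "margin of error:", round(margins[level], 3))
```

The margin grows with the confidence level, so the 99% interval is the widest of the three.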
Example: Influenza Vaccine
n = 3900, number with flu: 26
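Sample proportion: \( \hat{p} = 26/3900 \approx 0.0067 \)
Standard error: \( se = \sqrt{0.0067(0.9933)/3900} \approx 0.0013 \)
99% confidence interval: \( 0.0067 \pm 2.58(0.0013) = 0.0067 \pm 0.0034 \), or about (0.003, 0.010)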
General Formula for Confidence Interval for a Population Proportion
Confidence interval: \( \hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/n} \)
Common z-scores: 1.96 for 95%, 2.58 for 99%
| Confidence Level | Error Probability | z-score |
|---|---|---|
| 90% | 0.10 | 1.645 |
| 95% | 0.05 | 1.96 |
| 99% | 0.01 | 2.58 |
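The z-scores in the table can be recovered from the standard normal distribution: split the error probability equally between the two tails and find the matching quantile. A minimal sketch using Python's standard library (the function name `z_score` is a hypothetical helper):

```python
from statistics import NormalDist

def z_score(confidence_level):
    """z such that the central `confidence_level` of N(0, 1) lies within +/- z."""
    error_probability = 1 - confidence_level
    # Half of the error probability goes in each tail.
    return NormalDist().inv_cdf(1 - error_probability / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_score(level):.3f}")
```

The 99% value comes out slightly below 2.58; the table entry is that value rounded to two decimals.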
Interpretation of Confidence Level
95% confidence means that, in the long run, 95% of intervals constructed from repeated samples will contain the true parameter.
It does not mean there is a 95% probability that the parameter is in a specific interval for a given sample.
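The long-run interpretation can be checked by simulation: fix a population proportion, draw many samples, build a 95% interval from each, and count how often the interval captures the true value. The sketch below assumes a hypothetical p = 0.4 and n = 500, neither of which comes from the lecture.

```python
import math
import random

def coverage(p, n, reps, z=1.96, seed=0):
    """Fraction of simulated intervals p_hat +/- z*se that contain the true p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= p <= p_hat + margin:
            hits += 1
    return hits / reps

# Hypothetical population proportion and sample size.
rate = coverage(p=0.4, n=500, reps=2000)
print("share of intervals containing p:", rate)
```

Each individual interval either contains p or it does not; the 95% describes how often the procedure succeeds across repeated samples, which is what the simulated share estimates.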
Constructing a Confidence Interval to Estimate a Population Mean
General Formula
Point estimate: Sample mean x̄
Standard error: \( se = s/\sqrt{n} \), where s is the sample standard deviation
Confidence interval: \( \bar{x} \pm t(se) \), where t is the t-score for the desired confidence level and degrees of freedom df = n-1
t Distribution and Its Properties
The t distribution is bell-shaped and symmetric about 0, similar to the standard normal distribution but with thicker tails.
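The standard deviation is slightly larger than 1, with the exact value depending on the degrees of freedom (df); as df increases, the t distribution approaches the standard normal distribution.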
For inference about a mean, df = n-1.
Example: Buying on eBay
Sample size: n = 11
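Sample mean: \( \bar{x} \approx 584 \) dollars (the midpoint of the reported interval)
Sample standard deviation: \( s \approx 22 \) dollars (the value implied by the reported interval)
Standard error: \( se = s/\sqrt{n} \approx 22/\sqrt{11} \approx 6.7 \)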
Degrees of freedom: 10
t-score for 95% confidence, df = 10: 2.228
Confidence interval: \( \bar{x} \pm 2.228(se) \), which gives approximately (569, 599) dollars
Interpretation: With 95% confidence, the mean closing price is between $569 and $599.
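The steps of the t interval can be sketched as a short function. The t-score is passed in from a t table (as in the example above) because Python's standard library has no t distribution; the function name `t_interval` and the five data values below are hypothetical and are not the eBay prices.

```python
import math
import statistics

def t_interval(values, t_score):
    """Confidence interval for a mean: x_bar +/- t * s / sqrt(n)."""
    n = len(values)
    x_bar = statistics.mean(values)        # point estimate
    s = statistics.stdev(values)           # sample sd (divides by n - 1)
    se = s / math.sqrt(n)                  # standard error
    margin = t_score * se
    return x_bar - margin, x_bar + margin

# Hypothetical data for illustration only; df = 5 - 1 = 4,
# and the 95% t-score for df = 4 is 2.776.
values = [12.0, 15.0, 11.0, 14.0, 13.0]
lower, upper = t_interval(values, t_score=2.776)
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")
```

For the eBay example, the same function would be called with the 11 closing prices and `t_score=2.228`.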
Robustness of the t Method
t intervals are robust to most violations of the normality assumption, but outliers should be checked.
Randomization in data collection is essential for valid inference.
Choosing the Sample Size for a Study
Sample Size for Estimating a Population Proportion
To achieve a desired margin of error m for a confidence interval, use: \( n = \hat{p}(1-\hat{p})z^2/m^2 \)
z-score is based on the confidence level (e.g., 1.96 for 95%).
If no estimate of the proportion is available, use 0.5 in its place; p(1-p) is largest at 0.5, so this gives the largest (most conservative) sample size.
Example: Sample Size for Exit Poll
Desired margin of error: 0.02
Expected proportion: 0.50
z-score: 1.96
Required sample size: \( n = 0.5(0.5)(1.96)^2/(0.02)^2 = 2401 \)
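The exit-poll calculation can be written as a small function. This is a sketch, with `sample_size_for_proportion` as a hypothetical helper name; it rounds up because a fractional respondent is not possible and rounding down would miss the target margin of error.

```python
import math

def sample_size_for_proportion(m, z=1.96, p=0.5):
    """Smallest n giving margin of error m: n = p(1-p) z^2 / m^2, rounded up."""
    n = p * (1 - p) * z ** 2 / m ** 2
    # round() first so floating-point noise does not push a whole number up by 1.
    return math.ceil(round(n, 6))

n_exit_poll = sample_size_for_proportion(m=0.02)          # exit poll example
n_wider = sample_size_for_proportion(m=0.04)              # doubling m
print(n_exit_poll, n_wider)
```

Doubling the margin of error cuts the required sample size to roughly a quarter, since m enters the formula squared.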
Sample Size for Estimating a Population Mean
To achieve margin of error m: \( n = \sigma^2 z^2/m^2 \)
If σ is unknown, use an estimate from prior studies or divide the estimated range by 4 (for bell-shaped distributions).
Example: Estimating Mean Education in South Africa
Desired margin of error: 1 year
Estimated range: 0 to 20 years
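Estimated standard deviation: \( \sigma \approx (20 - 0)/4 = 5 \) years
Required sample size: \( n = 5^2(1.96)^2/1^2 \approx 96.04 \), rounded up to 97 (using z = 1.96 for 95% confidence)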
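The same calculation for a mean can be sketched as follows, using the range-divided-by-4 guess for σ described above and assuming 95% confidence (z = 1.96). The helper name `sample_size_for_mean` is hypothetical.

```python
import math

def sample_size_for_mean(m, sigma, z=1.96):
    """Smallest n giving margin of error m: n = sigma^2 z^2 / m^2, rounded up."""
    n = sigma ** 2 * z ** 2 / m ** 2
    # round() first so floating-point noise does not push a whole number up by 1.
    return math.ceil(round(n, 6))

# Range of 0 to 20 years, divided by 4, as a rough guess for sigma.
sigma_guess = (20 - 0) / 4
n_required = sample_size_for_mean(m=1, sigma=sigma_guess)
print("sigma guess:", sigma_guess, "required n:", n_required)
```

The result is only as good as the guess for σ, so a larger guess is the safer choice when the range is uncertain.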
Other Factors Affecting Sample Size
Desired precision (margin of error)
Confidence level
Variability in the data
Cost
Confidence Interval for a Proportion with Small Samples
If there are fewer than 15 successes or fewer than 15 failures, add 2 to the number of successes and 2 to the number of failures (which adds 4 to the sample size), then apply the usual large-sample formula to the adjusted counts.
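The small-sample adjustment can be sketched by applying the usual formula to the adjusted counts. The function name `plus_four_ci` and the counts in the usage line (2 successes in 20 trials) are hypothetical; clipping the endpoints to the range 0 to 1 is an added convenience, not part of the lecture's description.

```python
import math

def plus_four_ci(successes, n, z=1.96):
    """Interval for a proportion after adding 2 successes and 2 failures."""
    n_adj = n + 4
    p_adj = (successes + 2) / n_adj              # adjusted sample proportion
    se = math.sqrt(p_adj * (1 - p_adj) / n_adj)  # standard error on adjusted data
    margin = z * se
    # A proportion cannot fall outside [0, 1], so clip the endpoints.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)

# Hypothetical small sample: 2 successes out of 20.
lower, upper = plus_four_ci(successes=2, n=20)
print(f"adjusted 95% CI: ({lower:.3f}, {upper:.3f})")
```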
Summary Table: Confidence Interval Formulas
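| Parameter | Point Estimate | Standard Error | Confidence Interval |
|---|---|---|---|
| Population Proportion | p̂ | \( se = \sqrt{\hat{p}(1-\hat{p})/n} \) | \( \hat{p} \pm z(se) \) |
| Population Mean | x̄ | \( se = s/\sqrt{n} \) | \( \bar{x} \pm t(se) \), with df = n-1 |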
Additional info: The notes cover the core concepts of confidence intervals for proportions and means, including the use of the normal and t distributions, margin of error, sample size determination, and interpretation of confidence levels. Examples illustrate practical applications in survey and experimental contexts.