(Lecture 18) Statistical Inference: Point and Interval Estimation of Population Parameters


Statistical Inference Methods

Overview of Statistical Inference

Statistical inference methods are essential tools in statistics for drawing conclusions about population parameters based on sample data. These methods rely on probability calculations and the assumption that data are collected via random sampling or randomized experiments.

Probability Calculations: Inference methods use probability to quantify uncertainty, typically referring to the sampling distribution of a statistic.
Sampling Distribution: The distribution of a statistic (such as the sample mean or proportion) over repeated samples from the population. For large samples, this is often approximately normal.
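
To make the idea of a sampling distribution concrete, the following is a minimal simulation sketch. The population proportion (0.3), the sample size (200), the number of repeated samples, and the function name are all hypothetical choices for illustration, not values from the lecture.

```python
import random
import statistics

def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples random samples of size n; return each sample proportion."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    proportions = []
    for _ in range(num_samples):
        # Each observation is a "success" with probability p.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions

# Hypothetical population proportion and sample size (illustration only).
props = simulate_sample_proportions(p=0.3, n=200, num_samples=2000)

# The simulated proportions should center near p, with a spread
# near sqrt(p(1 - p)/n), as the sampling distribution predicts.
print(round(statistics.mean(props), 3))
print(round(statistics.stdev(props), 3))
```

A histogram of `props` would look roughly bell-shaped, which is the approximate normality described above.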

Types of Statistical Inference

Estimation of Population Parameters: Determining plausible values for unknown population parameters.
Testing Hypotheses: Assessing claims about population parameters using sample data.

The most informative estimation method constructs a confidence interval, an interval believed to contain the true parameter value.

Point Estimate and Interval Estimate

Definitions

Point Estimate: A single value that serves as the best guess for a population parameter (e.g., sample mean x̄ or sample proportion p̂).
Interval Estimate: A range of values believed to contain the actual value of the parameter, providing more information about uncertainty.

Point estimates alone are not sufficiently informative because they do not indicate how close the estimate is likely to be to the true parameter. Interval estimates incorporate a margin of error to gauge accuracy.

Properties of Point Estimators

Unbiasedness

Unbiased Estimator: An estimator whose sampling distribution is centered at the parameter it estimates.
The sample mean x̄ is an unbiased estimator of the population mean μ.
The sample proportion p̂ is an unbiased estimator of the population proportion p.

Standard Deviation

A good estimator has a small standard deviation compared to other estimators, meaning it tends to be closer to the parameter.
The sample mean has a smaller standard deviation than the sample median when estimating the center of a normal distribution.
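
The comparison between the mean and the median can be checked with a small simulation. This is only a sketch: the standard normal population, the sample size of 25, and the function name are assumptions made for illustration.

```python
import random
import statistics

def estimator_spread(n, num_samples, seed=1):
    """Return (sd of sample means, sd of sample medians) over repeated normal samples."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(num_samples):
        # Hypothetical population: normal with mean 0 and standard deviation 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(statistics.mean(sample))
        medians.append(statistics.median(sample))
    return statistics.stdev(means), statistics.stdev(medians)

sd_mean, sd_median = estimator_spread(n=25, num_samples=4000)
print(sd_mean < sd_median)  # the mean varies less from sample to sample
```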

Confidence Interval

Definition and Confidence Level

A confidence interval is an interval containing the most plausible values for a parameter. The confidence level is the probability that the method produces an interval containing the parameter, commonly set at 0.95 (95%).

Logic Behind Constructing a Confidence Interval

Sampling Distribution of the Sample Proportion

For estimating a proportion, the sampling distribution of the sample proportion p̂ is approximately normal for large random samples, provided both np ≥ 15 and n(1-p) ≥ 15.

Mean: Equal to the population proportion p.
Standard deviation: √(p(1 − p)/n)

Margin of Error

Approximately 95% of a normal distribution falls within 1.96 standard deviations of the mean.
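For a sample proportion, the margin of error is 1.96 × (standard deviation of p̂) = 1.96 × √(p(1 − p)/n). Because p is unknown, p̂ is substituted for p when the standard deviation is estimated.
The confidence interval is: p̂ ± 1.96 × (estimated standard deviation)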

Example: Confidence Interval for a Proportion

Suppose 31% of 1285 respondents agreed with a statement, so p̂ = 0.31. The estimated standard deviation is about 0.01.

Margin of error: 1.96 × 0.01 ≈ 0.02
Confidence interval: 0.31 ± 0.02, or (0.29, 0.33)
Interpretation: With 95% confidence, the population proportion is predicted to be between 0.29 and 0.33.

Constructing a Confidence Interval to Estimate a Population Proportion

General Formula

Point estimate: Sample proportion p̂
Standard error: se = √(p̂(1 − p̂)/n)
Confidence interval: p̂ ± z(se), where z is the z-score for the desired confidence level (e.g., 1.96 for 95%)

Example: Willingness to Pay Higher Prices

Sample size: n = 1361, number willing: 637
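Sample proportion: p̂ = 637/1361 ≈ 0.468
Standard error: se = √(0.468 × 0.532/1361) ≈ 0.0135
Confidence interval: 0.468 ± 1.96(0.0135) = 0.468 ± 0.026, or about (0.44, 0.49)
Interpretation: With 95% confidence, the population proportion willing to pay higher prices is between 44% and 49%.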
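
The arithmetic in this example can be reproduced with a short sketch. The function name `proportion_ci` is a hypothetical helper, not something from the lecture; the counts 637 and 1361 and the z-score 1.96 come from the example.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Large-sample confidence interval for a proportion: p_hat ± z * se."""
    p_hat = successes / n                    # point estimate
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error
    margin = z * se                          # margin of error
    return p_hat, se, (p_hat - margin, p_hat + margin)

p_hat, se, (low, high) = proportion_ci(637, 1361)
print(f"p_hat = {p_hat:.3f}, se = {se:.4f}")
print(f"95% CI: {low:.3f} to {high:.3f}")
```

Rounded to two decimals, the endpoints agree with the 44% to 49% interval stated above.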

Sample Size Requirements

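For validity, require at least 15 successes and 15 failures: np ≥ 15 and n(1-p) ≥ 15. Because p is unknown, check this in practice with the observed counts, np̂ ≥ 15 and n(1 − p̂) ≥ 15.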

Confidence Levels Other Than 95%

Higher confidence levels (e.g., 99%) increase the chance of correct inference but result in wider intervals.
There is a trade-off between margin of error and confidence level.

Example: Influenza Vaccine

n = 3900, number with flu: 26
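Sample proportion: p̂ = 26/3900 ≈ 0.0067
Standard error: se = √(0.0067 × 0.9933/3900) ≈ 0.0013
99% confidence interval: 0.0067 ± 2.58(0.0013) = 0.0067 ± 0.0034, or about (0.003, 0.010)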
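
The trade-off between confidence level and interval width can be seen by computing both the 95% and 99% intervals for the same data. This is a sketch under the large-sample formula; `proportion_ci` is a hypothetical helper name, and the z-scores 1.96 and 2.58 are the rounded values used in these notes.

```python
import math

def proportion_ci(successes, n, z):
    """Return (low, high) for p_hat ± z * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

# Flu example: 26 cases among 3900 participants.
ci_95 = proportion_ci(26, 3900, z=1.96)
ci_99 = proportion_ci(26, 3900, z=2.58)

width_95 = ci_95[1] - ci_95[0]
width_99 = ci_99[1] - ci_99[0]
print(f"95% CI: {ci_95[0]:.4f} to {ci_95[1]:.4f}")
print(f"99% CI: {ci_99[0]:.4f} to {ci_99[1]:.4f}")
print(width_99 > width_95)  # higher confidence gives a wider interval
```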

General Formula for Confidence Interval for a Population Proportion

Confidence interval: p̂ ± z(se), with se = √(p̂(1 − p̂)/n)
Common z-scores: 1.96 for 95%, 2.58 for 99%

Confidence Level	Error Probability	z-score
90%	0.10	1.645
95%	0.05	1.96
99%	0.01	2.58
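
The z-scores in the table can be recovered from the standard normal distribution: for error probability α, z cuts off α/2 in each tail. The sketch below uses Python's standard-library `statistics.NormalDist`; the function name `z_score` is a hypothetical helper.

```python
from statistics import NormalDist

def z_score(confidence_level):
    """z-score leaving (1 - confidence_level)/2 in each tail of the standard normal."""
    error_probability = 1 - confidence_level
    return NormalDist().inv_cdf(1 - error_probability / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_score(level):.3f}")
```

The 99% value comes out near 2.576, which the table rounds to 2.58.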

Interpretation of Confidence Level

95% confidence means that, in the long run, 95% of intervals constructed from repeated samples will contain the true parameter.
It does not mean there is a 95% probability that the parameter is in a specific interval for a given sample.
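
The long-run interpretation can be illustrated by simulation: draw many samples from a population with a known proportion, build a 95% interval from each, and count how often the interval contains the true value. The population proportion (0.4), sample size (500), and function name are hypothetical choices for this sketch.

```python
import math
import random

def coverage_rate(p, n, num_samples, z=1.96, seed=2):
    """Fraction of simulated intervals p_hat ± z * se that contain the true p."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(num_samples):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hat = successes / n
        se = math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - z * se <= p <= p_hat + z * se:
            covered += 1
    return covered / num_samples

# Hypothetical population proportion and sample size (illustration only).
rate = coverage_rate(p=0.4, n=500, num_samples=2000)
print(rate)  # expected to be close to 0.95
```

Each individual interval either contains p or does not; the 95% describes the method's success rate across samples.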

Constructing a Confidence Interval to Estimate a Population Mean

General Formula

Point estimate: Sample mean x̄
Standard error: se = s/√n, where s is the sample standard deviation
Confidence interval: x̄ ± t(se), where t is the t-score for the desired confidence level and degrees of freedom df = n-1

t Distribution and Its Properties

The t distribution is bell-shaped and symmetric about 0, similar to the standard normal distribution but with thicker tails.
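Its standard deviation is slightly larger than 1 and depends on the degrees of freedom (df); as df increases, the t distribution approaches the standard normal distribution.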
For inference about a mean, df = n-1.

Example: Buying on eBay

Sample size: n = 11
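Sample mean: x̄ ≈ $584 (the midpoint of the interval reported below)
Sample standard deviation: s (computed from the 11 closing prices)
Standard error: se = s/√11
Degrees of freedom: df = 11 − 1 = 10
t-score for 95% confidence, df = 10: 2.228
Confidence interval: x̄ ± 2.228(se), a margin of error of about $15 on each side of x̄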
Interpretation: With 95% confidence, the mean closing price is between $569 and $599.
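
The steps of a t interval can be sketched as follows. The price list is hypothetical (it is not the data from the eBay example), and `t_interval` is an illustrative helper name; the t-score 2.228 is the table value for 95% confidence with df = 10, so it is only valid for samples of size 11.

```python
import math
import statistics

def t_interval(data, t_score):
    """Confidence interval for a mean: x_bar ± t * s / sqrt(n)."""
    n = len(data)
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    se = s / math.sqrt(n)       # standard error of the mean
    margin = t_score * se
    return x_bar - margin, x_bar + margin

# Hypothetical closing prices in dollars (n = 11, so df = 10).
prices = [560, 575, 580, 585, 590, 595, 600, 570, 565, 610, 588]
low, high = t_interval(prices, t_score=2.228)
print(f"95% CI for the mean: {low:.1f} to {high:.1f}")
```

The t-score is passed in from a table because the Python standard library has no t distribution; a statistics package would normally supply it.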

Robustness of the t Method

t intervals are robust to most violations of the normality assumption, but outliers should be checked.
Randomization in data collection is essential for valid inference.

Choosing the Sample Size for a Study

Sample Size for Estimating a Population Proportion

To achieve a desired margin of error m for a confidence interval, use n = p̂(1 − p̂)z²/m², rounded up to the next whole number.

z-score is based on the confidence level (e.g., 1.96 for 95%).
If p is unknown, use p = 0.5 for a conservative estimate.

Example: Sample Size for Exit Poll

Desired margin of error: 0.02
Expected proportion: 0.50
z-score: 1.96
Required sample size: n = (0.50)(0.50)(1.96)²/(0.02)² = 2401
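
A minimal sketch of the sample size calculation for a proportion, n = p(1 − p)z²/m². The function name is hypothetical; the inputs are the exit poll values above.

```python
import math

def sample_size_for_proportion(margin, z=1.96, p=0.5):
    """Smallest n giving margin of error `margin`; p = 0.5 is the conservative default."""
    raw = p * (1 - p) * z**2 / margin**2
    # Round away floating-point noise before rounding up to a whole number.
    return math.ceil(round(raw, 6))

n_exit_poll = sample_size_for_proportion(margin=0.02, z=1.96, p=0.50)
print(n_exit_poll)  # 2401
```

Using p = 0.5 maximizes p(1 − p), so the resulting n is large enough whatever the true proportion is.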

Sample Size for Estimating a Population Mean

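To achieve margin of error m, use n = σ²z²/m², rounded up to the next whole number, where σ is the population standard deviation.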

If σ is unknown, use an estimate from prior studies or divide the estimated range by 4 (for bell-shaped distributions).

Example: Estimating Mean Education in South Africa

Desired margin of error: 1 year
Estimated range: 0 to 20 years
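Estimated standard deviation: σ ≈ range/4 = (20 − 0)/4 = 5 years
Required sample size: assuming 95% confidence (z = 1.96), n = (5)²(1.96)²/(1)² ≈ 96.04, so about 97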
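
The calculation can be sketched with n = σ²z²/m², using the range/4 guess for σ described above. The function name is hypothetical, and the 95% confidence level (z = 1.96) is an assumption, since the example does not state a level.

```python
import math

def sample_size_for_mean(margin, sigma, z=1.96):
    """Smallest n giving margin of error `margin` for a mean, given a guess for sigma."""
    raw = sigma**2 * z**2 / margin**2
    # Round away floating-point noise before rounding up to a whole number.
    return math.ceil(round(raw, 6))

estimated_sigma = (20 - 0) / 4  # range/4 rule applied to 0 to 20 years
n_education = sample_size_for_mean(margin=1, sigma=estimated_sigma, z=1.96)
print(n_education)  # 97
```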

Other Factors Affecting Sample Size

Desired precision (margin of error)
Confidence level
Variability in the data
Cost

Confidence Interval for a Proportion with Small Samples

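If there are fewer than 15 successes or fewer than 15 failures, adjust by adding 2 to both the number of successes and the number of failures (which adds 4 to the sample size), then apply the usual confidence interval formula to the adjusted counts.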
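
A sketch of the small-sample adjustment: add 2 successes and 2 failures, then apply the usual formula to the adjusted counts. The counts (2 successes in 20 trials) and the function name are hypothetical; clipping the endpoints to the range 0 to 1 is an extra convenience in this sketch, not part of the notes.

```python
import math

def adjusted_proportion_ci(successes, n, z=1.96):
    """Interval after adding 2 successes and 2 failures (4 observations in total)."""
    n_adj = n + 4
    p_adj = (successes + 2) / n_adj
    se = math.sqrt(p_adj * (1 - p_adj) / n_adj)
    low = max(0.0, p_adj - z * se)   # a proportion cannot be below 0
    high = min(1.0, p_adj + z * se)  # or above 1
    return low, high

# Hypothetical small sample: 2 successes in 20 trials (fewer than 15 successes).
low, high = adjusted_proportion_ci(2, 20)
print(f"{low:.3f} to {high:.3f}")
```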

Summary Table: Confidence Interval Formulas

Parameter	Point Estimate	Standard Error	Confidence Interval
Population Proportion	p̂	se = √(p̂(1 − p̂)/n)	p̂ ± z(se)
Population Mean	x̄	se = s/√n	x̄ ± t(se), df = n − 1

Additional info: The notes cover the core concepts of confidence intervals for proportions and means, including the use of the normal and t distributions, margin of error, sample size determination, and interpretation of confidence levels. Examples illustrate practical applications in survey and experimental contexts.