Estimating Parameters and Determining Sample Sizes: Study Notes
Estimating Parameters and Determining Sample Sizes
Introduction to Estimation in Statistics
Statistical estimation involves using sample data to make inferences about population parameters. This process is central to inferential statistics, allowing us to estimate unknown values such as population means, proportions, and variances.
Inferential Statistics: Uses sample data to estimate population parameters.
Parameters: Numerical characteristics of a population (e.g., mean, proportion).
Statistics: Numerical characteristics calculated from a sample.
Main Ideas in Estimation
Point Estimate
A point estimate is a single value calculated from sample data and used as the best guess for a population parameter.
Definition: The value of a sample statistic used to estimate a population parameter.
Example: The sample mean $\bar{x}$ is the point estimate of the population mean $\mu$.
Limitation: A point estimate does not provide information about its accuracy or reliability.
Confidence Interval
A confidence interval (CI) is a range of values, derived from sample statistics, that is likely to contain the population parameter.
Definition: An interval estimate that gives both lower and upper bounds for the parameter.
Interpretation: Helps quantify uncertainty and express how confident we are in our estimate.
Example: A 95% confidence interval for a population mean might be (100, 110).
Confidence Level
The confidence level is the probability $1 - \alpha$ that the confidence interval actually contains the population parameter, assuming the estimation process is repeated a large number of times.
Common Levels: 90%, 95%, 99%.
Interpretation: "We are 95% confident that the interval from X to Y contains the true population mean."
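Misconception: It is incorrect to say "There is a 95% chance that the true parameter lies in this interval" after the interval is calculated. The parameter is fixed, so a specific interval either contains it or does not.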
Margin of Error
The margin of error (E) quantifies the maximum likely difference between the sample statistic and the true population parameter.
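Formula for Population Proportion: $E = z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}}$, where $\hat{p}$ is the sample proportion, $\hat{q} = 1 - \hat{p}$, $n$ is the sample size, and $z_{\alpha/2}$ is the critical value.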
Application: Used to construct confidence intervals.
Critical Value
A critical value is a z-score (or t-score) that separates values of the sample statistic that are "likely" to occur from those that are "unlikely" to occur.
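Notation: $z_{\alpha/2}$ for the standard normal distribution.
Relation to Confidence Level: confidence level $= 1 - \alpha$, so $z_{\alpha/2}$ cuts off an area of $\alpha/2$ in the right tail of the standard normal distribution.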
| Confidence Level | $\alpha$ | Critical Value $z_{\alpha/2}$ |
|---|---|---|
| 90% | 0.10 | 1.645 |
| 95% | 0.05 | 1.96 |
| 99% | 0.01 | 2.575 |
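The table values can be reproduced from the inverse standard normal CDF. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is an illustrative assumption, not something defined in the notes.

```python
from statistics import NormalDist

def z_critical(confidence_level: float) -> float:
    """Return z_{alpha/2} for a two-sided interval (hypothetical helper)."""
    alpha = 1.0 - confidence_level
    # z_{alpha/2} leaves an area of alpha/2 in the right tail,
    # so it is the (1 - alpha/2) quantile of the standard normal.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```

The 99% value computes to about 2.576; the 2.575 in the table is the value commonly read from printed z tables.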
Estimating Population Proportion
Confidence Interval for Population Proportion
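The confidence interval for a population proportion is given by:
$$\hat{p} - E < p < \hat{p} + E$$
where $E = z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}}$ and $\hat{q} = 1 - \hat{p}$.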
Example
In a study of 1,211 randomly selected medical malpractice lawsuits, it was found that 70% of them were dropped or dismissed. The point estimate of the proportion is $\hat{p} = 0.70$.
Construct a 99% confidence interval: Use the formula above with $n = 1211$, $\hat{q} = 0.30$, and $z_{\alpha/2} = 2.575$.
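The arithmetic for this example can be checked with a few lines of Python. This is a sketch only: the variable names are illustrative, and the critical value 2.575 is taken from the table in these notes rather than computed.

```python
from math import sqrt

n = 1211        # sample size from the example
p_hat = 0.70    # sample proportion from the example
z = 2.575       # critical value for 99% confidence (table value)

q_hat = 1 - p_hat
# Margin of error: E = z * sqrt(p_hat * q_hat / n)
margin = z * sqrt(p_hat * q_hat / n)
lower, upper = p_hat - margin, p_hat + margin

print(f"E = {margin:.3f}")
print(f"{lower:.3f} < p < {upper:.3f}")
```

With these inputs the margin of error is about 0.034, giving an interval of roughly 0.666 < p < 0.734.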
Sample Size Determination
Sample Size Requirements for Proportion
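If $\hat{p}$ is known: $n = \dfrac{[z_{\alpha/2}]^2\,\hat{p}\hat{q}}{E^2}$
If $\hat{p}$ is unknown: $n = \dfrac{[z_{\alpha/2}]^2 \cdot 0.25}{E^2}$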
Always round up to the nearest whole number.
Sample Size Example
Suppose you want to determine the percentage of adults in Oklahoma who have a cell phone but no landline. If no prior estimate is available, use $\hat{p} = \hat{q} = 0.5$ (so $\hat{p}\hat{q} = 0.25$) for maximum sample size.
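The notes do not give a margin of error or confidence level for this example, so the sketch below assumes hypothetical targets (within 3 percentage points at 95% confidence) purely for illustration. The function name `sample_size_proportion` is also an assumption.

```python
from math import ceil
from statistics import NormalDist

def sample_size_proportion(margin, confidence_level, p_hat=None):
    """Minimum n to estimate a proportion within `margin` (hypothetical helper)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # With no prior estimate, p_hat * q_hat is replaced by its maximum, 0.25.
    pq = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    # Always round up so the margin of error is not exceeded.
    return ceil(z**2 * pq / margin**2)

# Assumed targets, not from the notes: E = 0.03, 95% confidence.
print(sample_size_proportion(0.03, 0.95))
```

Supplying a prior estimate such as `p_hat=0.7` gives a smaller required sample, which is why 0.25 is described as the conservative choice.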
Estimating the Population Mean
Case 1: Population Standard Deviation ($\sigma$) is Known
Population must be normally distributed or $n > 30$.
Margin of error: $E = z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$
Confidence interval: $\bar{x} - E < \mu < \bar{x} + E$
Sample Size for Mean
$$n = \left[\dfrac{z_{\alpha/2}\,\sigma}{E}\right]^2$$
Round up to the nearest whole number.
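As a rough illustration of the known-$\sigma$ case, the sketch below computes a z-based interval and a required sample size. All input numbers and both function names are hypothetical; none of them come from the notes.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_interval_mean(x_bar, sigma, n, confidence_level):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sigma / sqrt(n)          # E = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

def sample_size_mean(sigma, margin, confidence_level):
    """Minimum n to estimate the mean within `margin`, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return ceil((z * sigma / margin) ** 2)

# Hypothetical inputs: x_bar = 105, sigma = 15, n = 36, 95% confidence.
low, high = z_interval_mean(105, 15, 36, 0.95)
print(f"{low:.1f} < mu < {high:.1f}")

# Hypothetical target: estimate the mean within 3 units at 95% confidence.
print(sample_size_mean(15, 3, 0.95))
```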
Case 2: Population Standard Deviation ($\sigma$) is Unknown
Use the Student t distribution instead of the standard normal distribution.
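Margin of error: $E = t_{\alpha/2}\dfrac{s}{\sqrt{n}}$, with $n - 1$ degrees of freedom
Confidence interval: $\bar{x} - E < \mu < \bar{x} + E$
$s$ is the sample standard deviation.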
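Python's standard library has no Student t quantile function, so the sketch below takes the critical value $t_{\alpha/2}$ as an input read from a t table. The sample values are hypothetical, and `t_interval_mean` is an illustrative name.

```python
from math import sqrt

def t_interval_mean(x_bar, s, n, t_critical):
    """CI for the mean with sigma unknown.

    t_critical is t_{alpha/2} with n - 1 degrees of freedom,
    looked up in a t table or computed with statistical software.
    """
    margin = t_critical * s / sqrt(n)     # E = t * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample: n = 25, x_bar = 50, s = 10.
# For 95% confidence and df = 24, a t table gives t = 2.064.
low, high = t_interval_mean(50, 10, 25, 2.064)
print(f"{low:.3f} < mu < {high:.3f}")
```

Because 2.064 is larger than the corresponding z value of 1.96, the interval is wider than a z-based interval built from the same numbers.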
Student t Distribution
Symmetric, bell-shaped, but wider than the standard normal distribution.
Shape depends on degrees of freedom ($df = n - 1$).
Approaches the normal distribution as $n$ increases.
When to Use z vs. t
Use z if $\sigma$ is known and population is normal or $n > 30$.
Use t if $\sigma$ is unknown and population is normal or $n > 30$.
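The two rules above can be written as a small decision function. This is only a sketch of the rule of thumb as stated in these notes; the function name is hypothetical, and returning `None` for the uncovered case is an assumption, since the notes do not say what to do there.

```python
def choose_distribution(sigma_known: bool, population_normal: bool, n: int):
    """Return "z", "t", or None following the rule of thumb in these notes."""
    # Both cases require a normal population or a sample larger than 30.
    if not (population_normal or n > 30):
        return None  # outside the cases covered by these notes
    return "z" if sigma_known else "t"

print(choose_distribution(sigma_known=True, population_normal=False, n=40))
print(choose_distribution(sigma_known=False, population_normal=True, n=12))
```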
Estimating Population Variance and Standard Deviation
Chi-Square Distribution
Used for estimating population variance ($\sigma^2$) and standard deviation ($\sigma$).
Formula: $\chi^2 = \dfrac{(n-1)s^2}{\sigma^2}$
Degrees of freedom: $df = n - 1$
Distribution is not symmetric; only takes non-negative values.
Confidence Interval for Population Variance
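$$\dfrac{(n-1)s^2}{\chi^2_R} < \sigma^2 < \dfrac{(n-1)s^2}{\chi^2_L}$$
$\chi^2_R$ and $\chi^2_L$ are right and left critical values for the chi-square distribution.
Confidence Interval for Population Standard Deviation
$$\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$$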
Example
To design theater seats, a sample of sitting heights (in mm) of adult women is obtained. Construct a 95% confidence interval for the standard deviation of sitting heights.
Technology in Estimation
Statistical software and online calculators can be used to compute confidence intervals and sample sizes for proportions, means, and standard deviations.
Inputs typically include sample size, sample mean/proportion, standard deviation, and confidence level.
Summary Table: Distributions Used in Estimation
| Parameter | Distribution | When Used |
|---|---|---|
| Proportion ($p$) | Normal (z) | Large samples, $\hat{p}$ known/unknown |
| Mean ($\mu$) | Normal (z) or Student t | z: $\sigma$ known; t: $\sigma$ unknown |
| Variance ($\sigma^2$) | Chi-square | Population normal, small or large samples |
Groupwork Example
The level of nicotine (in mg) was measured in 25 menthol cigarettes. The resulting data had a standard deviation of 0.38 mg. Find the degrees of freedom, the left and right critical values, and an 80% confidence interval estimate of the standard deviation of nicotine amounts.
Degrees of freedom: $df = n - 1 = 24$
Critical values: Use chi-square tables for 80% confidence and $df = 24$, with an area of 0.10 in each tail.
Confidence interval: Apply formulas for standard deviation above.
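The groupwork steps can be sketched in Python as follows. The two critical values are hardcoded as read from a standard chi-square table for $df = 24$ with 0.10 in each tail; they are an assumption added here, not values given in the notes.

```python
from math import sqrt

n = 25       # number of cigarettes measured
s = 0.38     # sample standard deviation in mg
df = n - 1   # degrees of freedom

# Chi-square table values for df = 24 (assumed, read from a table):
chi2_left = 15.659    # area of 0.10 to the left
chi2_right = 33.196   # area of 0.10 to the right

# The right critical value gives the lower limit, the left gives the upper.
lower = sqrt(df * s**2 / chi2_right)
upper = sqrt(df * s**2 / chi2_left)

print(f"df = {df}")
print(f"{lower:.2f} mg < sigma < {upper:.2f} mg")
```

With these table values, the 80% interval works out to roughly 0.32 mg < $\sigma$ < 0.47 mg.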
Additional info: The notes reference Statdisk.com, a statistical software tool commonly used in classrooms, for technology-based calculations.