Estimation and the Central Limit Theorem for Proportion & Mean
Study Guide - Smart Notes
Estimation in Statistics
Population Parameters and Sample Statistics
Estimation is the process of inferring the value of a population parameter from a sample statistic. The estimate can be a single value (point estimate) or a range of plausible values (interval estimate, such as a confidence interval).
Population Parameter: A numerical value that describes a characteristic of a population (e.g., mean, proportion).
Sample Statistic: A numerical value calculated from sample data, used to estimate the population parameter.
Point Estimate, Sampling Error, and Bias
Point Estimate: A single value calculated from sample data to estimate a population parameter.
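Sampling Error: The difference between a sample statistic and the population parameter that arises by chance because only a sample, not the whole population, is observed. It is the reason estimates vary from sample to sample.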
Bias: A systematic tendency to over- or under-estimate the true population value.
Estimating Proportion and Mean
p: Population proportion
p̂: Sample proportion
The sample proportion p̂ is the best point estimate of the population proportion p.
The sample mean x̄ is the best point estimate of the population mean μ.
Example: Estimating Proportion
Suppose a study about global warming finds that 70% of 1501 randomly selected adults in Canada believe in global warming. The sample proportion is p̂ = 0.70, which is the best point estimate of the population proportion p.
Tabular Data: Sample Proportion and Variance
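The following table lists all nine possible samples of size 2 drawn with replacement from the population {10, 30, 48}, together with the sample proportions, range, median, mean, variance, and standard deviation of each sample. The last row averages each column over the nine samples.
| Sample | Proportion of 10 | Proportion of 30 | Proportion of 48 | Range | Median | Mean | Variance s² | Standard Deviation s |
|---|---|---|---|---|---|---|---|---|
| 10, 10 | 1 | 0 | 0 | 0 | 10 | 10 | 0 | 0 |
| 10, 30 | 1/2 | 1/2 | 0 | 20 | 20 | 20 | 200 | 14.1 |
| 10, 48 | 1/2 | 0 | 1/2 | 38 | 29 | 29 | 722 | 26.9 |
| 30, 10 | 1/2 | 1/2 | 0 | 20 | 20 | 20 | 200 | 14.1 |
| 30, 30 | 0 | 1 | 0 | 0 | 30 | 30 | 0 | 0 |
| 30, 48 | 0 | 1/2 | 1/2 | 18 | 39 | 39 | 162 | 12.7 |
| 48, 10 | 1/2 | 0 | 1/2 | 38 | 29 | 29 | 722 | 26.9 |
| 48, 30 | 0 | 1/2 | 1/2 | 18 | 39 | 39 | 162 | 12.7 |
| 48, 48 | 0 | 0 | 1 | 0 | 48 | 48 | 0 | 0 |
| Mean | 1/3 | 1/3 | 1/3 | 16.9 | 29.3 | 29.3 | 240.9 | 11.9 |
Additional info: Table entries are computed from the stated population, assuming sampling with replacement and the sample formulas (divisor n − 1) for variance and standard deviation. For comparison, the population values are: proportion 1/3 for each value, range 38, median 30, mean 29.3, variance σ² = 240.9, and standard deviation σ = 15.5. The column averages for the proportions, mean, and variance match the population values; those for the range, median, and standard deviation do not.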
Unbiased Estimators
Definition and Properties
An estimator is a statistic used to estimate the value of a population parameter. An unbiased estimator is a statistic whose sampling distribution has a mean equal to the corresponding population parameter.
Sample mean x̄ is an unbiased estimator of population mean μ.
Sample proportion p̂ is an unbiased estimator of population proportion p.
Sample variance s² is an unbiased estimator of population variance σ².
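The three claims above can be checked by brute force on the small population 10, 30, 48 from the table example. The sketch below assumes sampling with replacement (so there are 9 ordered samples of size 2) and uses the divisor n − 1 for the sample variance; the variable names are illustrative only.

```python
from itertools import product
from statistics import mean, pstdev, pvariance, stdev, variance

population = [10, 30, 48]

# Every ordered sample of size 2, drawn with replacement (assumption).
samples = list(product(population, repeat=2))

# Average each statistic over all possible samples.
mean_of_means = mean([mean(s) for s in samples])
mean_of_props = mean([s.count(10) / 2 for s in samples])  # proportion of 10s
mean_of_vars = mean([variance(s) for s in samples])       # divisor n - 1
mean_of_sds = mean([stdev(s) for s in samples])

print(f"mean:       population {mean(population):.2f}, average over samples {mean_of_means:.2f}")
print(f"proportion: population {1 / 3:.3f}, average over samples {mean_of_props:.3f}")
print(f"variance:   population {pvariance(population):.2f}, average over samples {mean_of_vars:.2f}")
print(f"std dev:    population {pstdev(population):.2f}, average over samples {mean_of_sds:.2f}")
```

The first three pairs agree, which is what unbiasedness means. The last pair does not: the sample standard deviation s tends to underestimate σ even though s² is unbiased for σ².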
Central Limit Theorem (CLT)
CLT for Sample Means
The Central Limit Theorem states that the sampling distribution of the sample mean x̄ approaches a normal distribution as the sample size n increases, regardless of the population's distribution.
For samples of size n > 30 (a common rule of thumb), the sampling distribution of the sample mean can be approximated by a normal distribution. If the population itself is normal, this holds for any sample size.
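The average (mean) of the sample means is: $\mu_{\bar{x}} = \mu$
The standard error (standard deviation) of the sample means is: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
Probability that the sample mean x̄ is less than x: $P(\bar{x} < x) = P\left(Z < \dfrac{x - \mu}{\sigma/\sqrt{n}}\right)$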
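A short derivation may help explain why the standard error shrinks with √n. It assumes the n observations X_1, ..., X_n are independent draws from a population with mean μ and finite variance σ²; the symbols X_i are introduced here for illustration and do not appear in the notes.

```latex
\begin{aligned}
\bar{x} &= \frac{1}{n}\sum_{i=1}^{n} X_i \\
E[\bar{x}] &= \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{1}{n}\cdot n\mu = \mu \\
\operatorname{Var}(\bar{x}) &= \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i) = \frac{n\sigma^{2}}{n^{2}} = \frac{\sigma^{2}}{n} \\
\sigma_{\bar{x}} &= \sqrt{\operatorname{Var}(\bar{x})} = \frac{\sigma}{\sqrt{n}}
\end{aligned}
```

The second line holds for any sample size, so the sample mean is centered at μ even when n is small. Independence is what allows the variances to add in the third line.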
CLT for Sample Proportion (Success-Failure Condition)
When observations are independent and the sample size is sufficiently large (np ≥ 10 and nq ≥ 10, where q = 1 − p), the sampling distribution of the sample proportion p̂ can be approximated by a normal distribution.
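The average (mean) of the sample proportions is: $\mu_{\hat{p}} = p$
The standard error (SE) of the sample proportions is: $SE = \sqrt{\dfrac{pq}{n}}$
Probability that the sample proportion p̂ is less than a given value x: $P(\hat{p} < x) = P\left(Z < \dfrac{x - p}{\sqrt{pq/n}}\right)$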
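The proportion case can be seen as a special case of the sample mean, which is where the standard error formula comes from. The sketch below assumes each observation is coded as an indicator X_i equal to 1 for a success and 0 for a failure; this coding is an illustrative device, not notation from the notes.

```latex
\begin{aligned}
\hat{p} &= \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad X_i \in \{0, 1\},\quad P(X_i = 1) = p \\
E[X_i] &= p, \qquad \operatorname{Var}(X_i) = p(1 - p) = pq \\
E[\hat{p}] &= p, \qquad \operatorname{Var}(\hat{p}) = \frac{pq}{n}, \qquad SE = \sqrt{\frac{pq}{n}}
\end{aligned}
```

Since p̂ is a mean of 0/1 values with population standard deviation √(pq), substituting σ = √(pq) into σ/√n gives the same SE.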
Sampling Distribution of Sample Means and Proportions
The distribution of sample means (or proportions) with all samples of the same size n taken from the same population will be approximately normal if n is large enough.
Visualizing the CLT
Regardless of the original population distribution (normal, uniform, skewed), the distribution of sample means becomes more normal as sample size increases.
Diagram Description:
Original population: can be normal, uniform, or skewed.
Sample means (n = 10): distribution starts to look more normal.
Sample means (n = 30): distribution is even closer to normal.
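The progression described above can be reproduced with a small simulation. This is a sketch under stated assumptions: the population is taken to be exponential with mean 1 (a right-skewed choice made here for illustration), and skewness is used as a rough numerical stand-in for "how normal the histogram looks" (a normal distribution has skewness 0). The function names are hypothetical.

```python
import random
from statistics import fmean, pstdev


def skewness(values):
    """Mean cubed z-score; close to 0 for a symmetric distribution."""
    m = fmean(values)
    s = pstdev(values)
    return fmean([((v - m) / s) ** 3 for v in values])


def simulate_sample_means(n, reps=5000, seed=1):
    """Draw `reps` samples of size n from an exponential(1) population
    and return the list of their sample means."""
    rng = random.Random(seed)
    return [fmean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]


for n in (1, 10, 30):
    means = simulate_sample_means(n)
    print(f"n = {n:2d}: mean of sample means = {fmean(means):.3f}, "
          f"SD of sample means = {pstdev(means):.3f}, "
          f"skewness = {skewness(means):.2f}")
```

With n = 1 the "sample means" are just the raw skewed population. As n grows, the skewness should move toward 0 and the spread should shrink roughly like 1/√n, while the center stays near the population mean of 1.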
Applications of the CLT
Probability Calculations for Sample Means
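If the sample mean x̄ is (approximately) normally distributed with mean μ and standard error σ/√n, where σ is the population standard deviation and n is the sample size, the probability that the mean of a sample is less than x is: $P(\bar{x} < x) = P\left(Z < \dfrac{x - \mu}{\sigma/\sqrt{n}}\right)$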
Example:
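Assume adult males have resting pulse rates that are normally distributed with mean μ = 78 bpm and standard deviation σ = 13.5 bpm. For a random sample of 10 adult males, what is the probability that the mean resting pulse rate is over 83 bpm?
Because the population itself is normal, the sample mean is normally distributed even though n = 10 is small.
Standard error: $\sigma_{\bar{x}} = 13.5/\sqrt{10} \approx 4.27$
z-score: $z = (83 - 78)/4.27 \approx 1.17$
Probability: $P(\bar{x} > 83) = P(Z > 1.17) \approx 0.121$
The probability that the mean pulse rate of subjects in this sample is higher than 83 bpm is about 12.1%.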
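The pulse-rate result can be verified numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (available since Python 3.8) in place of a z-table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 78.0      # population mean pulse rate (bpm)
sigma = 13.5   # population standard deviation (bpm)
n = 10         # sample size

# Standard error of the sample mean: sigma / sqrt(n).
se = sigma / sqrt(n)

# Sampling distribution of the sample mean: normal with mean mu and SD se.
sampling_dist = NormalDist(mu=mu, sigma=se)

# P(sample mean > 83) is the upper tail beyond 83.
z = (83 - mu) / se
p_over_83 = 1 - sampling_dist.cdf(83)

print(f"SE = {se:.3f}, z = {z:.3f}, P(mean > 83) = {p_over_83:.4f}")
```

The key step is passing the standard error, not σ, as the spread of the sampling distribution. Using σ = 13.5 directly would answer a different question: the probability that one individual's pulse rate exceeds 83 bpm.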
Probability Calculations for Sample Proportions
If p̂ is (approximately) normally distributed with mean p and standard error SE = √(pq/n) for sample size n, the probability that the sample proportion is less than a given value x is: $P(\hat{p} < x) = P\left(Z < \dfrac{x - p}{SE}\right)$
Example:
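Suppose the proportion of American adults who support the expansion of solar energy is p = 0.88. For a random sample of 1000 adults, what is the probability that the sample proportion is within 2 percentage points of p, that is, between 0.86 and 0.90?
Check the success-failure condition: np = 880 ≥ 10 and nq = 120 ≥ 10.
Calculate mean and standard error: $\mu_{\hat{p}} = 0.88$, $SE = \sqrt{0.88 \cdot 0.12 / 1000} \approx 0.0103 \approx 0.01$
Find probability for $0.86 < \hat{p} < 0.90$: with SE rounded to 0.01, $z = \pm 0.02/0.01 = \pm 2$, so $P(0.86 < \hat{p} < 0.90) = P(-2 < Z < 2) \approx 0.9544$
Therefore, about 95.44% of the sample proportions are within 2 percentage points of the population proportion p = 0.88. (With the unrounded SE of 0.0103, z ≈ ±1.95 and the probability is about 0.948.)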
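The solar energy example can be checked numerically as follows. The sketch computes two closely related quantities: the probability of landing within 2 standard errors of p, which is where the 95.44% figure comes from, and the probability of landing within exactly 0.02 of p using the unrounded standard error. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p = 0.88   # population proportion
n = 1000   # sample size
q = 1 - p

# Success-failure condition for the normal approximation.
condition_met = n * p >= 10 and n * q >= 10

# Standard error of the sample proportion: sqrt(p * q / n).
se = sqrt(p * q / n)
sampling_dist = NormalDist(mu=p, sigma=se)

# Probability that p-hat falls within 2 standard errors of p.
prob_within_2se = sampling_dist.cdf(p + 2 * se) - sampling_dist.cdf(p - 2 * se)

# Probability that p-hat falls between 0.86 and 0.90 (unrounded SE).
prob_within_2pts = sampling_dist.cdf(0.90) - sampling_dist.cdf(0.86)

print(f"condition met: {condition_met}, SE = {se:.4f}")
print(f"P(within 2 SE of p)    = {prob_within_2se:.4f}")
print(f"P(0.86 < p-hat < 0.90) = {prob_within_2pts:.4f}")
```

The two probabilities differ slightly because 2 SE is about 0.0206, not exactly 0.02. Rounding SE to 0.01 makes them coincide.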
Summary of Key Formulas
Central Limit Theorem (CLT) for Sample Mean
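For samples of the same size n (n > 30, or any n if the population is normal):
Average (mean) of sample means: $\mu_{\bar{x}} = \mu$
Standard error: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
Probability: $P(\bar{x} < x) = P\left(Z < \dfrac{x - \mu}{\sigma/\sqrt{n}}\right)$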
Central Limit Theorem (CLT) for Sample Proportion
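If the sample is large enough (np ≥ 10 and nq ≥ 10, where q = 1 − p):
Average (mean) of sample proportions: $\mu_{\hat{p}} = p$
Standard error: $SE = \sqrt{\dfrac{pq}{n}}$
Probability: $P(\hat{p} < x) = P\left(Z < \dfrac{x - p}{\sqrt{pq/n}}\right)$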
Important Notes
The shape and spread of a sampling distribution depend on the underlying distribution of the population, the statistic being considered, the sampling procedure employed, and the sample size used.
The distribution of sample proportions is centered at p for any sample size; it is approximately normal if np ≥ 10 and nq ≥ 10.
The distribution of sample means is centered at μ for any sample size; it is approximately normal if n > 30, or for any n when the population itself is normal.