Estimating Parameters and Sampling Distributions: Study Notes
Parameter Estimation and Sampling Distributions
Introduction
This study guide covers the foundational concepts of parameter estimation, sampling distributions, and the Central Limit Theorem, which are essential topics in college-level statistics. Understanding these topics enables students to make inferences about populations based on sample data and to quantify the uncertainty associated with such estimates.
Parameter Estimation
Definitions and Concepts
Parameter: A numerical value that describes a characteristic of a population (e.g., population mean μ, population proportion p). Parameters are typically unknown constants.
Statistic: A numerical value calculated from a sample, used to estimate a population parameter. Statistics are known but variable, as they change from sample to sample.
Point Estimation
We use sample statistics as point estimates for unknown population parameters.
Examples: Estimating the mean height of adult males in the US, or the proportion of trees affected by infestation in a forest.
Sampling variability: Sample statistics will vary from sample to sample.
Quantifying this variability allows us to estimate the margin of error associated with a point estimate.
Margin of Error
The margin of error quantifies the uncertainty in an estimate due to sampling variability.
Example: If a survey finds that 41% of young adults are affected by the recession, with a margin of error of ±2.9 percentage points, then we are 95% confident that the true proportion is between 38.1% and 43.9%.
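The interval in the survey example is just the point estimate plus and minus the margin of error. A minimal sketch of that arithmetic follows; the function name `interval_from_margin` is a hypothetical helper introduced here for illustration.

```python
def interval_from_margin(estimate, margin):
    """Return (lower, upper) for estimate ± margin, both in the same units."""
    return estimate - margin, estimate + margin

# Survey example from the notes: 41% with a margin of error of 2.9 percentage points.
lower, upper = interval_from_margin(41.0, 2.9)
print(f"{lower:.1f}% to {upper:.1f}%")
```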
Sampling Statistics
Random Variables and Probability Distributions
Statistics such as the sample mean \( \bar{X} \) are random variables because their values vary from sample to sample.
These statistics have probability distributions associated with them, known as sampling distributions.
Sampling Distribution
Definition
The sampling distribution of a statistic is the probability distribution of all possible values of the statistic computed from samples of a given size n.
The sampling distribution of the sample mean \( \bar{X} \) is the probability distribution of all possible sample means from samples of size n drawn from a population with mean μ and standard deviation σ.
Factors Affecting Sampling Distribution
Sample size (n): Larger samples tend to produce sampling distributions with less variability.
Sampling design: Whether samples are drawn with or without replacement, and whether order matters.
Procedure for Constructing a Sampling Distribution (Small N and n)
Specify the sample size n and the sampling design (e.g., simple random sampling with or without replacement).
List all possible samples of size n and their probabilities.
Compute the value of the sample statistic for each sample.
Compute the probability for each value of the statistic.
Example: Sampling Distribution of Sample Mean
Suppose a population consists of three values: 0, 6, and 9, each with probability 1/3. For samples of size 2 (with replacement, order matters):
| Sample | Value of \( \bar{X} \) | Probability |
|---|---|---|
| (0,0) | 0 | 1/9 |
| (0,6), (6,0) | 3 | 2/9 |
| (0,9), (9,0) | 4.5 | 2/9 |
| (6,6) | 6 | 1/9 |
| (6,9), (9,6) | 7.5 | 2/9 |
| (9,9) | 9 | 1/9 |
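The four-step procedure can be sketched in code for this small example. The snippet below enumerates every ordered sample of size 2 drawn with replacement from the population {0, 6, 9}, computes each sample mean, and accumulates the probabilities. The variable names are illustrative, and exact fractions are used so the results can be compared with the table directly.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

population = [0, 6, 9]  # each value has probability 1/3
n = 2                   # sample size

# Steps 1-2: ordered samples drawn with replacement are all equally likely.
samples = list(product(population, repeat=n))
p_sample = Fraction(1, len(samples))

# Steps 3-4: compute the mean of each sample and add up the probabilities per value.
sampling_dist = defaultdict(Fraction)
for sample in samples:
    mean = Fraction(sum(sample), n)
    sampling_dist[mean] += p_sample

for mean in sorted(sampling_dist):
    print(float(mean), sampling_dist[mean])
```

This brute-force enumeration is only practical for small populations and small n, since the number of ordered samples grows as N to the power n.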
Sampling Variability of a Statistic
Effect of Sample Size
As sample size n increases, the sampling variability (standard deviation of the sampling distribution) decreases.
Example: Sample means from samples of size 100 are less variable than those from samples of size 6.
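A quick simulation can make the effect of sample size concrete. The sketch below assumes a hypothetical population (rolls of a fair six-sided die, not a dataset from these notes) and compares the spread of simulated sample means for n = 6 and n = 100.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is reproducible

def sd_of_sample_means(n, reps=2000):
    """Standard deviation of `reps` simulated sample means, each from n die rolls."""
    means = [
        statistics.fmean(rng.randint(1, 6) for _ in range(n))
        for _ in range(reps)
    ]
    return statistics.stdev(means)

sd_small = sd_of_sample_means(6)
sd_large = sd_of_sample_means(100)
print(f"n=6:   sd of sample means = {sd_small:.3f}")
print(f"n=100: sd of sample means = {sd_large:.3f}")
```

The second value should come out noticeably smaller than the first, in line with the statement above.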
Properties of the Sample Mean
Unbiasedness and Standard Error
Unbiased estimator: The expected value of the sample mean equals the population mean: \( E(\bar{X}) = \mu_{\bar{X}} = \mu \)
Standard error (SE): The standard deviation of the sample mean is: \( \sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} \)
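Both properties, E(X̄) = μ and SE = σ/√n, can be checked exactly on the small population {0, 6, 9} used earlier, with samples of size 2 drawn with replacement. The following is an illustrative check, not a proof; the variable names are assumptions of this sketch.

```python
import math
from fractions import Fraction
from itertools import product

population = [0, 6, 9]  # equally likely values
n = 2

# Population mean and variance, kept as exact fractions.
mu = Fraction(sum(population), len(population))
var = sum((x - mu) ** 2 for x in population) / len(population)

# Means of all equally likely ordered samples of size n.
means = [Fraction(sum(s), n) for s in product(population, repeat=n)]
mean_of_means = sum(means) / len(means)
var_of_means = sum((m - mean_of_means) ** 2 for m in means) / len(means)

print(mean_of_means == mu)      # unbiasedness: E(X-bar) equals mu
print(var_of_means == var / n)  # Var(X-bar) equals sigma^2 / n
print(math.sqrt(var_of_means), math.sqrt(var) / math.sqrt(n))  # two ways to get the SE
```

Here μ = 5 and σ² = 14, so the standard error is √7, roughly 2.65.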
Distribution of Sample Mean: Normal Population
Example
If the population is normally distributed (e.g., weights of pennies with mean μ grams and standard deviation σ grams), the sampling distribution of the sample mean for samples of size n is also normal, for any n.
The mean of the sample means is equal to the population mean, and the standard error is smaller than the population standard deviation.
Increasing sample size further reduces the standard error.
Sampling from a Non-Normal Population
Simulation and Central Limit Theorem
Even if the population distribution is not normal (e.g., number of people in US households), the sampling distribution of the sample mean becomes approximately normal as sample size increases.
For small n, the distribution of sample means is skewed right; as n grows, it becomes less skewed; for sufficiently large n, it is approximately normal.
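Standard error calculations: for each sample size n, \( \sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} \), so the standard error shrinks as n grows.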
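The pattern described above can be reproduced with a small simulation. The sketch below uses an exponential distribution with mean 1 and standard deviation 1 as a stand-in for a right-skewed population (an assumption of this example, not the household-size data), and the sample sizes 2, 10, and 40 are likewise illustrative. For each n it reports the simulated standard deviation of the sample means, the theoretical σ/√n, and the skewness of the sample means.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is reproducible

def skewness(values):
    """Sample skewness: the average cubed z-score."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

def simulate_means(n, reps=5000):
    """Means of `reps` samples of size n from an exponential(rate=1) population."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]

results = {}
for n in (2, 10, 40):
    means = simulate_means(n)
    results[n] = (statistics.stdev(means), skewness(means))
    sd, skew = results[n]
    print(f"n={n:2d}  simulated SE={sd:.3f}  sigma/sqrt(n)={1 / n ** 0.5:.3f}  skewness={skew:.2f}")
```

The skewness should move toward 0 as n increases, while the simulated standard error tracks σ/√n.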
Central Limit Theorem (CLT)
Statement and Implications
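Central Limit Theorem: Regardless of the shape of the population (provided it has a finite mean μ and standard deviation σ), the sampling distribution of the sample mean becomes approximately normal as sample size n increases.
Symbolically: \( \bar{X} \approx N\!\left(\mu, \dfrac{\sigma}{\sqrt{n}}\right) \) (for large n), where the second argument is the standard deviation.
If the population distribution is unknown or not normal, the sample mean is approximately normal for sufficiently large n (a common rule of thumb is \( n \geq 30 \)).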
Summary Table: Shape, Center, and Spread of Sampling Distribution
| Shape, Center, and Spread of the Population | Distribution of the Sample Mean |
|---|---|
| Population is normal with mean μ and standard deviation σ | Shape: Normal. Center: \( \mu_{\bar{X}} = \mu \). Spread: \( \sigma_{\bar{X}} = \sigma/\sqrt{n} \) |
| Population is not normal with mean μ and standard deviation σ | Shape: Approximately normal for large n. Center: \( \mu_{\bar{X}} = \mu \). Spread: \( \sigma_{\bar{X}} = \sigma/\sqrt{n} \) |
Z-Transform of the Sample Mean
Standardization
To compute probabilities for the sample mean, use the Z-transform: \( Z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}} \)
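The standardization Z = (X̄ − μ) / (σ/√n) can be sketched in code, with the standard normal CDF from Python's `statistics.NormalDist` taking the place of a printed table. The function names and the numbers in the usage lines (μ = 50, σ = 10, n = 25) are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist

def z_score(x_bar, mu, sigma, n):
    """Standardize a sample mean: (x_bar - mu) divided by the standard error."""
    return (x_bar - mu) / (sigma / math.sqrt(n))

def prob_mean_at_most(x_bar, mu, sigma, n):
    """P(X-bar <= x_bar), assuming the sample mean is (approximately) normal."""
    return NormalDist().cdf(z_score(x_bar, mu, sigma, n))

# Hypothetical inputs: the standard error is 10 / sqrt(25) = 2, so z = (52 - 50) / 2.
z = z_score(52, mu=50, sigma=10, n=25)
p = prob_mean_at_most(52, mu=50, sigma=10, n=25)
print(z, round(p, 4))
```

Upper-tail and interval probabilities follow from the same function by taking complements or differences of CDF values.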
Worked Examples
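Sample Mean of a Normal Population
Given: population mean μ, population standard deviation σ, sample size n
Mean of sample mean: \( \mu_{\bar{X}} = \mu \)
Standard error: \( \sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} \)
Sampling distribution: \( \bar{X} \sim N\!\left(\mu, \dfrac{\sigma}{\sqrt{n}}\right) \)
Probability calculation: Standardize with the Z-transform, then use the standard normal table to find the probability.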
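As an illustration of these steps with hypothetical numbers (assumed for this sketch, not taken from the original example), suppose a normal population has μ = 50 and σ = 8, the sample size is n = 16, and we want the probability that the sample mean falls between 48 and 53:

```latex
\begin{align*}
\mu_{\bar{X}} &= \mu = 50, \qquad
\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{8}{\sqrt{16}} = 2 \\
P(48 \le \bar{X} \le 53)
  &= P\!\left(\frac{48 - 50}{2} \le Z \le \frac{53 - 50}{2}\right)
   = P(-1 \le Z \le 1.5) \\
  &= \Phi(1.5) - \Phi(-1) \approx 0.9332 - 0.1587 = 0.7745
\end{align*}
```

Here Φ denotes the standard normal cumulative distribution function, whose values are read from the table.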
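Sample Mean of an Unknown Population
Given: population mean μ, population standard deviation σ, and a large sample size n
Mean of sample mean: \( \mu_{\bar{X}} = \mu \)
Standard error: \( \sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} \)
By CLT, \( \bar{X} \approx N\!\left(\mu, \dfrac{\sigma}{\sqrt{n}}\right) \) for large n
Probability calculation: Standardize with the Z-transform, then use the standard normal table to find the approximate probability.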
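To see how good the CLT approximation is when the population is not normal, one can compare the normal-based probability with a simulated frequency. The sketch below assumes a hypothetical exponential population with mean 1 and standard deviation 1, a sample size of 50, and a threshold of 1.2; none of these values come from the original example.

```python
import math
import random
from statistics import NormalDist, fmean

rng = random.Random(1)  # fixed seed so the run is reproducible

# Hypothetical setup: exponential(rate=1) population, so mu = 1 and sigma = 1.
mu, sigma, n = 1.0, 1.0, 50
threshold = 1.2

# CLT approximation: X-bar is roughly normal with mean mu and sd sigma / sqrt(n).
se = sigma / math.sqrt(n)
clt_prob = 1 - NormalDist(mu, se).cdf(threshold)

# Simulated frequency of the event {X-bar > threshold}.
reps = 10000
hits = sum(
    fmean(rng.expovariate(1.0) for _ in range(n)) > threshold
    for _ in range(reps)
)
sim_prob = hits / reps

print(f"CLT approximation: {clt_prob:.4f}")
print(f"Simulated:         {sim_prob:.4f}")
```

The two numbers should be close but not identical: with a right-skewed population the normal approximation tends to slightly understate upper-tail probabilities at moderate n.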
Key Takeaways
Sample statistics are random variables with their own probability distributions.
The sample mean is an unbiased estimator of the population mean.
The standard error quantifies the variability of the sample mean.
The Central Limit Theorem ensures that, for large samples, the sampling distribution of the sample mean is approximately normal, regardless of the population's shape.