Sampling Distributions and the Central Limit Theorem

Study Guide - Smart Notes

Sampling Distributions

Definition and Properties

A sampling distribution is the probability distribution of a sample statistic, such as the sample mean, formed when random samples of size n are repeatedly taken from a population. Each sample statistic (mean, proportion, etc.) has its own sampling distribution.

Sample Mean ($ \overline{x} $): The average value calculated from a sample.
Sampling Distribution of the Sample Mean: The distribution of all possible sample means from samples of size n drawn from the population.

To construct a sampling distribution:

Take a sample of size n from the population and calculate the sample mean.
Repeat this process many times, recording each sample mean.
The collection of these means forms the sampling distribution of the sample mean.
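The three steps above can be sketched as a small simulation. This is only an illustration under stated assumptions: the population (10,000 uniform values), the sample size of 25, and the number of repetitions are all hypothetical choices, not values from these notes.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population, chosen only for illustration.
population = [random.uniform(0, 100) for _ in range(10_000)]
n = 25               # sample size (assumed)
num_samples = 5_000  # number of repeated samples (assumed)

sample_means = []
for _ in range(num_samples):
    # Step 1: take a sample of size n (with replacement) and compute its mean.
    sample = random.choices(population, k=n)
    # Step 2: record the sample mean, then repeat.
    sample_means.append(statistics.fmean(sample))

# Step 3: the recorded means approximate the sampling distribution.
pop_mean = statistics.fmean(population)
mean_of_means = statistics.fmean(sample_means)
print(f"population mean:      {pop_mean:.3f}")
print(f"mean of sample means: {mean_of_means:.3f}")
```

With enough repetitions, the two printed values should be close, which is the behavior the properties below describe.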

Mean of the Sampling Distribution ($ \mu_{\overline{x}} $): Equal to the population mean ($ \mu $).
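Standard Deviation of the Sampling Distribution (Standard Error, $ \sigma_{\overline{x}} $): Equal to the population standard deviation divided by the square root of the sample size: $ \sigma_{\overline{x}} = \frac{\sigma}{\sqrt{n}} $
Variance of the Sampling Distribution: Equal to the population variance divided by the sample size: $ \text{Var}(\overline{x}) = \frac{\sigma^2}{n} $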

Example: Sampling Distribution Construction

Suppose the population consists of the values 1, 3, and 5. We take samples of size 3 with replacement.

There are $3^3 = 27$ possible ordered samples, each equally likely.
For each sample, calculate the sample mean.
Tabulate the frequency and probability of each unique sample mean.

Sample Mean ($ \overline{x} $)	Frequency (f)	Probability (P)
1 (3/3)	1	1/27 ≈ 0.0370
5/3	3	3/27 ≈ 0.1111
7/3	6	6/27 ≈ 0.2222
3 (9/3)	7	7/27 ≈ 0.2593
11/3	6	6/27 ≈ 0.2222
13/3	3	3/27 ≈ 0.1111
5 (15/3)	1	1/27 ≈ 0.0370
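The table can be reproduced by listing every ordered sample. The sketch below uses only the Python standard library; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [1, 3, 5]
n = 3

# All ordered samples of size n drawn with replacement: 3**3 = 27 of them.
samples = list(product(population, repeat=n))

# Count how often each sample mean occurs, keeping the means as exact fractions.
frequency = Counter(Fraction(sum(s), n) for s in samples)

for mean in sorted(frequency):
    f = frequency[mean]
    print(f"{str(mean):>5}  f={f}  P={f}/{len(samples)} = {f / len(samples):.4f}")
```

Keeping the means as `Fraction` objects avoids floating-point keys such as 1.6666666666666667, so equal means are always grouped together.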

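The mean of the sampling distribution is 3 (same as the population mean).
The variance of the sampling distribution is $ 8/9 \approx 0.8889 $, and its standard deviation is $ \sqrt{8/9} \approx 0.9428 $.
These values confirm the theoretical properties, using the population values $ \mu = 3 $ and $ \sigma^2 = 8/3 $: $ \mu_{\overline{x}} = \mu = 3 $ and $ \sigma_{\overline{x}} = \frac{\sigma}{\sqrt{n}} = \frac{\sqrt{8/3}}{\sqrt{3}} \approx 0.9428 $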

Example Application: This process verifies that the mean of the sampling distribution equals the population mean, and the standard deviation equals the population standard deviation divided by the square root of the sample size.
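As a check on this example, the mean and variance can be computed directly from the tabulated frequencies and compared with the population values. This is a minimal sketch; the frequencies are copied from the table, and exact fractions are used so the comparison involves no rounding.

```python
from fractions import Fraction
from math import sqrt

population = [1, 3, 5]
n = 3

# Frequencies copied from the table: sample mean -> number of samples (out of 27).
table = {
    Fraction(3, 3): 1,
    Fraction(5, 3): 3,
    Fraction(7, 3): 6,
    Fraction(9, 3): 7,
    Fraction(11, 3): 6,
    Fraction(13, 3): 3,
    Fraction(15, 3): 1,
}
dist = {x: Fraction(f, 27) for x, f in table.items()}

# Mean and variance of the sampling distribution.
mean_xbar = sum(x * p for x, p in dist.items())
var_xbar = sum((x - mean_xbar) ** 2 * p for x, p in dist.items())

# Population mean and variance (dividing by N, since this is the whole population).
mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

print("mean of sample means:", mean_xbar, "| population mean:", mu)
print("variance of sample means:", var_xbar, "| sigma^2 / n:", sigma2 / n)
print("standard error:", round(sqrt(var_xbar), 4))
```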

The Central Limit Theorem (CLT)

Statement and Interpretation

The Central Limit Theorem is a fundamental result in statistics describing the shape of the sampling distribution of the sample mean.

If random samples of size $ n \geq 30 $ (a common rule of thumb) are drawn from any population with mean $ \mu $ and finite standard deviation $ \sigma $, the sampling distribution of the sample mean is approximately normal.
The larger the sample size, the better the approximation.
If the population is normally distributed, the sampling distribution of the sample mean is normal for any sample size $ n $.

Key Properties:

Mean of the sampling distribution: $ \mu_{\overline{x}} = \mu $
Standard deviation (standard error): $ \sigma_{\overline{x}} = \frac{\sigma}{\sqrt{n}} $
Variance: $ \text{Var}(\overline{x}) = \frac{\sigma^2}{n} $
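A short simulation can illustrate these properties for a population that is clearly not normal. The sketch below assumes an exponential population with rate 1 (so $ \mu = 1 $ and $ \sigma = 1 $), a sample size of 40, and 20,000 repetitions; all three are hypothetical choices made for illustration.

```python
import random
import statistics
from math import sqrt

random.seed(1)  # fixed seed so the run is repeatable

mu = 1.0             # mean of an exponential population with rate 1
sigma = 1.0          # its standard deviation
n = 40               # sample size, above the n >= 30 guideline
num_samples = 20_000

sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

observed_mean = statistics.fmean(sample_means)
observed_sd = statistics.pstdev(sample_means)
theoretical_se = sigma / sqrt(n)

# For a normal curve, about 68% of values lie within one standard deviation of the mean.
within_one_se = sum(abs(m - mu) <= theoretical_se for m in sample_means) / num_samples

print(f"mean of sample means: {observed_mean:.4f} (theory: {mu})")
print(f"sd of sample means:   {observed_sd:.4f} (theory: {theoretical_se:.4f})")
print(f"share within one standard error: {within_one_se:.3f}")
```

Even though the exponential population is strongly skewed, the share of sample means within one standard error should come out near the 68% expected for a normal distribution.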

Example: Application of the CLT

Suppose the diameters of fully grown white oak trees are normally distributed with mean 3.5 feet and standard deviation 0.2 feet. Random samples of size 16 are drawn.

Mean of the sampling distribution: 3.5 feet
Standard deviation: $ 0.2 / \sqrt{16} = 0.05 $ feet
Since the population is normal, the sampling distribution is also normal, regardless of sample size.

Graphical Representation: The sampling distribution is a normal curve centered at 3.5, with standard deviations marked at intervals of 0.05 feet.
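The numbers in this example can be reproduced with `statistics.NormalDist` from the Python standard library. This is a sketch rather than part of the original example; the tick positions and the one-standard-error probability are added for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 3.5, 0.2, 16
standard_error = sigma / sqrt(n)  # 0.2 / 4 = 0.05

# Sampling distribution of the sample mean: normal, centered at mu.
sampling_dist = NormalDist(mu=mu, sigma=standard_error)

# Axis marks at one, two, and three standard errors on either side of the mean.
ticks = [round(mu + k * standard_error, 2) for k in range(-3, 4)]
print(ticks)

# Probability that a sample mean falls within one standard error of mu.
p_within_one = (
    sampling_dist.cdf(mu + standard_error) - sampling_dist.cdf(mu - standard_error)
)
print(f"P(3.45 < x-bar < 3.55) = {p_within_one:.4f}")
```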

Applying the Central Limit Theorem: Probability of a Sample Mean

Finding Probabilities Using the Sampling Distribution

To find the probability that a sample mean falls within a certain interval, convert the sample mean to a z-score using the sampling distribution's mean and standard deviation.

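Z-score formula for sample means: $ z = \frac{\overline{x} - \mu_{\overline{x}}}{\sigma_{\overline{x}}} = \frac{\overline{x} - \mu}{\sigma / \sqrt{n}} $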

This formula is similar to the z-score for individual values, but the denominator is the standard error.
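The difference between the two z-scores can be seen in a small sketch. The function names and the numbers used here ($ \mu = 100 $, $ \sigma = 15 $, $ n = 25 $) are hypothetical and chosen only to keep the arithmetic simple.

```python
from math import sqrt


def z_score_for_value(x, mu, sigma):
    """Standardize an individual value using the population standard deviation."""
    return (x - mu) / sigma


def z_score_for_mean(x_bar, mu, sigma, n):
    """Standardize a sample mean using the standard error sigma / sqrt(n)."""
    if n <= 0 or sigma <= 0:
        raise ValueError("n and sigma must be positive")
    standard_error = sigma / sqrt(n)
    return (x_bar - mu) / standard_error


# Hypothetical numbers: the same value 106 is far more unusual as a mean of 25
# observations than as a single observation.
z_single = z_score_for_value(106, mu=100, sigma=15)
z_mean = z_score_for_mean(106, mu=100, sigma=15, n=25)
print(z_single, z_mean)
```

Because the standard error shrinks as n grows, the same distance from $ \mu $ corresponds to a larger z-score for a sample mean than for an individual value.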

Example: Home Sales Price

Population mean ($ \mu $): $296,700
Population standard deviation ($ \sigma $): $50,000
Sample size ($ n $): 12
Find the probability that the sample mean is more than $275,000.

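Calculate the standard error: $ \sigma_{\overline{x}} = \frac{\sigma}{\sqrt{n}} = \frac{50,000}{\sqrt{12}} \approx 14,433.76 $

Compute the z-score for $ \overline{x} = 275,000 $: $ z = \frac{\overline{x} - \mu}{\sigma_{\overline{x}}} = \frac{275,000 - 296,700}{14,433.76} \approx -1.50 $

Find the probability that $ \overline{x} > 275,000 $:
Look up the cumulative probability for z = -1.50 (from standard normal tables): 0.0668
Since we want the probability above this value: $ P(\overline{x} > 275,000) = 1 - 0.0668 = 0.9332 $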

Interpretation: There is about a 93.32% chance that the mean sales price of a random sample of 12 homes exceeds $275,000.
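The same calculation can be checked in a few lines of Python. This sketch treats the sampling distribution as normal, as the example does, and uses `statistics.NormalDist` in place of a printed table. Without rounding z to two decimals, the probability comes out slightly higher than the table value (roughly 0.9336 rather than 0.9332).

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 296_700, 50_000, 12
threshold = 275_000

standard_error = sigma / sqrt(n)
z = (threshold - mu) / standard_error

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Probability using the unrounded z-score.
p_above = 1 - standard_normal.cdf(z)

# Probability using z rounded to two decimals, as a table lookup would.
p_above_table = 1 - standard_normal.cdf(round(z, 2))

print(f"standard error: {standard_error:,.2f}")
print(f"z-score:        {z:.4f}")
print(f"P(x-bar > 275,000), unrounded z: {p_above:.4f}")
print(f"P(x-bar > 275,000), z = -1.50:   {p_above_table:.4f}")
```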

Summary Table: Sampling Distribution Properties

Property	Formula	Description
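Mean	$ \mu_{\overline{x}} = \mu $	Equal to the population mean
Standard Deviation (Standard Error)	$ \sigma_{\overline{x}} = \frac{\sigma}{\sqrt{n}} $	Population standard deviation divided by square root of sample size
Variance	$ \text{Var}(\overline{x}) = \frac{\sigma^2}{n} $	Population variance divided by sample size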
Shape	Normal (if population normal or n large)	Sampling distribution approaches normality as n increases

Additional info: The Central Limit Theorem is foundational for inferential statistics, allowing us to make probability statements about sample means even when the population distribution is unknown, provided the sample size is sufficiently large.