Sampling Distributions, Central Limit Theorem, and Confidence Intervals in Statistics

Mean Sampling Distribution

Introduction to Sampling Distributions

Sampling distributions are fundamental in statistics for understanding how sample statistics (such as the mean) behave when drawn repeatedly from a population. The sampling distribution of the mean is the probability distribution of all possible sample means for samples of a given size from a population.

Definition: The sampling distribution of the mean is the distribution of sample means from all possible samples of a fixed size.
Key Point: While a single sample mean can vary, the distribution of many sample means tends to be consistent and predictable.
Example: Estimating the average number of pets owned by households by taking multiple samples and comparing the distribution of sample means.

Distribution of Random Variable vs. Distribution of Sample Means

Comparing the distribution of a random variable to the distribution of sample means helps illustrate the concept of sampling variability.

Random Variable Distribution: Shows the spread of individual data points in a sample.
Sample Means Distribution: Shows the spread of means from multiple samples, which is typically less variable than the distribution of individual data points.

Sample means from repeated samples (Sample 1, Sample 2, Sample 3, and so on): 4.05, 2.15, 2.30, 2.00, 3.10

Central Limit Theorem (CLT)

The Central Limit Theorem is a cornerstone of inferential statistics. It states that, for a random variable X with a finite mean and standard deviation, as the sample size n increases, the sampling distribution of the sample mean approaches a normal distribution, regardless of the shape of the population distribution.

Key Point: The larger the sample size, the closer the sampling distribution of the mean is to normal.
Visual Example: Histograms show that as n increases (n = 5, 30, 100), the distribution of sample means becomes more normal.

Formula:

For a population with mean $\mu$ and standard deviation $\sigma$, the sampling distribution of the mean for samples of size n has:

Mean: $\mu_{\bar{x}} = \mu$
Standard deviation (Standard Error): $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
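
The two formulas above can be checked with a small simulation. The sketch below assumes a fair six-sided die as the population (mean 3.5, standard deviation $\sqrt{35/12}$); the function name `simulate_sample_means`, the seed, and the number of repetitions are illustrative choices, not part of the original notes.

```python
import random
import statistics

# Population parameters for one fair six-sided die (assumed population).
MU = 3.5
SIGMA = (35 / 12) ** 0.5


def simulate_sample_means(n, num_samples=5000, seed=0):
    """Draw num_samples samples of n die rolls and return their sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.randint(1, 6) for _ in range(n))
        for _ in range(num_samples)
    ]


for n in (5, 30, 100):
    means = simulate_sample_means(n)
    print(
        f"n={n:3d}  mean of sample means={statistics.fmean(means):.3f}  "
        f"SD of sample means={statistics.pstdev(means):.3f}  "
        f"sigma/sqrt(n)={SIGMA / n ** 0.5:.3f}"
    )
```

In each row, the simulated mean of the sample means should sit near 3.5, and the simulated standard deviation should sit near $\sigma/\sqrt{n}$, shrinking as n grows.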

Practice Applications

Identifying sample size n from a given scenario.
Using the CLT to explain why increasing the sample size brings the sampling distribution closer to normal.

Central Limit Theorem: Calculations and Probabilities

Using CLT for Probability Calculations

Once the sampling distribution is approximately normal, we can use z-scores to calculate probabilities for sample means.

Z-score Formula: $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
Application: Find the probability that a sample mean falls below or above a certain value.
Example: Rolling a die 30 times, repeating the process 50 times, and calculating the probability that the sample mean is less than 2.5.
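
The die example can be worked through in a few lines. This is a minimal sketch that assumes a fair six-sided die (so $\mu = 3.5$ and $\sigma = \sqrt{35/12}$) and that n = 30 is large enough for the normal approximation; the helper name `prob_sample_mean_below` is hypothetical.

```python
from statistics import NormalDist


def prob_sample_mean_below(x_bar, mu, sigma, n):
    """P(sample mean < x_bar) under the CLT normal approximation."""
    standard_error = sigma / n ** 0.5
    z = (x_bar - mu) / standard_error
    return NormalDist().cdf(z)


# Fair six-sided die (assumed): mu = 3.5, sigma = sqrt(35/12), about 1.708.
mu = 3.5
sigma = (35 / 12) ** 0.5

p = prob_sample_mean_below(2.5, mu, sigma, n=30)
print(f"P(sample mean < 2.5) is about {p:.5f}")
```

The z-score here is about -3.21, so the probability is well under 0.1%. For a probability above a value, subtract the result from 1.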

Practice Problems

Given $\mu$, $\sigma$, and n = 60, find the probability that the sample mean $\bar{x}$ falls below or above a given value.
Given a movie rating scenario, use the z-score formula to find the probability that the average rating from a sample exceeds a certain value.

Using the Normal Distribution to Approximate Binomial Probabilities

Normal Approximation to the Binomial

When the number of trials n is large and both np and nq are at least 5 (where q = 1 - p is the probability of failure), the binomial distribution can be approximated by a normal distribution.

Binomial Formula: $P(X = x) = \binom{n}{x} p^{x} q^{n-x}$
Normal Approximation: $X \approx N(\mu, \sigma)$, where $\mu = np$ and $\sigma = \sqrt{npq}$
Continuity Correction: When using the normal approximation, adjust for discrete data by adding or subtracting 0.5.
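
As a reference for the continuity correction, the standard adjustments can be written out as follows. Here X is the binomial count, Y is the approximating normal variable, and a is a whole number; the symbols Y and a are introduced for illustration and do not come from the original notes.

```latex
\begin{aligned}
P(X = a)   &\approx P(a - 0.5 < Y < a + 0.5) \\
P(X \le a) &\approx P(Y < a + 0.5) \\
P(X < a)   &\approx P(Y < a - 0.5) \\
P(X \ge a) &\approx P(Y > a - 0.5) \\
P(X > a)   &\approx P(Y > a + 0.5)
\end{aligned}
```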

Example Applications

Estimating the probability that more than 60 out of 100 voters support a candidate with p = 0.56.
Calculating probabilities for proportions in consumer preference studies.
Using normal approximation for car recall probabilities in a sample of 76 cars.
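
The voter example (n = 100, p = 0.56, more than 60 supporters) can be sketched as below, comparing the normal approximation with continuity correction against the exact binomial sum. The function names are hypothetical, and "more than 60" is read as X ≥ 61.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_tail_exact(n, p, k):
    """Exact P(X > k) for X ~ Binomial(n, p)."""
    q = 1 - p
    return sum(comb(n, x) * p**x * q ** (n - x) for x in range(k + 1, n + 1))


def binomial_tail_normal(n, p, k):
    """Normal approximation to P(X > k) with continuity correction."""
    q = 1 - p
    if n * p < 5 or n * q < 5:
        raise ValueError("np and nq should both be at least 5")
    mu = n * p
    sigma = sqrt(n * p * q)
    # P(X > k) = P(X >= k + 1), so the normal cutoff moves to k + 0.5.
    return 1 - NormalDist(mu, sigma).cdf(k + 0.5)


n, p, k = 100, 0.56, 60
exact = binomial_tail_exact(n, p, k)
approx = binomial_tail_normal(n, p, k)
print(f"exact binomial:       {exact:.4f}")
print(f"normal approximation: {approx:.4f}")
```

With $\mu = 56$ and $\sigma = \sqrt{24.64} \approx 4.96$, the corrected cutoff 60.5 gives a z-score of about 0.91, so the approximate probability is roughly 0.18.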

Sampling Distribution of Sample Proportion

Distribution of Sample Proportion

For binomial variables, the sample proportion $\hat{p} = x/n$ is the ratio of the number of successes x to the total number of trials n. The sampling distribution of $\hat{p}$ is approximately normal when both np and nq are at least 5.

Mean: $\mu_{\hat{p}} = p$
Standard Deviation: $\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$
Example: Flipping a coin 10 times, calculating the mean and standard deviation of the sample proportion of heads.
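
For the coin example, a short sketch of the calculation follows. It assumes a fair coin (p = 0.5); the function name `sample_proportion_distribution` is hypothetical.

```python
from math import sqrt


def sample_proportion_distribution(n, p):
    """Return (mean, standard deviation, normal_ok) for the sample proportion."""
    q = 1 - p
    mean = p
    std_dev = sqrt(p * q / n)
    # Rule of thumb from the notes: both np and nq at least 5.
    normal_ok = n * p >= 5 and n * q >= 5
    return mean, std_dev, normal_ok


mean, std_dev, normal_ok = sample_proportion_distribution(n=10, p=0.5)
print(f"mean of p-hat = {mean}")
print(f"standard deviation of p-hat = {std_dev:.4f}")
print(f"normal approximation reasonable: {normal_ok}")
```

With n = 10 and p = 0.5, np = nq = 5, so the rule of thumb is only just satisfied, and the standard deviation of $\hat{p}$ is $\sqrt{0.025} \approx 0.158$.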

Constructing Confidence Intervals for Proportions

Confidence Intervals for Proportions

Confidence intervals estimate the range in which the true population proportion is likely to fall, based on sample data.

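Margin of Error Formula: $E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$, where $\hat{q} = 1 - \hat{p}$
Confidence Interval: $\hat{p} - E < p < \hat{p} + E$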
Steps to Construct a Confidence Interval:
1. Verify number of successes and failures are at least 5.
2. Find the critical value $z_{\alpha/2}$.
3. Calculate the margin of error.
4. Find the upper and lower bounds.
Example: From a survey of 200 people, 90 preferred Brand A. Construct a 90% confidence interval for the true proportion.

Step	Description
1	Verify # of successes ≥ 5 and # of failures ≥ 5
2	Find critical value $z_{\alpha/2}$
3	Margin of error: $E = z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$
4	Find upper & lower bounds: $\hat{p} - E$ and $\hat{p} + E$
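
The four steps can be sketched in code using the Brand A survey (90 of 200 respondents, 90% confidence). This is an illustrative sketch, not the notes' own procedure: the function name `proportion_confidence_interval` is hypothetical, and the critical value is computed from the standard normal distribution instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence):
    """Return (lower, upper, margin) for a population proportion."""
    # Step 1: verify at least 5 successes and 5 failures.
    failures = n - successes
    if successes < 5 or failures < 5:
        raise ValueError("need at least 5 successes and 5 failures")

    p_hat = successes / n
    q_hat = 1 - p_hat

    # Step 2: critical value z_{alpha/2} from the standard normal distribution.
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)

    # Step 3: margin of error.
    margin = z_crit * sqrt(p_hat * q_hat / n)

    # Step 4: lower and upper bounds.
    return p_hat - margin, p_hat + margin, margin


lower, upper, margin = proportion_confidence_interval(90, 200, 0.90)
print(f"p-hat = {90 / 200:.3f}, margin of error = {margin:.4f}")
print(f"90% confidence interval: ({lower:.4f}, {upper:.4f})")
```

Here $\hat{p} = 0.45$ and $z_{\alpha/2} \approx 1.645$, giving a margin of error of about 0.058 and an interval of roughly 0.392 to 0.508. Passing `confidence=0.98` instead uses a critical value of about 2.326.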

Practice Applications

Calculating the margin of error for a 98% confidence interval for class attendance.
Constructing a 98% confidence interval for the true proportion of time a student is late.