Sampling Distributions, Central Limit Theorem, and Confidence Intervals for Proportions


Mean Sampling Distribution

Introduction to Sampling Distributions

In statistics, the sampling distribution of a statistic is the probability distribution of that statistic across all possible random samples of a given size from a population. The sampling distribution of the sample mean, \( \bar{X} \), is especially important for making inferences about population means.

Sample Mean: The mean calculated from a single sample varies from sample to sample, but the distribution of sample means from many samples forms a predictable pattern.
Sampling Distribution of \( \bar{X} \): The distribution of the sample mean from all possible samples of a given size from a population.

Example: To estimate the average number of pets owned by households in America, a researcher can take multiple samples of 30 Americans, calculate the mean for each sample, and analyze the distribution of these sample means. This approach provides a better approximation of the population mean than a single sample.

Properties of the Sampling Distribution of the Mean

The mean of the sampling distribution of \( \bar{X} \) is equal to the population mean \( \mu \).
The standard deviation of the sampling distribution (standard error) is \( \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \), where \( \sigma \) is the population standard deviation and \( n \) is the sample size.
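
Both properties can be checked by enumerating every possible sample from a small population. The sketch below assumes a fair six-sided die as the population and samples of size \( n = 2 \) drawn with replacement. The population and the variable names are illustrative choices, not part of the original notes.

```python
from itertools import product
from math import sqrt
from statistics import fmean, pstdev

# Assumed population for illustration: the six faces of a fair die.
population = [1, 2, 3, 4, 5, 6]
n = 2  # sample size, drawn with replacement

mu = fmean(population)      # population mean
sigma = pstdev(population)  # population standard deviation

# Every possible ordered sample of size n; each one is equally likely.
sample_means = [fmean(sample) for sample in product(population, repeat=n)]

mean_of_means = fmean(sample_means)
sd_of_means = pstdev(sample_means)

print(f"mu = {mu:.4f}, mean of sample means = {mean_of_means:.4f}")
print(f"sigma/sqrt(n) = {sigma / sqrt(n):.4f}, SD of sample means = {sd_of_means:.4f}")
```

Because all 36 samples are listed, the two printed pairs agree exactly (up to rounding) rather than only approximately, which is what the two properties claim.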

Central Limit Theorem (CLT)

The Central Limit Theorem states that for any random variable \( X \) with mean \( \mu \) and standard deviation \( \sigma \), as the sample size \( n \) increases, the sampling distribution of \( \bar{X} \) approaches a normal distribution, regardless of the shape of the population distribution.

For small \( n \), the sampling distribution may not be normal.
For large \( n \) (commonly \( n \geq 30 \)), the sampling distribution is approximately normal.
If the population is already normal, the sampling distribution of \( \bar{X} \) is normal for any \( n \).

Formula for Standard Error: \( \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \)

Example: If you roll a die 30 times and repeat this process 50 times, the distribution of the sample means will be approximately normal, even though the distribution of a single die roll is uniform.
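
The die-rolling example can be tried as a small simulation. This is a rough sketch: the seed is an arbitrary choice made only so the run is repeatable, and with just 50 samples the bell shape will be visible only loosely.

```python
import random
from statistics import fmean, pstdev

random.seed(0)  # arbitrary fixed seed so the run is repeatable

n = 30            # die rolls per sample
num_samples = 50  # how many times the 30-roll experiment is repeated

# One sample mean per repetition of the experiment.
sample_means = [
    fmean([random.randint(1, 6) for _ in range(n)])
    for _ in range(num_samples)
]

print(f"mean of sample means: {fmean(sample_means):.3f} (population mean is 3.5)")
print(f"SD of sample means:   {pstdev(sample_means):.3f} (sigma/sqrt(n) is about 0.312)")
```

Increasing `num_samples` should make a histogram of `sample_means` look more clearly bell-shaped, centered near 3.5.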

Finding Probabilities Using the Sampling Distribution

To find the probability that a sample mean falls within a certain range, convert each endpoint of the range to a z-score:

\( z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \)

Use z-tables or calculators to find probabilities based on the calculated z-score.
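
As a worked illustration, the sketch below finds the probability that a sample mean falls between two values, using \( z = (\bar{x} - \mu) / (\sigma / \sqrt{n}) \). The numbers (\( \mu = 100 \), \( \sigma = 15 \), \( n = 36 \), range 98 to 103) are hypothetical and chosen only to keep the arithmetic clean. Python's `statistics.NormalDist` stands in for the z-table.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical values, chosen only for illustration.
mu, sigma, n = 100, 15, 36
low, high = 98, 103

standard_error = sigma / sqrt(n)  # 15 / 6 = 2.5


def z_score(x_bar):
    """Return z = (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / standard_error


standard_normal = NormalDist()  # mean 0, standard deviation 1

# P(low < X-bar < high) = Phi(z_high) - Phi(z_low)
probability = standard_normal.cdf(z_score(high)) - standard_normal.cdf(z_score(low))

print(f"z_low = {z_score(low):.2f}, z_high = {z_score(high):.2f}")
print(f"P({low} < sample mean < {high}) = {probability:.4f}")
```

Here the z-scores are -0.80 and 1.20, and the area between them is roughly 0.67.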

Using the Normal Distribution to Approximate Binomial Probabilities

Normal Approximation to the Binomial

For a binomial distribution with parameters \( n \) (number of trials) and \( p \) (probability of success), the distribution of the number of successes \( X \) can be approximated by a normal distribution when both \( np \geq 5 \) and \( nq \geq 5 \) (where \( q = 1 - p \)).

Mean: \( \mu = np \)
Standard deviation: \( \sigma = \sqrt{npq} \)

To use the normal approximation:

Check that \( np \geq 5 \) and \( nq \geq 5 \).
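Apply a continuity correction: because the binomial is discrete and the normal is continuous, move each boundary by 0.5 so that the whole-number values in the binomial event stay inside the normal region (for example, \( P(X \geq 61) \) becomes \( P(X > 60.5) \)).
Calculate the z-score: \( z = \frac{x - \mu}{\sigma} = \frac{x - np}{\sqrt{npq}} \), where \( x \) is the continuity-corrected boundary.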

Example: If 56% of voters support a candidate, use the normal approximation to estimate the probability that more than 60 out of 100 randomly selected voters support the candidate.
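
The voter example can be worked through as follows. This is a sketch of the normal approximation with a continuity correction, with an exact binomial sum added only for comparison; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.56
q = 1 - p

# Step 1: check the conditions for the normal approximation.
if n * p < 5 or n * q < 5:
    raise ValueError("normal approximation conditions not met")

# Mean and standard deviation of the binomial count X.
mu = n * p             # 56
sigma = sqrt(n * p * q)

# Step 2: "more than 60" means X >= 61, so the corrected boundary is 60.5.
boundary = 60.5

# Step 3: z-score and upper-tail probability.
z = (boundary - mu) / sigma
approx = 1 - NormalDist().cdf(z)

# Exact binomial probability P(X >= 61), for comparison only.
exact = sum(comb(n, k) * p**k * q**(n - k) for k in range(61, n + 1))

print(f"z = {z:.3f}")
print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```

The z-score is about 0.91, giving an approximate probability of about 0.18 that more than 60 of the 100 voters support the candidate.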

Continuity Correction Table

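Binomial	Normal
\( P(X = c) \)	\( P(c - 0.5 < X < c + 0.5) \)
\( P(X \geq c) \)	\( P(X > c - 0.5) \)
\( P(X > c) \)	\( P(X > c + 0.5) \)
\( P(X \leq c) \)	\( P(X < c + 0.5) \)
\( P(X < c) \)	\( P(X < c - 0.5) \)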

Sampling Distribution of Sample Proportion

Sample Proportion and Its Distribution

For a binomial experiment, the sample proportion \( \hat{p} \) is the proportion of successes in the sample. The sampling distribution of \( \hat{p} \) describes the distribution of sample proportions from all possible samples of a given size.

Mean: \( \mu_{\hat{p}} = p \)
Standard deviation: \( \sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} \)
The sampling distribution of \( \hat{p} \) is approximately normal if \( np \geq 5 \) and \( nq \geq 5 \).

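Example: If you flip a fair coin 10 times, record the proportion of heads, and repeat this process many times, the recorded proportions cluster around \( p = 0.5 \). Here \( np = nq = 5 \), so the normality condition is only just met; with more flips per sample, the distribution of \( \hat{p} \) follows a normal curve more closely.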
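
For the coin-flip case, the mean and standard deviation of \( \hat{p} \) can be computed exactly from the binomial probabilities and compared with \( p \) and \( \sqrt{pq/n} \). The sketch below assumes a fair coin (\( p = 0.5 \)) and \( n = 10 \) flips; the variable names are illustrative.

```python
from math import comb, sqrt

n, p = 10, 0.5  # 10 flips of a fair coin (assumed fair for illustration)
q = 1 - p

# Exact probability of each possible sample proportion k/n.
pmf = {k / n: comb(n, k) * p**k * q**(n - k) for k in range(n + 1)}

# Mean and standard deviation of p-hat computed directly from its distribution.
mean_p_hat = sum(value * prob for value, prob in pmf.items())
var_p_hat = sum((value - mean_p_hat) ** 2 * prob for value, prob in pmf.items())
sd_p_hat = sqrt(var_p_hat)

print(f"mean of p-hat = {mean_p_hat:.4f} (formula: p = {p})")
print(f"SD of p-hat   = {sd_p_hat:.4f} (formula: sqrt(pq/n) = {sqrt(p * q / n):.4f})")
print(f"np = {n * p}, nq = {n * q}")
```

The directly computed values match the formulas, and the last line shows that \( np \) and \( nq \) both equal 5, the minimum the rule of thumb allows.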

Constructing Confidence Intervals for Proportions

Confidence Intervals for a Population Proportion

A confidence interval estimates the range in which the true population proportion is likely to fall, based on a sample proportion and a specified confidence level.

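Margin of error: \( E = z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}} \), where \( \hat{q} = 1 - \hat{p} \)
Confidence interval: \( \hat{p} - E < p < \hat{p} + E \)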

Steps to Construct a Confidence Interval:

Verify that the numbers of successes and failures in the sample are both at least 5 (\( n\hat{p} \geq 5 \) and \( n\hat{q} \geq 5 \)).
Find the critical value \( z_{\alpha/2} \) for the desired confidence level.
Calculate the margin of error \( E \).
Find the lower and upper bounds: \( \hat{p} - E \) and \( \hat{p} + E \).

Example: From a survey of 200 people, 90 preferred computers from Brand A. To construct a 90% confidence interval for the true proportion, calculate the sample proportion, find the critical value, compute the margin of error, and determine the interval.
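
The Brand A example can be carried through numerically. The sketch below follows the steps listed above, using `statistics.NormalDist().inv_cdf` in place of a z-table to get the critical value; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, successes = 200, 90
confidence = 0.90

# Sample proportion and its complement.
p_hat = successes / n  # 0.45
q_hat = 1 - p_hat      # 0.55

# Step 1: at least 5 successes and 5 failures.
if successes < 5 or n - successes < 5:
    raise ValueError("too few successes or failures for the normal approximation")

# Step 2: critical value z_{alpha/2}; for 90% confidence this is about 1.645.
alpha = 1 - confidence
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

# Step 3: margin of error E = z * sqrt(p_hat * q_hat / n).
margin = z_critical * sqrt(p_hat * q_hat / n)

# Step 4: lower and upper bounds.
lower, upper = p_hat - margin, p_hat + margin

print(f"p_hat = {p_hat:.3f}, z = {z_critical:.3f}, E = {margin:.4f}")
print(f"90% confidence interval: ({lower:.3f}, {upper:.3f})")
```

With these numbers the margin of error is about 0.058, so the 90% interval runs from roughly 0.392 to 0.508.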

Summary Table: Key Formulas

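Concept	Formula
Standard error of mean	\( \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \)
Standard error of proportion	\( \sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} \)
Z-score for sample mean	\( z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \)
Z-score for sample proportion	\( z = \frac{\hat{p} - p}{\sqrt{pq/n}} \)
Margin of error (proportion)	\( E = z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}} \)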

Additional info: The Central Limit Theorem is foundational for inferential statistics, allowing the use of normal probability methods even when the population distribution is unknown, provided the sample size is sufficiently large. Continuity correction is necessary when using the normal approximation for discrete distributions like the binomial.