Sampling Distributions: Concepts, Properties, and Applications

Study Guide - Smart Notes


Chapter 7: Sampling Distributions

Introduction to Sampling Distributions

Sampling distributions are a foundational concept in statistics, describing the behavior of sample statistics (such as means or proportions) when samples are repeatedly drawn from a population. Understanding sampling distributions is essential for statistical inference, allowing us to estimate population parameters and quantify uncertainty.

Probability Distributions

Definition and Examples

Probability Distribution: A probability distribution lists all possible outcomes of a random process and their associated probabilities.
Population Distribution: Represents the distribution of all subjects in the population. For example, the distribution of heights among British males or the waiting time in a random process.
Example: In an election, if 538 out of 1000 subjects vote for candidate A, the probability distribution for voting is:

Outcome x	P(x)
1 (Vote for A)	0.538
0 (Not vote for A)	0.462
Total	1.000

Continuous Probability Distributions: These are represented by curves, such as the normal distribution for heights or the exponential distribution for waiting times.

Sampling Distributions

Concept and Importance

Sampling Distribution: The probability distribution of a statistic (e.g., sample mean, sample proportion) computed from a random sample of size n from a population.
Each sample drawn from the population yields a different value of the statistic, resulting in a distribution of possible values.
Sampling distributions allow us to assess the variability and reliability of sample statistics.
Statistical Inference: The process of using sample data to estimate population parameters (unknown values).

Random Samples and Sample Proportions

When many random samples of size n are taken from a population, the sample proportion (denoted as p̂) varies from sample to sample.

Trial	Outcome
p̂1	0.512
p̂2	0.489
p̂3	0.519
...	...
p̂1000	0.492

The sample proportion is a random variable because its value depends on the sample selected.
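The variation described above can be reproduced with a small simulation. This is only a sketch under stated assumptions: the function name `simulate_sample_proportions` is made up for illustration, the population proportion p = 0.538 is borrowed from the election example, and n = 1000 is an assumed sample size.

```python
import random

def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples random samples of size n and return each sample proportion."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    proportions = []
    for _ in range(num_samples):
        # Each subject is a "success" (e.g., votes for A) with probability p.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions

# Assumed values: p = 0.538 (election example), n = 1000, 1000 repeated samples.
p_hats = simulate_sample_proportions(p=0.538, n=1000, num_samples=1000)
print(p_hats[:4])                  # the first few sample proportions differ
print(sum(p_hats) / len(p_hats))   # their average lands close to p
```

Each run of the inner loop plays the role of one trial in the table, so the list `p_hats` is an empirical picture of the sampling distribution.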

Properties of Sampling Distributions

The sampling distribution of a statistic describes the values the statistic can take when random samples of a given size are drawn from the population.
Sampling distributions tell us how likely it is to observe a particular value of the statistic.
Key Point: The properties of sampling distributions hold only for random, representative samples. Convenience samples may lead to biased inferences.

Sampling Distribution of the Sample Proportion (p̂)

Mean and Standard Deviation

For a sample of size n from a population with proportion of interest p, the sampling distribution for the sample proportion has:

Mean: $\mu_{\hat{p}} = p$
Standard Deviation: $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$
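As a rough numerical illustration, the sketch below evaluates the mean p and the standard deviation sqrt(p(1 - p)/n) of the sampling distribution of p̂. The helper name `proportion_sampling_distribution` is hypothetical, and the inputs p = 0.538 and n = 1000 are assumptions taken from the election example rather than values given in this section.

```python
import math

def proportion_sampling_distribution(p, n):
    """Return (mean, standard deviation) of the sampling distribution of p-hat."""
    mean = p                          # the distribution is centered at p
    sd = math.sqrt(p * (1 - p) / n)   # spread shrinks as n grows
    return mean, sd

# Assumed inputs: p = 0.538, n = 1000.
mean, sd = proportion_sampling_distribution(p=0.538, n=1000)
print(mean, round(sd, 4))
```

With these assumed inputs the standard deviation comes out to roughly 0.016, so sample proportions would typically land within a couple of percentage points of p.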

Example: If and , then:

Normal Approximation

For large enough n, the sampling distribution of p̂ approximately follows a normal distribution centered at the true population proportion.
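General Form: $\hat{p} \approx N\!\left(p,\ \sqrt{\dfrac{p(1-p)}{n}}\right)$, where the second argument is the standard deviation.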
Example: For , ,
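One way to use the normal approximation is to compute the probability of seeing a sample proportion beyond some cutoff. The sketch below relies on `statistics.NormalDist` from the Python standard library. The helper name `p_hat_normal_approx`, the inputs p = 0.50 and n = 100, and the cutoff 0.60 are all assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def p_hat_normal_approx(p, n):
    """Normal approximation to the sampling distribution of p-hat."""
    return NormalDist(mu=p, sigma=math.sqrt(p * (1 - p) / n))

# Assumed inputs: p = 0.50, n = 100, so the standard deviation is 0.05.
dist = p_hat_normal_approx(p=0.50, n=100)

# Approximate probability that a sample proportion is 0.60 or higher,
# i.e. at least two standard deviations above p.
prob_at_least_060 = 1 - dist.cdf(0.60)
print(round(prob_at_least_060, 4))
```

Under these assumptions the probability is a little over 2%, which matches the idea that values more than two standard deviations above the center are unusual.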

Effect of Sample Size on Sampling Distribution

Variability and Centering

As sample size increases, the sampling distribution becomes less variable (smaller standard deviation).
Sample statistics (such as p̂) are centered around the true population parameter.

Illustrative Example

Drawing 100 samples of size 100 from a population with $p = 0.50$ yields sample proportions that vary but center around 0.50.
Drawing 1000 samples of size 1015 from the same population yields sample proportions with much less variability.
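The two scenarios above can be approximated with a short simulation, assuming a population proportion of p = 0.50. The function name `sample_proportions` and the seed are arbitrary choices for this sketch, and the exact numbers printed will depend on the seed.

```python
import math
import random
import statistics

def sample_proportions(p, n, num_samples, rng):
    """Return num_samples sample proportions, each from a random sample of size n."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(num_samples)]

rng = random.Random(1)  # fixed seed for reproducibility
small = sample_proportions(0.50, n=100, num_samples=100, rng=rng)
large = sample_proportions(0.50, n=1015, num_samples=1000, rng=rng)

for label, n, props in [("n = 100", 100, small), ("n = 1015", 1015, large)]:
    observed_sd = statistics.stdev(props)        # spread seen in the simulation
    theoretical_sd = math.sqrt(0.5 * 0.5 / n)    # sqrt(p(1 - p)/n)
    print(label, round(statistics.fmean(props), 3),
          round(observed_sd, 4), round(theoretical_sd, 4))
```

Both sets of proportions should center near 0.50, while the spread for n = 1015 should be roughly a third of the spread for n = 100.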

Unbiased Estimators and the 95% Rule

Unbiasedness

The sample proportion p̂ is an unbiased estimator of p because $E(\hat{p}) = p$.
The expected value of an unbiased estimator equals the true population parameter.
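Unbiasedness can be checked exactly for small cases. The sketch below assumes the number of successes in a random sample follows a binomial distribution, and it sums (k/n) times the probability of k successes. The function name `expected_p_hat` and the inputs are illustrative assumptions.

```python
from math import comb

def expected_p_hat(p, n):
    """Exact expected value of p-hat when the success count is Binomial(n, p)."""
    return sum(
        (k / n) * comb(n, k) * p**k * (1 - p) ** (n - k)  # value times probability
        for k in range(n + 1)
    )

# Assumed inputs for illustration.
print(expected_p_hat(0.538, 50))
print(expected_p_hat(0.2, 10))
```

Up to floating-point rounding, each result should equal the p that was passed in, whatever the sample size.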

Confidence Intervals and the 95% Rule

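If the sampling distribution of a statistic is approximately normal, the statistic is likely to fall close to the population parameter, so an interval built around the observed statistic is likely to contain the parameter.
95% Rule: Approximately 95% of sample statistics fall within two standard deviations of the mean of the sampling distribution, which equals the population parameter when the estimator is unbiased.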
Example: For , , the interval is
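As a rough illustration of the 95% rule for proportions, the sketch below computes p ± 2 standard deviations and the share of a normal distribution lying within two standard deviations of its mean. The helper name `two_sd_interval` is hypothetical, and the inputs p = 0.50 and n = 1015 are assumptions borrowed from the sample-size example, not the values of the example above.

```python
import math
from statistics import NormalDist

def two_sd_interval(p, n):
    """Interval p +/- 2 standard deviations of the sampling distribution of p-hat."""
    sd = math.sqrt(p * (1 - p) / n)
    return p - 2 * sd, p + 2 * sd

# Assumed inputs: p = 0.50, n = 1015.
low, high = two_sd_interval(p=0.50, n=1015)
print(round(low, 4), round(high, 4))

# Share of a normal distribution within two standard deviations of its mean.
standard_normal = NormalDist()
coverage = standard_normal.cdf(2) - standard_normal.cdf(-2)
print(round(coverage, 4))
```

Under these assumptions the interval runs from about 0.47 to about 0.53, and the normal coverage is slightly above 95%, which is why the rule is stated as approximate.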

Sampling Distribution of the Sample Mean (X̄)

Mean and Standard Error

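The mean of the sampling distribution of the sample mean is equal to the population mean ($\mu$).
The standard deviation of the sampling distribution (standard error) is: $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$, where $\sigma$ is the population standard deviation and n is the sample size.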
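The standard error of the sample mean, the population standard deviation divided by the square root of n, is simple to compute directly. In the sketch below, the function name `standard_error` and the population standard deviation of 10 are assumptions picked to keep the arithmetic easy to follow.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sampling distribution of the sample mean."""
    return sigma / math.sqrt(n)

# Assumed population standard deviation of 10; each step quadruples n.
for n in (25, 100, 400):
    print(n, standard_error(sigma=10, n=n))
```

Because n sits under a square root, quadrupling the sample size only halves the standard error (2.0, then 1.0, then 0.5 here).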

Central Limit Theorem (CLT)

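The CLT states that for a random sample of size n from a population with mean $\mu$ and standard deviation $\sigma$, as n increases, the sampling distribution of the sample mean approaches a normal distribution, regardless of the population's shape.
General Rule: For large n, $\bar{X} \approx N\!\left(\mu,\ \dfrac{\sigma}{\sqrt{n}}\right)$, where the second argument is the standard deviation (standard error).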
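The CLT can be explored by simulation with a clearly non-normal population. The sketch below assumes an exponential population with rate 1 (so the population mean and standard deviation are both 1), a sample size of n = 50, and 2000 repeated samples. These choices and the helper name `sample_means` are illustrative only.

```python
import math
import random
import statistics

def sample_means(draw, n, num_samples):
    """Return num_samples sample means, each from n independent calls to draw()."""
    return [statistics.fmean(draw() for _ in range(n)) for _ in range(num_samples)]

rng = random.Random(42)  # fixed seed for reproducibility
mu, sigma, n = 1.0, 1.0, 50  # exponential(rate=1): mean 1, sd 1, right-skewed

means = sample_means(lambda: rng.expovariate(1.0), n=n, num_samples=2000)
se = sigma / math.sqrt(n)

print(round(statistics.fmean(means), 3))   # should be close to mu
print(round(statistics.stdev(means), 3))   # should be close to se
# Fraction of sample means within two standard errors of mu.
within_2se = sum(abs(m - mu) <= 2 * se for m in means) / len(means)
print(round(within_2se, 3))
```

Even though individual observations are strongly skewed, the sample means should cluster around the population mean with spread near the standard error, and roughly 95% of them should fall within two standard errors.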

Examples

Example 1: The average time for a 100-meter freestyle is 55 seconds (), , . Interval with 68% probability:
Example 2: For roller coasters, , , .
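For intervals like the 68% one in Example 1, a sample mean falls within one standard error of the population mean about 68% of the time when the sampling distribution is approximately normal. The sketch below uses the mean of 55 seconds from Example 1, but the standard deviation of 5 and the sample size of 25 are hypothetical values, not the ones from the original example. The function name `interval_68` is also made up.

```python
import math

def interval_68(mu, sigma, n):
    """Interval mu +/- one standard error of the sample mean."""
    se = sigma / math.sqrt(n)
    return mu - se, mu + se

# mu = 55 comes from Example 1; sigma = 5 and n = 25 are hypothetical.
low, high = interval_68(mu=55, sigma=5, n=25)
print(low, high)
```

With these hypothetical inputs the standard error is 1 second, so the interval runs from 54 to 56 seconds.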

Summary Table: Key Formulas

Statistic	Mean	Standard Deviation (Error)
Sample Proportion (p̂)	$p$	$\sqrt{p(1-p)/n}$
Sample Mean (X̄)	$\mu$	$\sigma/\sqrt{n}$

Assumptions and Conditions

Sampling distribution properties hold only for random, representative samples.
If assumptions are violated (e.g., non-random samples), results may be biased and not generalizable.

Applications and Practice Problems

Estimating proportions of students attending games, voting in elections, or other population parameters using random samples.
Calculating probabilities and confidence intervals for sample statistics using the normal approximation and standard error formulas.
