Chapter 10: Sampling Distributions
Introduction to Sampling Distributions
Sampling distributions are fundamental in statistics, especially for making inferences about populations based on sample data. This chapter explores how sample statistics, such as proportions and means, vary from sample to sample and how these variations can be modeled mathematically.
Sampling Distribution: The probability distribution of a statistic (such as a sample proportion or sample mean) across all possible random samples of a given size from the same population.
Importance: Allows estimation of population parameters and quantification of sampling variability.
Learning Objectives
Understand how variations among multiple samples are represented in a sampling distribution.
Calculate the mean and variance of the sampling distribution for proportions and means.
Apply these concepts to estimate population characteristics.
10.1 Modelling Sample Proportions
Understanding Sample Proportions
When repeatedly sampling from a population, the proportion of successes in each sample (denoted as p-hat, or \( \hat{p} \)) will vary. The distribution of these sample proportions forms the sampling distribution of the proportion.
Example: Drawing 1000 samples of size 100 from a population where 20% are under 15 years old (\( p = 0.2 \)).
Each sample yields a value of \( \hat{p} \), and the collection of these values forms a distribution.

Interpretation: A histogram of these 1000 sample proportions shows how they vary around the true population proportion \( p = 0.2 \).
Mean and Standard Deviation of Sample Proportions
Mean of \( \hat{p} \): The average of all sample proportions equals the true population proportion \( p \).
Standard Deviation of \( \hat{p} \): Measures the spread of sample proportions around the mean.
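The formulas are:
Mean: \( \mu(\hat{p}) = p \)
Standard Deviation: \( SD(\hat{p}) = \sqrt{\frac{pq}{n}} \), where \( q = 1 - p \) and \( n \) is the sample size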
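The claims above, that sample proportions centre on \( p \) with spread \( \sqrt{pq/n} \) (where \( q = 1 - p \)), can be checked by simulation. The sketch below reuses the numbers from the example (1000 samples of size 100, \( p = 0.2 \)); the function name `simulate_sample_proportions` and the fixed seed are illustrative choices, not part of the course material.

```python
import random
import statistics
from math import sqrt


def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples random samples of size n; return each sample's p-hat."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    proportions = []
    for _ in range(num_samples):
        # Each draw is a "success" with probability p.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions


p, n = 0.2, 100
p_hats = simulate_sample_proportions(p, n, num_samples=1000)

simulated_mean = statistics.mean(p_hats)
simulated_sd = statistics.pstdev(p_hats)
theoretical_sd = sqrt(p * (1 - p) / n)  # sqrt(0.2 * 0.8 / 100) = 0.04

print(f"mean of p-hat: {simulated_mean:.4f} (theory: {p})")
print(f"SD of p-hat:   {simulated_sd:.4f} (theory: {theoretical_sd:.4f})")
```

The simulated mean and standard deviation should land close to 0.2 and 0.04 respectively; they will not match exactly, because 1000 samples is still a finite number of draws.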

Conditions for Normal Approximation
Sample size must be large enough: \( np \geq 10 \) and \( nq \geq 10 \)
Samples must be independent and randomly selected.
If sampling without replacement, \( n \leq 0.10N \), where \( N \) is the population size (10% condition).
10.2 The Sampling Distribution for Proportions
Definition and Properties
The sampling distribution of the proportion is the distribution of sample proportions over many independent samples from the same population. It is typically bell-shaped and centered at the true proportion \( p \).
Standard deviation: \( SD(\hat{p}) = \sqrt{\frac{pq}{n}} \), where \( q = 1 - p \)
Sampling error: The difference between a sample proportion and the true population proportion, reflecting expected variability.

Assumptions and Conditions
Independence Assumption: Sampled values must be independent.
Sample Size Assumption: Sample size must be large enough for the normal approximation to hold.
Randomization Condition: Data must be randomly sampled or randomly assigned.
10% Condition: Sample size should be no more than 10% of the population if sampling without replacement.
Success/Failure Condition: Both \( np \) and \( nq \) should be at least 10.
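The two numeric conditions in this list (Success/Failure and 10%) are simple to check in code. The sketch below is one way to do it; the function name `check_proportion_conditions` and its return format are hypothetical. Independence and randomization depend on how the data were collected, so no calculation can verify them.

```python
def check_proportion_conditions(n, p, population_size=None):
    """Check the numeric conditions for the normal model of p-hat.

    Only the Success/Failure and 10% conditions are checked here;
    independence and randomization must be judged from the study design.
    """
    q = 1 - p
    results = {"success_failure": n * p >= 10 and n * q >= 10}
    if population_size is not None:
        # 10% condition: relevant when sampling without replacement.
        results["ten_percent"] = n <= 0.10 * population_size
    return results


# n = 100, p = 0.2: np = 20 and nq = 80, so Success/Failure holds.
print(check_proportion_conditions(100, 0.2, population_size=5000))

# n = 30, p = 0.2: np = 6 is below 10, so the normal model is doubtful.
print(check_proportion_conditions(30, 0.2))
```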
10.3 The Central Limit Theorem (CLT)
Statement and Importance
The Central Limit Theorem (CLT) is a cornerstone of statistics. It states that, for independent observations from a population with finite variance, the sampling distribution of the sample mean (or proportion) approaches a normal distribution as the sample size increases, regardless of the shape of the population's distribution.
Key Point: The larger the sample size, the better the normal approximation.
Application: Allows use of normal probability models for inference even when the population is not normal.

Sample vs. Sampling Distribution
Distribution of a sample: The pattern of values in a single sample, which may not be normal.
Sampling distribution: The distribution of a statistic (mean or proportion) over many samples, which becomes approximately normal as sample size increases.
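The difference between a skewed population and the near-normal sampling distribution of its mean can be seen in a small simulation. The sketch below assumes an exponential population (strongly right-skewed, chosen only for illustration) and measures the skewness of the sample means for several sample sizes; the helper names `skewness` and `sample_means` are hypothetical.

```python
import random
import statistics


def skewness(values):
    """Population skewness: average cubed deviation divided by SD cubed."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * sd ** 3)


def sample_means(n, num_samples, rng):
    """Means of num_samples samples of size n from an exponential population."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]


rng = random.Random(1)  # fixed seed (an assumption) for repeatable output
skew_by_n = {}
for n in (2, 10, 50):
    means = sample_means(n, num_samples=5000, rng=rng)
    skew_by_n[n] = skewness(means)
    print(f"n = {n:2d}: skewness of sample means = {skew_by_n[n]:.2f}")
```

A normal distribution has skewness 0. The skewness of the sample means should shrink toward 0 as \( n \) grows, even though each individual sample from this population remains right-skewed.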


10.4 The Sampling Distribution of the Mean
Standard Deviation of the Sample Mean
For a population with mean \( \mu \) and standard deviation \( \sigma \), the sampling distribution of the sample mean \( \bar{y} \) for samples of size \( n \) has:
Mean: \( \mu(\bar{y}) = \mu \)
Standard deviation: \( SD(\bar{y}) = \frac{\sigma}{\sqrt{n}} \)
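The \( \sqrt{n} \) in the standard deviation of the sample mean follows from how variances of independent observations add. A short derivation, assuming the \( n \) observations \( y_1, \dots, y_n \) are independent and each has mean \( \mu \) and variance \( \sigma^2 \):

```latex
\begin{aligned}
E(\bar{y}) &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} y_i\right)
            = \frac{1}{n}\sum_{i=1}^{n} E(y_i)
            = \frac{n\mu}{n} = \mu, \\
Var(\bar{y}) &= Var\!\left(\frac{1}{n}\sum_{i=1}^{n} y_i\right)
              = \frac{1}{n^2}\sum_{i=1}^{n} Var(y_i)
              = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}, \\
SD(\bar{y}) &= \sqrt{Var(\bar{y})} = \frac{\sigma}{\sqrt{n}}.
\end{aligned}
```

The second line is where independence is used: without it, the variance of the sum would include covariance terms.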

Cases for Sampling Distribution of the Mean
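Case 1: Population normal, \( \sigma \) known: \( \bar{y} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \), where the second parameter is the standard deviation
Case 2: Population normal, \( \sigma \) unknown: \( \frac{\bar{y} - \mu}{s / \sqrt{n}} \) follows a \( t \)-distribution with \( n - 1 \) degrees of freedom
Case 3: Population not normal, \( \sigma \) known: \( \bar{y} \) is approximately \( N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \) if \( n \) is large enough (CLT applies)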


Assumptions and Conditions for Means
Independence assumption
Randomization condition
10% condition
Large enough sample condition (especially for skewed populations)
Example: Probability Calculation for Sample Mean
Suppose box weights are unimodal and symmetric with mean 12 kg and standard deviation 4 kg. For a pallet of 10 boxes, what is the probability the total weight exceeds 150 kg?
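Convert to mean: a total above 150 kg means \( \bar{y} > \frac{150}{10} = 15 \) kg
Standard deviation of mean: \( SD(\bar{y}) = \frac{4}{\sqrt{10}} \approx 1.265 \) kg
Find \( z \) using the normal distribution: \( z = \frac{15 - 12}{1.265} \approx 2.37 \)
Probability: \( P(\bar{y} > 15) = P(z > 2.37) \approx 0.009 \)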
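The same calculation can be reproduced with Python's standard library. This is a sketch under the example's assumption that the mean of 10 box weights is well modelled by a normal distribution (plausible here because the population is unimodal and symmetric); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 12.0, 4.0   # population mean and SD of box weights (kg)
n = 10                  # boxes per pallet
total_limit = 150.0     # total weight of interest (kg)

mean_threshold = total_limit / n         # total > 150 kg  <=>  mean > 15 kg
sd_of_mean = sigma / sqrt(n)             # SD of the sample mean
z = (mean_threshold - mu) / sd_of_mean   # standardized threshold

# Upper-tail probability under the standard normal model.
probability = 1 - NormalDist().cdf(z)

print(f"SD of mean = {sd_of_mean:.3f} kg, z = {z:.2f}, P = {probability:.4f}")
```

A result just under 1% means a pallet this heavy would be unusual but not impossible.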

10.5 Standard Error
Definition and Application
When the population standard deviation or proportion is unknown, we estimate the standard deviation of the sampling distribution using sample data. This estimate is called the standard error (SE).
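For sample proportion: \( SE(\hat{p}) = \sqrt{\frac{\hat{p}\hat{q}}{n}} \), where \( \hat{q} = 1 - \hat{p} \)
For sample mean: \( SE(\bar{y}) = \frac{s}{\sqrt{n}} \), where \( s \) is the sample standard deviation.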
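Standard errors are computed entirely from the sample: \( \sqrt{\hat{p}(1 - \hat{p})/n} \) for a proportion and \( s/\sqrt{n} \) for a mean. A minimal sketch follows; the function names `se_proportion` and `se_mean` and the weights list are hypothetical, made up only to show the calls.

```python
import statistics
from math import sqrt


def se_proportion(successes, n):
    """Standard error of a sample proportion, using p-hat in place of p."""
    p_hat = successes / n
    return sqrt(p_hat * (1 - p_hat) / n)


def se_mean(values):
    """Standard error of a sample mean, using s in place of sigma."""
    s = statistics.stdev(values)  # sample SD (divides by n - 1)
    return s / sqrt(len(values))


# 20 successes in a sample of 100 gives p-hat = 0.2.
print(f"SE(p-hat) = {se_proportion(20, 100):.4f}")

# Hypothetical box weights in kg, invented for illustration.
weights = [11.2, 13.5, 9.8, 12.4, 14.1, 10.7, 12.9, 11.6]
print(f"SE(y-bar) = {se_mean(weights):.4f}")
```

Note that `statistics.stdev` uses the \( n - 1 \) denominator, which matches the usual definition of the sample standard deviation \( s \); `statistics.pstdev` would divide by \( n \) instead.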
Chapter 10 Review and Visual Summary
Connecting Concepts
Sampling distributions allow us to relate population parameters to sample statistics. The Central Limit Theorem justifies the use of normal models for inference, and standard errors provide practical estimates when population parameters are unknown.



10.6 Common Pitfalls
Do not confuse the sampling distribution with the distribution of a single sample.
Ensure independence and randomization in sampling.
Be cautious with small samples, especially for proportions and skewed populations.
Summary Table: Key Formulas
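| Statistic | Mean | Standard Deviation / Standard Error |
|---|---|---|
| Sample Proportion (\( \hat{p} \)) | \( p \) | \( \sqrt{\frac{pq}{n}} \) or \( \sqrt{\frac{\hat{p}\hat{q}}{n}} \) |
| Sample Mean (\( \bar{y} \)) | \( \mu \) | \( \frac{\sigma}{\sqrt{n}} \) or \( \frac{s}{\sqrt{n}} \) |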