Sampling Distributions (Chapter 7): Business Statistics Study Notes
Sampling Distributions
Introduction to Sampling Distributions
Sampling distributions are a foundational concept in inferential statistics, especially in business applications. They describe the probability distribution of a given statistic based on a random sample drawn from a population.
Sampling Distribution: The distribution of all possible values of a statistic (such as the mean) for a given sample size selected from a population.
Importance: Understanding sampling distributions allows us to make probabilistic statements about sample statistics and to estimate population parameters.
Developing a Sampling Distribution
Population and Sample Setup
Consider a population of size N = 4 with individuals A, B, C, and D.
The random variable X represents the age of individuals: 18, 20, 22, 24 years.
Population Summary Measures
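Population Mean ($\mu$): $\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$
Population Standard Deviation ($\sigma$): $\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = \sqrt{\frac{9 + 1 + 1 + 9}{4}} = \sqrt{5} \approx 2.236$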
The population distribution is uniform in this example.
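As a quick check of the population summary measures, the following minimal sketch uses Python's standard `statistics` module. The variable names (`ages`, `mu`, `sigma`) are illustrative choices, not part of the original notes.

```python
from statistics import mean, pstdev

# Ages of individuals A, B, C, D (the whole population, N = 4)
ages = [18, 20, 22, 24]

mu = mean(ages)       # population mean
sigma = pstdev(ages)  # population standard deviation (divides by N, not N - 1)

print(f"mu = {mu}, sigma = {sigma:.3f}")
```

Note that `pstdev` is used rather than `stdev` because the four values are the entire population, not a sample from it.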
All Possible Samples of Size n = 2 (with Replacement)
There are 4 × 4 = 16 possible samples, since each of the two draws can select any of the 4 individuals.
Each sample mean is calculated, resulting in a distribution of 16 sample means.
Sampling Distribution of the Sample Means
The distribution of the 16 sample means is not uniform, even though the population is.
This illustrates how the sampling distribution can differ in shape from the population distribution.
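To see the shape of this sampling distribution, the 16 samples can be enumerated directly. This is a small sketch assuming ordered samples drawn with replacement; the names `samples`, `sample_means`, and `counts` are illustrative.

```python
from collections import Counter
from itertools import product

ages = [18, 20, 22, 24]

# All ordered samples of size 2 drawn with replacement: 4 * 4 = 16
samples = list(product(ages, repeat=2))
sample_means = [(a + b) / 2 for a, b in samples]

# Tally how often each sample mean occurs
counts = Counter(sample_means)
for value in sorted(counts):
    print(f"{value:>5}: {counts[value]}/16")
```

The tally peaks at 21 (4 of 16 samples) and falls off symmetrically to 18 and 24 (1 of 16 each), so the sample means form a triangular rather than a uniform distribution.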
Summary Measures for the Sampling Distribution
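Mean of Sample Means ($\mu_{\bar{X}}$): $\mu_{\bar{X}} = \frac{\sum \bar{X}_i}{16} = 21$, which equals the population mean $\mu$.
Standard Deviation of Sample Means ($\sigma_{\bar{X}}$): $\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X}_i - \mu_{\bar{X}})^2}{16}} = \sqrt{2.5} \approx 1.58$, which equals $\sigma / \sqrt{n} = 2.236 / \sqrt{2}$.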
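These two summary measures can be verified numerically. The sketch below recomputes them from the 16 sample means and compares the result with the population SD divided by the square root of the sample size; variable names are illustrative.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

ages = [18, 20, 22, 24]
n = 2  # sample size

# Mean of every possible ordered sample of size n (with replacement)
sample_means = [mean(s) for s in product(ages, repeat=n)]

mu_xbar = mean(sample_means)       # mean of the sample means
sigma_xbar = pstdev(sample_means)  # standard deviation of the sample means

print(f"mu_xbar = {mu_xbar}, sigma_xbar = {sigma_xbar:.3f}")
print(f"sigma / sqrt(n) = {pstdev(ages) / sqrt(n):.3f}")
```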
Sampling Distribution of the Sample Mean
Key Properties
The mean of the sampling distribution of the sample mean is denoted by $\mu_{\bar{X}}$.
The standard deviation of the sampling distribution of the sample mean is called the standard error and is denoted by $\sigma_{\bar{X}}$.
If the Population is Normal
If the population is normal with mean $\mu$ and standard deviation $\sigma$, then the sampling distribution of $\bar{X}$ is also normal with:
$\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
Sampling Distribution Properties
The standard error decreases as the sample size increases.
Larger sample sizes yield a narrower (less variable) sampling distribution.
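The effect of sample size on the standard error can be illustrated with a short sketch. The population standard deviation of 10 and the sample sizes below are hypothetical values chosen only for illustration, and `standard_error` is a helper defined here, not a library function.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

sigma = 10.0  # hypothetical population standard deviation

for n in (4, 16, 64, 256):
    print(f"n = {n:>3}: standard error = {standard_error(sigma, n)}")
```

Each fourfold increase in the sample size halves the standard error, because the sample size enters through its square root.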
If the Population is Not Normal: The Central Limit Theorem (CLT)
The Central Limit Theorem states that, for large enough sample sizes, the sampling distribution of the sample mean will be approximately normal, regardless of the population's shape.
This allows us to use normal probability methods even when the population distribution is unknown or not normal.
The CLT applies specifically to the distribution of the sample mean, not to all statistics (e.g., sample variance).
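A small simulation can make the theorem concrete. The sketch below assumes a strongly right-skewed exponential population with mean 1 and standard deviation 1; the sample size, number of repetitions, and seed are arbitrary choices made for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(7)  # fixed seed so the run is repeatable

n = 40          # sample size (hypothetical choice)
reps = 5000     # number of simulated samples (hypothetical choice)
pop_mean = 1.0  # an exponential population with rate 1 has mean 1
pop_sd = 1.0    # ... and standard deviation 1

# Draw many samples and record each sample mean
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)
]

se = pop_sd / sqrt(n)  # standard error predicted by theory

# Share of sample means within 1.96 standard errors of the population mean;
# a normal sampling distribution would put about 95% of them there
within = sum(abs(m - pop_mean) <= 1.96 * se for m in sample_means) / reps

print(f"mean of sample means: {mean(sample_means):.3f}")
print(f"SD of sample means:   {stdev(sample_means):.3f} (theory: {se:.3f})")
print(f"share within 1.96 SE: {within:.3f}")
```

Even though the population is far from normal, the simulated sample means should centre near the population mean, with a spread close to the theoretical standard error and roughly 95% of them inside the 1.96 standard error band.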
Sampling Distribution of the Sample Variance
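The sampling distribution of the sample variance is skewed to the right; the skew is most pronounced for small samples and does not fully disappear at any finite sample size.
The Central Limit Theorem, as stated above for the sample mean, should not be used to treat the sample variance as normally distributed.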
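The right skew of the sample variance can be seen in a simulation. This sketch assumes a normal population with mean 50 and standard deviation 10 (so variance 100) and a small sample size of 5; all of these values, and the seed, are hypothetical choices for illustration.

```python
import random
from statistics import mean, median, variance

rng = random.Random(11)  # fixed seed so the run is repeatable

n = 5         # small sample size (hypothetical choice)
reps = 10000  # number of simulated samples (hypothetical choice)

# Sample variance (n - 1 denominator) of each simulated sample
sample_variances = [
    variance([rng.gauss(50, 10) for _ in range(n)]) for _ in range(reps)
]

avg_var = mean(sample_variances)
med_var = median(sample_variances)

print(f"mean of sample variances:   {avg_var:.1f}")
print(f"median of sample variances: {med_var:.1f}")
```

A mean that sits noticeably above the median is a typical sign of a right-skewed distribution: a few very large sample variances pull the average up.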
How Large is Large Enough?
For most distributions, n > 30 is sufficient for the sampling distribution of the mean to be nearly normal.
For fairly symmetric distributions, n > 15 may be sufficient.
For normal populations, the sampling distribution of the mean is always normal, regardless of sample size.
Example: Probability Calculation Using the Sampling Distribution
Population mean $\mu = 100$, standard deviation $\sigma$.
Sample size $n$.
What is $P(\bar{X} > 105)$?
Step 1: Identify the sampling distribution
The sampling distribution of the sample mean is approximately normal (by CLT).
Step 2: Calculate the mean and standard error
$\mu_{\bar{X}} = \mu = 100$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = 3$
Step 3: Find the probability
Probability of observing a value less than 105: $P(\bar{X} < 105)$ = NORMDIST(105, 100, 3, 1) = 0.95221
Probability of observing a value greater than 105: $P(\bar{X} > 105) = 1 - 0.95221 = 0.04779$
Interpretation: There is a 4.8% chance that the sample mean will be greater than 105.
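The same calculation can be reproduced outside a spreadsheet. This sketch uses `NormalDist` from Python's standard `statistics` module in place of NORMDIST, taking the mean of 100 and standard error of 3 from the NORMDIST call in the example; the variable names are illustrative.

```python
from statistics import NormalDist

mu_xbar = 100  # mean of the sampling distribution
se = 3         # standard error of the sample mean

sampling_dist = NormalDist(mu=mu_xbar, sigma=se)

p_less = sampling_dist.cdf(105)  # P(sample mean < 105)
p_greater = 1 - p_less           # P(sample mean > 105)

print(f"P(X-bar < 105) = {p_less:.5f}")
print(f"P(X-bar > 105) = {p_greater:.5f}")
```

Equivalently, the z-score is (105 - 100) / 3 ≈ 1.67, and the upper-tail area beyond it is about 0.048.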
Summary Table: Key Formulas
| Statistic | Population | Sampling Distribution |
|---|---|---|
| Mean | $\mu$ | $\mu_{\bar{X}} = \mu$ |
| Standard Deviation | $\sigma$ | $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$ |
Additional info: The Central Limit Theorem is one of the most important results in statistics, as it justifies the use of normal probability models for inference about means, even when the population distribution is unknown or non-normal, provided the sample size is sufficiently large.