Module 8
Study Guide - Smart Notes
Sampling Distribution, Estimation, and Sample Size
Overview
This module introduces key concepts in elementary statistics relevant to health sciences, focusing on sampling distributions, estimation of population parameters, and sample size determination. Understanding these topics is essential for making valid inferences from sample data to populations.
Probability Distribution vs Sampling Distribution
Probability Distribution
A probability distribution describes how the values of a random variable are distributed. It gives the probability that a random variable takes each possible value.
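Example: Consider a box containing marbles labeled 1, 2, 3, and 4. If you pick one marble at random, the probability distribution of the label (X) is uniform: $P(X = x) = \frac{1}{4}$ for $x = 1, 2, 3, 4$.
Mean of X: $\mu = \sum x \, P(X = x) = \frac{1 + 2 + 3 + 4}{4} = 2.5$
Variance of X: $\sigma^2 = \sum (x - \mu)^2 \, P(X = x) = \frac{2.25 + 0.25 + 0.25 + 2.25}{4} = 1.25$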
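The marble example can be checked numerically. The following is a minimal sketch that computes the mean and variance of a discrete uniform variable on the labels 1 to 4 with exact fractions; the variable names (`pmf`, `mean_x`, `var_x`) are chosen for illustration only.

```python
from fractions import Fraction

# Labels on the marbles; each label is equally likely (the uniform case above).
labels = [1, 2, 3, 4]
pmf = {x: Fraction(1, len(labels)) for x in labels}

# Mean: sum of x * P(X = x)
mean_x = sum(x * p for x, p in pmf.items())

# Variance: sum of (x - mean)^2 * P(X = x)
var_x = sum((x - mean_x) ** 2 * p for x, p in pmf.items())

print(mean_x, var_x)  # 5/2 5/4
```

The fractions 5/2 and 5/4 correspond to a mean of 2.5 and a variance of 1.25.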
Sampling Distribution
A sampling distribution is the probability distribution of a statistic (such as the mean) computed from a sample of a population. It describes the variability of the statistic from sample to sample.
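Example: If you pick 2 marbles from the box with replacement, the sample mean $\bar{X}$ is a random variable. There are 16 possible samples, and the sampling distribution of $\bar{X}$ is:
$\bar{x}$ | Probability |
|---|---|
1.0 | 1/16 |
1.5 | 2/16 |
2.0 | 3/16 |
2.5 | 4/16 |
3.0 | 3/16 |
3.5 | 2/16 |
4.0 | 1/16 |
Mean of $\bar{X}$: $\mu_{\bar{X}} = 2.5$ (same as the population mean)
Variance of $\bar{X}$: $\sigma^2_{\bar{X}} = \frac{\sigma^2}{n} = \frac{1.25}{2} = 0.625$
Standard Error (SE): $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \sqrt{0.625} \approx 0.79$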
Additional info: The standard error quantifies the variability of the sample mean from sample to sample.
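The sampling distribution for the two-marble example can be built by brute force. This sketch enumerates all 16 ordered samples drawn with replacement, tabulates the sample means, and computes the mean, variance, and standard error of the sample mean. The variable names are illustrative, not part of the course material.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import sqrt

labels = [1, 2, 3, 4]
n = 2  # sample size, drawn with replacement

# All 4**2 = 16 equally likely ordered samples.
samples = list(product(labels, repeat=n))

# Count how often each sample mean occurs, then convert counts to probabilities.
counts = Counter(Fraction(sum(s), n) for s in samples)
sampling_dist = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

mean_xbar = sum(m * p for m, p in sampling_dist.items())
var_xbar = sum((m - mean_xbar) ** 2 * p for m, p in sampling_dist.items())
se = sqrt(var_xbar)  # standard error = square root of the variance of the sample mean

for m, p in sampling_dist.items():
    print(float(m), p)  # probabilities print in lowest terms, e.g. 2/16 as 1/8
print(mean_xbar, var_xbar, round(se, 4))
```

The variance of the sample mean comes out as 5/8, which is the population variance 1.25 divided by the sample size 2.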
Student's t-Distribution and the t-Table
Definition and Properties
The t-distribution is a family of distributions used when estimating the mean of a normally distributed population in situations where the sample size is small and population standard deviation is unknown.
Degrees of Freedom (df): $df = n - 1$
Shape: Symmetric, bell-shaped, but with heavier tails than the normal distribution.
Mean: 0
Variance: $\frac{df}{df - 2}$ for $df > 2$, which is greater than 1 for small samples and approaches 1 as sample size increases.
Formula for t-statistic: $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
The t-table provides critical values for different confidence levels and degrees of freedom.
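As an illustration of the t-statistic $t = (\bar{x} - \mu) / (s / \sqrt{n})$ with $df = n - 1$, here is a minimal sketch using only the Python standard library. The sample values and the hypothesized mean `mu_0` are made up for this example.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample of n = 8 measurements (illustrative values only).
sample = [5.1, 4.9, 5.6, 5.8, 5.0, 5.3, 5.4, 4.7]
mu_0 = 5.0  # hypothesized population mean (an assumption for this example)

n = len(sample)
x_bar = mean(sample)
s = stdev(sample)   # sample standard deviation (divides by n - 1)
df = n - 1          # degrees of freedom
t_stat = (x_bar - mu_0) / (s / sqrt(n))

print(f"df = {df}, t = {t_stat:.3f}")
```

The resulting value would then be compared with the t-table entry for 7 degrees of freedom; for a two-sided 95% confidence level that critical value is 2.365.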
Estimation of Population Parameters
Point Estimate and Interval Estimate
Point Estimate: A single value used to approximate a population parameter (e.g., sample mean $\bar{x}$ for population mean $\mu$).
Interval Estimate (Confidence Interval): A range of values within which the population parameter is believed to lie, with a certain level of confidence.
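Confidence Interval for Mean (when $\sigma$ is known): $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
Confidence Interval for Mean (when $\sigma$ is unknown): $\bar{x} \pm t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}$
Margin of Error (ME): $ME = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ (or $t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}$ when $\sigma$ is unknown)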
Additional info: The confidence level (e.g., 95%) is the proportion of intervals, over repeated sampling, that would contain the true parameter value. It is not the probability that one particular computed interval contains it.
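A short sketch of the known-$\sigma$ interval, $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$, using `statistics.NormalDist` from the standard library for the critical value. The function name `z_interval` and the blood-pressure numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Return (lower, upper, margin) for a mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin, margin

# Hypothetical numbers: sample mean 120 mmHg, sigma = 15, n = 100.
lower, upper, margin = z_interval(x_bar=120, sigma=15, n=100)
print(f"95% CI: ({lower:.2f}, {upper:.2f}), ME = {margin:.2f}")
# 95% CI: (117.06, 122.94), ME = 2.94
```

When $\sigma$ is unknown, the same structure applies with $s$ in place of $\sigma$ and a t critical value from the t-table in place of $z$.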
Central Limit Theorem (CLT)
Statement and Implications
The Central Limit Theorem states that, regardless of the population's distribution, the sampling distribution of the sample mean approaches a normal distribution as the sample size increases (typically $n \geq 30$ is considered sufficient).
If the population is normal, the sampling distribution of the mean is normal for any sample size.
If the population is not normal, the sampling distribution of the mean is approximately normal for large $n$.
Formula for Standard Error: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
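The theorem can be illustrated with a small simulation. This sketch assumes a right-skewed exponential population with mean 1 and standard deviation 1 (a choice made here for illustration), draws many samples of size 40, and compares the spread of the sample means with $\sigma/\sqrt{n}$.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

n = 40              # size of each sample
num_samples = 5000  # number of repeated samples

# Exponential population with rate 1: right-skewed, mean 1, sigma 1.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

observed_mean = mean(sample_means)   # should be close to the population mean 1
observed_se = stdev(sample_means)    # spread of the sample means
theoretical_se = 1.0 / sqrt(n)       # sigma / sqrt(n)

print(f"mean of sample means: {observed_mean:.3f}")
print(f"observed SE: {observed_se:.3f}, theoretical SE: {theoretical_se:.3f}")
```

Plotting a histogram of `sample_means` would show a roughly bell-shaped curve even though the individual observations are far from normal.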
Confidence Interval for Proportions
Single Proportion
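Point Estimate: $\hat{p} = \frac{x}{n}$, where $x$ is the number of successes in a sample of size $n$
Standard Error: $SE(\hat{p}) = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
Confidence Interval: $\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$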
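A minimal sketch of the large-sample interval for a single proportion, $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$. The function name `proportion_interval` and the survey counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Return (p_hat, se, (lower, upper)) using the normal approximation."""
    p_hat = successes / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return p_hat, se, (p_hat - z * se, p_hat + z * se)

# Hypothetical survey: 60 of 200 patients report a symptom.
p_hat, se, (lower, upper) = proportion_interval(60, 200)
print(f"p_hat = {p_hat:.3f}, SE = {se:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

The normal approximation behind this interval is usually considered reasonable only when both the number of successes and the number of failures are not too small.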
Difference Between Two Proportions
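Point Estimate: $\hat{p}_1 - \hat{p}_2$
Standard Error: $SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$
Confidence Interval: $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$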
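The two-sample case follows the same pattern, with the standard error built from both groups: $\sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}$. The sketch below assumes independent samples; the function name and the group counts are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportion_interval(x1, n1, x2, n2, confidence=0.95):
    """Return (diff, se, (lower, upper)) for p1 - p2 from two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return diff, se, (diff - z * se, diff + z * se)

# Hypothetical groups: 45 of 150 in group 1 versus 30 of 150 in group 2.
diff, se, (lower, upper) = diff_proportion_interval(45, 150, 30, 150)
print(f"difference = {diff:.3f}, SE = {se:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

An interval that excludes 0 suggests that the two population proportions differ at the chosen confidence level.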
Sample Size Determination
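For Estimating a Mean
$n = \left( \frac{z_{\alpha/2} \, \sigma}{ME} \right)^2$, where $ME$ is the desired margin of error. Round the result up to the next whole number.
For Estimating a Proportion
$n = \frac{z_{\alpha/2}^2 \, p(1 - p)}{ME^2}$
Use $p = 0.5$ for maximum sample size if $p$ is unknown.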
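For Difference Between Two Means
$n = \frac{z_{\alpha/2}^2 \, (\sigma_1^2 + \sigma_2^2)}{ME^2}$ per group, assuming equal group sizes and known standard deviations, where $ME$ is the desired margin of error.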
Additional info: Sample size calculations ensure that the margin of error does not exceed a specified value at a given confidence level.
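The sample size calculations can be sketched as follows. This assumes the usual margin-of-error formulas: $n = (z\sigma/ME)^2$ for a mean, $n = z^2 p(1-p)/ME^2$ for a proportion, and $n = z^2(\sigma_1^2 + \sigma_2^2)/ME^2$ per group for a difference of two means with equal group sizes. The function names and input values are hypothetical, and results are rounded up to whole subjects.

```python
from math import ceil
from statistics import NormalDist

def z_crit(confidence):
    """Two-sided standard normal critical value for the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def n_for_mean(sigma, margin, confidence=0.95):
    return ceil((z_crit(confidence) * sigma / margin) ** 2)

def n_for_proportion(margin, p=0.5, confidence=0.95):
    # p = 0.5 gives the largest (most conservative) sample size.
    return ceil(z_crit(confidence) ** 2 * p * (1 - p) / margin ** 2)

def n_per_group_for_mean_difference(sigma1, sigma2, margin, confidence=0.95):
    # Assumes equal group sizes and known standard deviations.
    return ceil(z_crit(confidence) ** 2 * (sigma1 ** 2 + sigma2 ** 2) / margin ** 2)

print(n_for_mean(sigma=15, margin=2))                   # 217
print(n_for_proportion(margin=0.05))                    # 385
print(n_per_group_for_mean_difference(15, 15, margin=5))  # 70
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the target.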
Summary Table: Key Formulas
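Parameter | Point Estimate | Standard Error | Confidence Interval |
|---|---|---|---|
Mean ($\mu$) | $\bar{x}$ | $\frac{\sigma}{\sqrt{n}}$ or $\frac{s}{\sqrt{n}}$ | $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ or $\bar{x} \pm t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}$ |
Proportion ($p$) | $\hat{p}$ | $\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$ | $\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$ |
Difference of Means ($\mu_1 - \mu_2$) | $\bar{x}_1 - \bar{x}_2$ | $\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$ | $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$ |
Difference of Proportions ($p_1 - p_2$) | $\hat{p}_1 - \hat{p}_2$ | $\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$ | $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$ |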
Precision and Accuracy
Definitions
Precision: How narrow the confidence interval is; narrower intervals indicate more precise estimates.
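Accuracy: How close the estimate is to the true parameter value. A higher confidence level makes the interval more likely to capture the parameter, but it also widens the interval.
Additional info: Increasing the sample size reduces the standard error and the margin of error, so the interval narrows at a given confidence level and the point estimate tends to fall closer to the true value.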
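The trade-off between confidence level, sample size, and interval width can be seen in a short table. This sketch assumes a known population standard deviation of 15 (a made-up value) and computes the full width $2 z_{\alpha/2}\,\sigma/\sqrt{n}$ of the interval for a mean.

```python
from math import sqrt
from statistics import NormalDist

sigma = 15  # hypothetical population standard deviation
sizes = (25, 100, 400)

widths = {}
for confidence in (0.90, 0.95, 0.99):
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Full interval width = 2 * margin of error.
    widths[confidence] = {n: 2 * z * sigma / sqrt(n) for n in sizes}

for confidence, by_n in widths.items():
    row = ", ".join(f"n={n}: {w:.2f}" for n, w in by_n.items())
    print(f"{confidence:.0%} -> {row}")
```

Each fourfold increase in sample size halves the width, while raising the confidence level widens the interval at every sample size.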