Central Limit Theorem, Sampling Distributions, and Statistical Significance: Study Notes
Central Limit Theorem (CLT)
Definition and Importance
The Central Limit Theorem (CLT) is a foundational concept in statistics that describes the behavior of sample means. It states that, given a population with a finite mean ($\mu$) and a finite non-zero variance ($\sigma^2$), the sampling distribution of the mean approaches a normal distribution as the sample size ($n$) increases, regardless of the shape of the population's original distribution.
Population Mean ($\mu$): The average value in the population.
Population Variance ($\sigma^2$): The measure of spread in the population.
Sampling Distribution: The distribution of means from all possible samples of size $n$.
Key Formulas:
Mean of sampling distribution: $\mu_{\bar{x}} = \mu$
Variance of sampling distribution: $\sigma_{\bar{x}}^2 = \sigma^2 / n$
Equation: $\bar{X} \sim N(\mu, \sigma^2 / n)$, approximately, for large $n$
Visualizing the CLT
As sample size increases, the sampling distribution of the mean becomes more normal, even if the population distribution is not normal. This is illustrated by comparing distributions for different sample sizes and shapes (uniform, U-shaped, normal).
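The claims about the mean and variance of the sampling distribution can be checked with a small simulation. The sketch below is illustrative only: the exponential population (mean 1, variance 1), the sample size of 25, the number of samples, and the helper name `simulate_sample_means` are all assumptions chosen for the example, not part of the original notes.

```python
import random
import statistics

def simulate_sample_means(draw, n, num_samples, seed=0):
    """Return num_samples sample means, each computed from n draws of draw(rng)."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(num_samples)]

def draw_exponential(rng):
    # Hypothetical skewed population: exponential with mean 1 and variance 1.
    return rng.expovariate(1.0)

n = 25
means = simulate_sample_means(draw_exponential, n=n, num_samples=20_000)

mean_of_means = statistics.fmean(means)      # theory: mu = 1
var_of_means = statistics.pvariance(means)   # theory: sigma^2 / n = 0.04

print(f"mean of sample means:     {mean_of_means:.3f} (theory: 1.000)")
print(f"variance of sample means: {var_of_means:.4f} (theory: {1.0 / n:.4f})")
```

Even though the population is strongly skewed, the simulated sample means should center on the population mean and have a variance close to the population variance divided by the sample size.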
Sampling Distributions
Shape and Properties
The sampling distribution of the mean depends on the sample size and the shape of the population distribution. As sample size increases, the sampling distribution becomes more normal and less variable.
Small sample sizes: Sampling distribution may resemble the population distribution.
Large sample sizes: Sampling distribution approaches normality.
Table: Sampling Distributions by Population Shape and Sample Size
| Population Shape | Sample Size = 2 | Sample Size = 5 | Sample Size = 30 |
|---|---|---|---|
| Uniform | Triangular (no longer flat) | Less uniform, more normal | Approximately normal |
| U-shaped | Still clearly non-normal | Less U-shaped, more normal | Approximately normal |
| Normal | Normal | Normal | Normal |
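The pattern in the table can be explored numerically. The sketch below uses excess kurtosis (about 0 for a normal distribution) as one rough gauge of shape; that choice of gauge, the Beta(0.5, 0.5) distribution standing in for a "U-shaped" population, and the helper names are assumptions made for illustration.

```python
import random
import statistics

def excess_kurtosis(values):
    """Population excess kurtosis; approximately 0 for normally distributed data."""
    m = statistics.fmean(values)
    var = statistics.fmean((v - m) ** 2 for v in values)
    fourth = statistics.fmean((v - m) ** 4 for v in values)
    return fourth / var ** 2 - 3.0

def sample_means(draw, n, num_samples, rng):
    """Draw num_samples samples of size n and return their means."""
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(num_samples)]

populations = {
    "uniform": lambda rng: rng.random(),
    "u-shaped": lambda rng: rng.betavariate(0.5, 0.5),  # assumed U-shaped example
}

rng = random.Random(1)
shape = {}
for name, draw in populations.items():
    for n in (2, 5, 30):
        shape[(name, n)] = excess_kurtosis(sample_means(draw, n, 20_000, rng))
        print(f"{name:9s} n={n:2d}  excess kurtosis = {shape[(name, n)]:+.2f}")
```

For both populations the excess kurtosis should move toward 0 as the sample size grows from 2 to 30, which is consistent with the sampling distribution becoming more normal.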
Examples of CLT in Use
Worked Example: Stress Scores for Nurses
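Given: Population mean ($\mu$) = 50, standard deviation ($\sigma$) = 10, sample size ($n$) = 36.
a) Shape of the sampling distribution: Approximately normal (by CLT, since $n = 36$ is reasonably large).
b) Theoretical mean of sampling distribution: $\mu_{\bar{x}} = \mu = 50$.
c) Standard error: $\sigma_{\bar{x}} = \sigma / \sqrt{n} = 10 / \sqrt{36} = 10/6 \approx 1.67$
d) Probability sample mean between 45 and 55: $z = (45 - 50)/(10/6) = -3$ and $z = (55 - 50)/(10/6) = 3$, so $P(-3 < Z < 3) \approx 0.9973$
e) Probability sample mean greater than 48: $z = (48 - 50)/(10/6) = -1.2$, so $P(Z > -1.2) \approx 0.8849$
f) Probability a single score greater than 48 (assuming the scores themselves are normally distributed): $z = (48 - 50)/10 = -0.2$, so $P(Z > -0.2) \approx 0.5793$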
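The parts of this worked example can be reproduced with Python's standard library. This is a minimal sketch using `statistics.NormalDist`; the variable names are illustrative, and part f) additionally assumes that individual stress scores are normally distributed, which the CLT alone does not guarantee.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 50, 10, 36

# c) Standard error of the mean.
se = sigma / sqrt(n)

# Sampling distribution of the mean (approximately normal by the CLT).
sampling_dist = NormalDist(mu, se)
# Distribution of single scores, ASSUMING a normal population.
population = NormalDist(mu, sigma)

# d) P(45 < sample mean < 55)
p_between = sampling_dist.cdf(55) - sampling_dist.cdf(45)
# e) P(sample mean > 48)
p_mean_above_48 = 1 - sampling_dist.cdf(48)
# f) P(single score > 48)
p_score_above_48 = 1 - population.cdf(48)

print(f"c) SE = {se:.2f}")                             # about 1.67
print(f"d) P(45 < mean < 55) = {p_between:.4f}")       # about 0.9973
print(f"e) P(mean > 48) = {p_mean_above_48:.4f}")      # about 0.8849
print(f"f) P(score > 48) = {p_score_above_48:.4f}")    # about 0.5793
```

Parts e) and f) use the same cutoff of 48 but give very different probabilities, because a sample mean varies much less than a single score.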
Worked Example: Nursing Salaries
Given: Mean = , SD = , sample size = 20.
a) Probability a nurse earns over $100,000:
b) Probability sample mean between and :
Inferential Statistics
Role of CLT in Inference
Inferential statistics use sample data to make conclusions about populations. The CLT allows us to assume normality for the sampling distribution of the mean, enabling hypothesis testing and confidence interval estimation.
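Standard Error (SE): $SE = \sigma_{\bar{x}} = \sigma / \sqrt{n}$, the standard deviation of the sampling distribution of the mean.
Sample Mean: The mean of one sample; it is one of many possible sample means that could be drawn from the population.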
Statistical Significance
Concept and Interpretation
A result is statistically significant when it would be unlikely to occur through random chance alone if the null hypothesis were true. This is quantified by the p-value: the probability of observing data at least as extreme as the data actually obtained, assuming the null hypothesis is true. The p-value is not the probability that the null hypothesis is true.
Significance is only meaningful if other factors are controlled.
Common significance levels: $\alpha = 0.05$ or $\alpha = 0.01$.
Type I and Type II Errors
Definitions and Table
Statistical tests can make two types of errors:
Type I Error: Rejecting the null hypothesis when it is true (false positive).
Type II Error: Failing to reject the null hypothesis when it is false (false negative).
Power: The probability that a test correctly rejects a false null hypothesis, equal to 1 minus the probability of a Type II error.
| True State of Null Hypothesis | Fail to Reject Null | Reject Null |
|---|---|---|
| Null is true | Correct decision | Type I Error (Mistake) |
| Null is false | Type II Error (Missed Opportunity) | Correct decision |
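The error rates in the table can be computed directly for a simple case. The sketch below assumes a two-sided one-sample z test with known $\sigma$; the numbers (null mean 50, true mean 55, $\sigma$ = 10, $n$ = 36) and the function name `rejection_probability` are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def rejection_probability(mu_true, mu0, sigma, n, alpha=0.05):
    """Chance that a two-sided z test rejects H0: mu = mu0 when the true mean is mu_true."""
    se = sigma / sqrt(n)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = (mu_true - mu0) / se  # how far the true mean sits from mu0, in SE units
    upper_tail = 1 - STD_NORMAL.cdf(z_crit - shift)
    lower_tail = STD_NORMAL.cdf(-z_crit - shift)
    return upper_tail + lower_tail

# Null is true: any rejection is a Type I error, so this equals alpha.
type_i_rate = rejection_probability(mu_true=50, mu0=50, sigma=10, n=36)

# Null is false (assumed true mean of 55): rejecting is the correct decision.
power = rejection_probability(mu_true=55, mu0=50, sigma=10, n=36)
type_ii_rate = 1 - power

print(f"Type I error rate:  {type_i_rate:.3f}")
print(f"Power:              {power:.3f}")
print(f"Type II error rate: {type_ii_rate:.3f}")
```

When the null hypothesis is true the rejection probability equals the significance level, and when it is false the rejection probability is the power of the test.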
How Inference Works
Steps in Statistical Inference
Select representative samples.
Collect relevant data.
Determine if observed differences are due to chance.
Generalize findings to the population.
Deciding What Test to Use
Choosing Statistical Tests
Selection depends on the type of data, number of groups, and research question. Flowcharts can help guide the choice between t-tests, ANOVA, chi-square, etc.
How a Test of Significance Works
Step-by-Step Process
State the null hypothesis.
Set the risk level (significance level).
Select the appropriate test statistic.
Compute the test statistic value.
Determine the critical value for rejection.
Compare obtained value to critical value.
If obtained value is more extreme, reject the null hypothesis.
If not, do not reject the null hypothesis.
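Note: For a z test, the test statistic is the Z value of the sample mean; other tests, such as t-tests and ANOVA, use their own test statistics.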
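The steps above can be sketched for one specific case, a two-sided one-sample z test with known population standard deviation. The function name `one_sample_z_test` and the example numbers (a hypothetical sample mean of 53.5 tested against a null mean of 50 with $\sigma$ = 10 and $n$ = 36) are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided one-sample z test of H0: mu = mu0, with known sigma."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)                            # standard error of the mean
    z = (sample_mean - mu0) / se                    # obtained test statistic
    z_crit = std_normal.inv_cdf(1 - alpha / 2)      # critical value for rejection
    p_value = 2 * (1 - std_normal.cdf(abs(z)))      # two-sided p-value
    return {
        "z": z,
        "critical": z_crit,
        "p_value": p_value,
        "reject_null": abs(z) > z_crit,             # more extreme than critical value?
    }

result = one_sample_z_test(sample_mean=53.5, mu0=50, sigma=10, n=36)
print(result)
```

Comparing the obtained value to the critical value and comparing the p-value to the significance level lead to the same decision.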
Confidence Intervals
Definition and Calculation
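A confidence interval gives a range of plausible values for a population parameter, based on sample data. A 95% interval comes from a procedure that captures the true parameter in about 95% of repeated samples.
95% Confidence Interval: $\bar{x} \pm 1.96 \, \sigma_{\bar{x}}$, where $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
99% Confidence Interval: $\bar{x} \pm 2.58 \, \sigma_{\bar{x}}$, where $\sigma_{\bar{x}} = \sigma / \sqrt{n}$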
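A confidence interval for a mean with known population standard deviation can be computed as follows. This is a minimal sketch; the function name `z_confidence_interval` and the example values (a hypothetical sample mean of 52 with $\sigma$ = 10 and $n$ = 36) are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for a population mean when sigma is known."""
    # Critical z value: about 1.96 for 95% and about 2.58 for 99%.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)   # z* times the standard error
    return sample_mean - margin, sample_mean + margin

lo95, hi95 = z_confidence_interval(52, sigma=10, n=36, confidence=0.95)
lo99, hi99 = z_confidence_interval(52, sigma=10, n=36, confidence=0.99)

print(f"95% CI: ({lo95:.2f}, {hi95:.2f})")
print(f"99% CI: ({lo99:.2f}, {hi99:.2f})")
```

The 99% interval is wider than the 95% interval: greater confidence requires a larger margin of error for the same data.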
Significance Versus Meaningfulness
Interpreting Results
Statistical significance does not always imply practical importance.
Context matters for interpreting significance.
Sample size affects significance; larger samples can detect smaller effects.
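The effect of sample size on significance can be illustrated with a fixed, small difference. In the sketch below, the observed difference of 0.5 points on a scale with $\sigma$ = 10 is a hypothetical example, as are the sample sizes and the function name `two_sided_p_value`.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(sample_mean, mu0, sigma, n):
    """Two-sided p-value for a one-sample z test with known sigma."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same tiny difference (50.5 vs. 50) evaluated at increasing sample sizes.
sample_sizes = (25, 100, 400, 1600, 10_000)
p_values = {n: two_sided_p_value(50.5, 50, 10, n) for n in sample_sizes}

for n, p in p_values.items():
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"n={n:6d}  p={p:.4f}  {verdict} at alpha = 0.05")
```

The difference stays the same size throughout, yet it crosses the 0.05 threshold once the sample is large enough. Whether a half-point difference matters in practice is a separate question that the p-value cannot answer.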
Comparing Z Scores and Z Values
Score vs. Sampling Distribution
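Z score: For a particular value $x$: $z = (x - \mu) / \sigma$
Z value (sampling distribution): For a sample mean $\bar{x}$: $z = (\bar{x} - \mu) / \sigma_{\bar{x}}$, where $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
Application: Use the Z value, computed from the sampling distribution, as the test statistic for the Z test.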
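The difference between the two quantities comes down to the denominator. A short sketch, with hypothetical function names and example values ($\mu$ = 50, $\sigma$ = 10, $n$ = 36):

```python
from math import sqrt

def z_score(x, mu, sigma):
    """Standardize a single value using the population standard deviation."""
    return (x - mu) / sigma

def z_value(sample_mean, mu, sigma, n):
    """Standardize a sample mean using the standard error, sigma / sqrt(n)."""
    return (sample_mean - mu) / (sigma / sqrt(n))

single = z_score(48, mu=50, sigma=10)            # one score of 48
averaged = z_value(48, mu=50, sigma=10, n=36)    # a mean of 48 from 36 scores

print(f"z score of a single value of 48: {single:.2f}")
print(f"z value of a sample mean of 48:  {averaged:.2f}")
```

The same number, 48, is only 0.2 standard deviations below the mean as a single score, but 1.2 standard errors below the mean as the average of 36 scores.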
*Additional info: Some explanations and table entries have been expanded for clarity and completeness based on standard statistical knowledge.*