Central Limit Theorem, Sampling Distributions, and Statistical Significance: Study Notes

Central Limit Theorem (CLT)

Definition and Importance

The Central Limit Theorem (CLT) is a foundational concept in statistics that describes the behavior of sample means. It states that, given a population with a finite mean ($\mu$) and a finite non-zero variance ($\sigma^2$), the sampling distribution of the mean approaches a normal distribution as the sample size ($n$) increases, regardless of the shape of the population distribution.

Population Mean ($\mu$): The average value in the population.
Population Variance ($\sigma^2$): A measure of spread in the population.
Sampling Distribution: The distribution of means from all possible samples of size $n$.

Key Formula:

Mean of sampling distribution: $\mu_{\bar{X}} = \mu$
Variance of sampling distribution: $\sigma^2_{\bar{X}} = \sigma^2 / n$

Equation: $\bar{X} \sim N(\mu, \sigma^2 / n)$ (approximately, for large $n$)

Visualizing the CLT

As sample size increases, the sampling distribution of the mean becomes more normal, even if the population distribution is not normal. This is illustrated by comparing distributions for different sample sizes and shapes (uniform, U-shaped, normal).
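A short simulation can make this concrete. The sketch below is only an illustration, not part of the original materials: it assumes a uniform population on [0, 1] (so the mean is 0.5 and the variance is 1/12), and the helper name `simulate_sample_means`, the seed, and the number of replications are arbitrary choices. It checks that the mean of the sample means stays near the population mean while their variance shrinks roughly as the population variance divided by the sample size.

```python
import random
import statistics


def draw_uniform(rng):
    """One observation from the assumed population: uniform on [0, 1]."""
    return rng.random()


def simulate_sample_means(draw, n, reps, rng):
    """Return `reps` sample means, each computed from a sample of size n."""
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(reps)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Theoretical values for a uniform(0, 1) population.
mu, sigma2 = 0.5, 1 / 12

results = {}
for n in (2, 5, 30):
    means = simulate_sample_means(draw_uniform, n, reps=20000, rng=rng)
    results[n] = (statistics.fmean(means), statistics.pvariance(means))
    print(f"n={n:2d}  mean of means={results[n][0]:.4f} (theory {mu:.4f})  "
          f"variance of means={results[n][1]:.5f} (theory {sigma2 / n:.5f})")
```

Swapping `draw_uniform` for another generator (for example one based on `rng.betavariate`) should show the same pattern for other population shapes.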

Sampling Distributions

Shape and Properties

The sampling distribution of the mean depends on the sample size and the shape of the population distribution. As sample size increases, the sampling distribution becomes more normal and less variable.

Small sample sizes: Sampling distribution may resemble the population distribution.
Large sample sizes: Sampling distribution approaches normality.

Table: Sampling Distributions by Population Shape and Sample Size

Population Shape	Sample Size = 2	Sample Size = 5	Sample Size = 30
Uniform	Triangular, not yet normal	Close to normal	Approximately normal
U-shaped	Clearly non-normal	Closer to normal	Approximately normal
Normal	Normal	Normal	Normal
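One rough way to check the pattern in this table is to measure how much of the sampling distribution falls within one standard deviation of its center; for a normal curve that share is about 0.683. The sketch below is a simulation under stated assumptions, not a definitive test of normality: the three generators are stand-ins chosen here for the population shapes (a Beta(0.5, 0.5) distribution plays the U-shaped population), and `fraction_within_one_sd` is a hypothetical helper name.

```python
import random
import statistics


def fraction_within_one_sd(means):
    """Share of sample means lying within one standard deviation of their average."""
    center = statistics.fmean(means)
    spread = statistics.pstdev(means)
    return sum(abs(m - center) < spread for m in means) / len(means)


# Assumed stand-ins for the population shapes in the table.
populations = {
    "uniform": lambda r: r.random(),
    "u-shaped": lambda r: r.betavariate(0.5, 0.5),
    "normal": lambda r: r.gauss(0.5, 0.1),
}

rng = random.Random(1)  # fixed seed so the run is repeatable
shape_check = {}
for name, draw in populations.items():
    for n in (2, 5, 30):
        means = [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(10000)]
        shape_check[(name, n)] = fraction_within_one_sd(means)
        print(f"{name:9s} n={n:2d}  within 1 SD: {shape_check[(name, n)]:.3f}")
```

This single number is a coarse summary of shape; a histogram of `means` would show the change more directly.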

Examples of CLT in Use

Worked Example: Stress Scores for Nurses

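Given: Population mean ($\mu$) = 50, standard deviation ($\sigma$) = 10, sample size ($n$) = 36.

a) Shape of the sampling distribution of the mean: Approximately normal (by CLT).
b) Theoretical mean of sampling distribution: $\mu_{\bar{X}} = \mu = 50$.
c) Standard error: $\sigma_{\bar{X}} = \sigma / \sqrt{n} = 10 / \sqrt{36} = 10/6 \approx 1.67$.
d) Probability sample mean between 45 and 55: $z = (45 - 50) / (10/6) = -3.00$ and $z = (55 - 50) / (10/6) = 3.00$, so $P(45 < \bar{X} < 55) = P(-3.00 < Z < 3.00) \approx 0.9973$.
e) Probability sample mean greater than 48: $z = (48 - 50) / (10/6) = -1.20$, so $P(\bar{X} > 48) = P(Z > -1.20) \approx 0.8849$.
f) Probability a single score greater than 48: use $\sigma$ rather than the standard error, $z = (48 - 50) / 10 = -0.20$, so $P(X > 48) = P(Z > -0.20) \approx 0.5793$ (this assumes individual scores are normally distributed).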
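The parts of this example can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are choices made here, and part (f) additionally assumes that individual stress scores are normally distributed, which the CLT does not guarantee.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 50, 10, 36            # values given in the example

se = sigma / sqrt(n)                 # c) standard error of the mean
sampling_dist = NormalDist(mu, se)   # distribution of sample means (by the CLT)
score_dist = NormalDist(mu, sigma)   # single scores; assumes a normal population

p_between = sampling_dist.cdf(55) - sampling_dist.cdf(45)   # d)
p_mean_gt_48 = 1 - sampling_dist.cdf(48)                    # e)
p_score_gt_48 = 1 - score_dist.cdf(48)                      # f)

print(f"c) SE = {se:.3f}")
print(f"d) P(45 < mean < 55) = {p_between:.4f}")
print(f"e) P(mean > 48)      = {p_mean_gt_48:.4f}")
print(f"f) P(score > 48)     = {p_score_gt_48:.4f}")
```

Parts (e) and (f) use the same cutoff of 48, but (e) divides by the standard error while (f) divides by the population standard deviation, which is why the two probabilities differ.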

Worked Example: Nursing Salaries

Given: Mean = , SD = , sample size = 20.

a) Probability a nurse earns over $100,000:
b) Probability sample mean between and :

Inferential Statistics

Role of CLT in Inference

Inferential statistics uses sample data to draw conclusions about populations. The CLT justifies treating the sampling distribution of the mean as approximately normal when the sample is large enough, which enables hypothesis testing and confidence interval estimation.

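Standard Error (SE): $\sigma_{\bar{X}} = \sigma / \sqrt{n}$, the standard deviation of the sampling distribution of the mean.
Sample Mean ($\bar{X}$): One of many possible sample means that could be drawn from the population.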

Statistical Significance

Concept and Interpretation

Statistical significance means that a result, such as a difference between groups, would be unlikely if only random chance were operating. It is quantified by the p-value: the probability of observing a result at least as extreme as the one obtained, assuming the null hypothesis is true.

Significance is only meaningful if other factors are controlled.
Common significance levels: $\alpha = 0.05$ or $\alpha = 0.01$.
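To illustrate how a p-value relates to a significance level, here is a small sketch that converts a z statistic into a two-sided p-value. The function name `two_sided_p_value` and the example z values are hypothetical, and the thresholds of .05 and .01 are the conventional levels assumed here.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Probability, under the null, of a z statistic at least as extreme as |z|."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical z statistics, compared against the .05 and .01 levels.
for z in (1.00, 1.96, 2.58):
    p = two_sided_p_value(z)
    print(f"z = {z:.2f}  p = {p:.4f}  "
          f"significant at .05: {p < 0.05}  at .01: {p < 0.01}")
```

A result is called significant at a given level when its p-value falls below that level.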

Type I and Type II Errors

Definitions and Table

Statistical tests can make two types of errors:

Type I Error: Rejecting the null hypothesis when it is true (false positive). Its probability is the significance level $\alpha$.
Type II Error: Failing to reject the null hypothesis when it is false (false negative). Its probability is denoted $\beta$.
Power: The probability that a test rejects a false null hypothesis, $1 - \beta$.

True State of Null Hypothesis	Fail to Reject Null	Reject Null
Null is true	Correct decision	Type I Error (Mistake)
Null is false	Type II Error (Missed Opportunity)	Correct decision
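The two error rates can be estimated by simulation. The sketch below is an illustration under assumed values, not something taken from the original materials: it uses a two-sided z test with a null mean of 50, a known standard deviation of 10, samples of 36, and a significance level of .05, and the "null is false" case assumes a true mean of 54. The function name `rejection_rate` is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def rejection_rate(true_mu, null_mu, sigma, n, alpha, reps, rng):
    """Fraction of simulated samples in which a two-sided z test rejects the null."""
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma / sqrt(n)
    rejections = 0
    for _ in range(reps):
        sample_mean = fmean(rng.gauss(true_mu, sigma) for _ in range(n))
        if abs((sample_mean - null_mu) / se) > critical:
            rejections += 1
    return rejections / reps


rng = random.Random(2)  # fixed seed so the run is repeatable

# Null is true: every rejection is a Type I error.
type_i_rate = rejection_rate(50, 50, 10, 36, 0.05, 5000, rng)
# Null is false (assumed true mean 54): rejections are correct decisions.
power = rejection_rate(54, 50, 10, 36, 0.05, 5000, rng)

print(f"Type I error rate ~ {type_i_rate:.3f}")
print(f"Power ~ {power:.3f}, Type II error rate ~ {1 - power:.3f}")
```

When the null is true the rejection rate should sit near the chosen significance level; when it is false, one minus the rejection rate estimates the Type II error rate.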

How Inference Works

Steps in Statistical Inference

Select representative samples.
Collect relevant data.
Determine if observed differences are due to chance.
Generalize findings to the population.

Deciding What Test to Use

Choosing Statistical Tests

Selection depends on the type of data, number of groups, and research question. Flowcharts can help guide the choice between t-tests, ANOVA, chi-square, etc.

How a Test of Significance Works

Step-by-Step Process

State the null hypothesis.
Set the risk level (significance level).
Select the appropriate test statistic.
Compute the test statistic value.
Determine the critical value for rejection.
Compare obtained value to critical value.
If obtained value is more extreme, reject the null hypothesis.
If not, do not reject the null hypothesis.

Note: Test statistics are often z-scores.
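The process above can be sketched in code for the simplest case, a two-sided one-sample z test with a known population standard deviation. This is a minimal illustration rather than a general-purpose routine: the function name `one_sample_z_test` and the numbers (null mean 50, standard deviation 10, sample of 36, observed mean 53.5) are hypothetical, and the step numbers in the comments refer to the order in which the steps are listed.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, null_mu, sigma, n, alpha=0.05):
    """Two-sided z test of a sample mean against a hypothesized population mean."""
    se = sigma / sqrt(n)                              # standard error of the mean
    z_obtained = (sample_mean - null_mu) / se         # step 4: compute the statistic
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # step 5: critical value
    reject = abs(z_obtained) > z_critical             # steps 6-8: compare and decide
    return z_obtained, z_critical, reject


# Steps 1-3: the null says the mean is 50, the risk level is .05,
# and the z statistic is the chosen test statistic (all hypothetical).
z, crit, reject = one_sample_z_test(53.5, 50, 10, 36)
print(f"z = {z:.2f}, critical = ±{crit:.2f}, reject null: {reject}")
```

When the population standard deviation is unknown and estimated from the sample, a t statistic would normally replace z.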

Confidence Intervals

Definition and Calculation

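A confidence interval gives a range of plausible values for a population parameter, based on sample data. For a population mean with known $\sigma$, the interval is the sample mean plus or minus a critical z value times the standard error.

95% Confidence Interval: $\bar{X} \pm 1.96 \cdot \sigma / \sqrt{n}$
99% Confidence Interval: $\bar{X} \pm 2.58 \cdot \sigma / \sqrt{n}$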
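As a worked illustration, the sketch below computes z-based intervals for a sample mean. It assumes the population standard deviation is known; the function name `z_confidence_interval` and the sample values (mean 52, standard deviation 10, sample of 36) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, level=0.95):
    """Interval sample_mean ± z * SE, with sigma assumed known."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%, 2.58 for 99%
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical sample: mean 52, known sigma 10, n = 36.
ci95 = z_confidence_interval(52, 10, 36, 0.95)
ci99 = z_confidence_interval(52, 10, 36, 0.99)
print(f"95% CI: ({ci95[0]:.2f}, {ci95[1]:.2f})")
print(f"99% CI: ({ci99[0]:.2f}, {ci99[1]:.2f})")
```

The 99% interval is wider than the 95% interval because higher confidence requires a larger critical value.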

Significance Versus Meaningfulness

Interpreting Results

Statistical significance does not always imply practical importance.
Context matters for interpreting significance.
Sample size affects significance; larger samples can detect smaller effects.
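The effect of sample size on significance can be seen by holding a small difference fixed and varying only the number of observations. This sketch assumes a two-sided z test with a known standard deviation; the 1-point difference, the standard deviation of 10, and the function name `p_value_for_difference` are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist


def p_value_for_difference(diff, sigma, n):
    """Two-sided p-value for a sample mean that differs from the null mean by diff."""
    z = diff / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical: the same 1-point difference on a scale with sigma = 10.
p_by_n = {n: p_value_for_difference(1, 10, n) for n in (25, 100, 400, 1600)}
for n, p in p_by_n.items():
    print(f"n = {n:4d}  p = {p:.4f}")
```

The difference itself never changes, so whether it matters in practice has to be judged separately from whether it reaches significance.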

Comparing Z Scores and Z Values

Score vs. Sampling Distribution

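Z score: For a particular value $X$: $z = (X - \mu) / \sigma$
Z value (sampling distribution): For a sample mean $\bar{X}$: $z = (\bar{X} - \mu) / \sigma_{\bar{X}}$, where $\sigma_{\bar{X}} = \sigma / \sqrt{n}$

Application: Use the Z value from the sampling distribution (the second formula) for the Z test.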
