Analysis of Variance (ANOVA): Comparing Multiple Means
Study Guide - Smart Notes
Tailored notes based on your materials, expanded with key definitions, examples, and context.
Analysis of Variance (ANOVA)
Why Compare More Than Two Means?
When analyzing data from experiments with more than two groups, it is inefficient and statistically problematic to perform multiple pairwise t-tests. This increases the risk of Type I error (false positives). ANOVA provides a systematic way to test for differences among group means while controlling the overall error rate.
Key Point: Multiple t-tests inflate the probability of making at least one Type I error. For example, with 5 groups, there are 10 possible pairwise comparisons, and the probability of at least one false positive increases with the number of tests.
Example: For 5 groups, the number of pairwise comparisons is $\binom{5}{2} = \frac{5 \cdot 4}{2} = 10$.
Type I Error Rate: If each test is performed at $\alpha = 0.05$, the probability of making at least one Type I error across all 10 tests (assuming independent tests) is $1 - (1 - 0.05)^{10} \approx 0.40$.
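This inflation is easy to check numerically. The snippet below (plain Python, using the 5-group example above) computes the number of pairwise comparisons and the family-wise error rate under the assumption that the tests are independent:

```python
from math import comb

k = 5                          # number of groups
m = comb(k, 2)                 # pairwise comparisons: C(5, 2) = 10
alpha = 0.05                   # per-test significance level

# Probability of at least one Type I error across m independent tests
family_wise = 1 - (1 - alpha) ** m
print(m, round(family_wise, 3))   # prints: 10 0.401
```

With 10 tests, the chance of at least one false positive is roughly 40%, eight times the nominal 5% level.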
What is ANOVA?
Analysis of Variance (ANOVA) is a statistical method used to compare the means of three or more independent groups. It tests the null hypothesis that all group means are equal against the alternative that at least one mean differs.
Null Hypothesis ($H_0$): $\mu_1 = \mu_2 = \cdots = \mu_k$
Alternative Hypothesis ($H_a$): At least one mean differs
Between- and Within-Group Variation
ANOVA partitions the total variability in the data into two components:
Between-group variability: Variability due to differences among group means.
Within-group (error) variability: Variability of observations within each group.
The total sum of squares decomposes into these two components:

$$\text{SST} = \sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij} - \bar{x})^2 = \underbrace{\sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2}_{\text{SSB (between)}} + \underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij} - \bar{x}_i)^2}_{\text{SSW (within)}}$$

Mean squares are calculated by dividing each sum of squares by its degrees of freedom:

$$\text{MSB} = \frac{\text{SSB}}{k-1}, \qquad \text{MSW} = \frac{\text{SSW}}{N-k}$$

The F statistic is the ratio of these mean squares:

$$F = \frac{\text{MSB}}{\text{MSW}}$$

Under $H_0$, the F statistic follows an $F$ distribution with $k-1$ and $N-k$ degrees of freedom, where $k$ is the number of groups and $N$ the total number of observations.
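As a concrete illustration, the decomposition and the F statistic can be computed directly in plain Python; the data below are invented for the example:

```python
# Hypothetical measurements for three groups (invented for illustration)
groups = [
    [5.1, 4.8, 5.6, 5.0],
    [6.2, 6.0, 5.8, 6.4],
    [4.5, 4.9, 4.3, 4.7],
]

k = len(groups)                               # number of groups
N = sum(len(g) for g in groups)               # total observations
grand_mean = sum(x for g in groups for x in g) / N
group_means = [sum(g) / len(g) for g in groups]

# Between-group sum of squares: n_i * (group mean - grand mean)^2
ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
# Within-group sum of squares: deviations from each group's own mean
ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

msb = ssb / (k - 1)    # mean square between, df = k - 1
msw = ssw / (N - k)    # mean square within,  df = N - k
F = msb / msw
print(F)               # a large F suggests at least one mean differs
```

Note that `ssb + ssw` reproduces the total sum of squares exactly, which is a handy sanity check on any hand computation.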
ANOVA Table
The results of an ANOVA are usually presented in a table:
| Source | Sum of Squares | df | Mean Square | F | p-value |
|---|---|---|---|---|---|
| Between | SSB | $k-1$ | MSB = SSB/($k-1$) | MSB/MSW | p-value |
| Error | SSW | $N-k$ | MSW = SSW/($N-k$) | | |
| Total | SST | $N-1$ | | | |
Assumptions of ANOVA
Independence of observations
Normality of the response variable within each group
Equality of variances across groups (homogeneity of variance)
If these assumptions are violated, alternative methods such as Welch's ANOVA may be used.
Example: Cholesterol-Lowering Treatments
Suppose four cholesterol-lowering drugs (A, B, C, D) are tested. After 6 weeks, the reduction in LDL cholesterol is measured. ANOVA is used to test if the mean reduction differs among the drugs. The F statistic and p-value from the ANOVA table indicate whether at least one drug differs significantly from the others.
Recap Table
| Keyword/Concept | Definition |
|---|---|
| One-way ANOVA | A procedure for testing whether the means of three or more independent groups are equal. |
| Between-group variability | Variability due to differences among group means. |
| Within-group (error) variability | Variability of observations within each group. |
| F statistic | The ratio MSB/MSW used to test the null hypothesis that all group means are equal. |
| Assumptions | Independence, normality within each group, and equality of variances across groups. |
Post-hoc Comparisons
Why Post-hoc Tests?
ANOVA tells us if at least one group mean differs, but does not specify which groups are different. Post-hoc tests are used to identify which means differ while controlling the overall probability of making a Type I error.
Common Post-hoc Methods
Fisher's Least Significant Difference (LSD): Performs unadjusted two-sample t-tests for each pair of means, but only after the omnibus F test is significant. Simple but does not control the family-wise error rate when many comparisons are made.
Bonferroni: Divides the desired significance level by the number of comparisons. Very conservative with many comparisons.
Scheffé's Method: Constructs simultaneous confidence intervals for all possible contrasts among group means. Most conservative, suitable for complex hypotheses involving multiple groups.
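The Bonferroni adjustment in particular is simple enough to show in one snippet (the group count here is illustrative):

```python
from math import comb

k = 4                         # number of groups (illustrative)
m = comb(k, 2)                # 6 pairwise comparisons
alpha = 0.05                  # desired family-wise significance level

alpha_per_test = alpha / m    # each pairwise t-test is run at this level
print(m, round(alpha_per_test, 4))   # prints: 6 0.0083
```

With 6 comparisons each test must reach p < 0.0083 to be declared significant, which is why the method becomes very conservative as the number of groups grows.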
Tukey's Honestly Significant Difference (HSD)
Tukey's HSD is a widely used post-hoc procedure that balances power and control of the family-wise error rate. It is based on the studentized range distribution and is appropriate when group sizes are equal.
The critical difference for comparing two means is:

$$\text{HSD} = q_{\alpha, k, N-k}\sqrt{\frac{\text{MSW}}{n}}$$

Where $q_{\alpha, k, N-k}$ is the critical value from the studentized range distribution, $\text{MSW}$ is the mean square for error from the ANOVA table, and $n$ is the common group size.

A pair of means $\bar{x}_i$ and $\bar{x}_j$ is significantly different if $|\bar{x}_i - \bar{x}_j| > \text{HSD}$.
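A minimal sketch of this decision rule, assuming a critical value $q$ read from a studentized range table; the numbers below (q, MSW, group size, and group means) are all invented for illustration:

```python
from math import sqrt

q = 3.51        # assumed q_{alpha, k, N-k} from a studentized range table
msw = 0.083     # assumed mean square error from the ANOVA table
n = 10          # assumed common group size

hsd = q * sqrt(msw / n)                      # critical difference

means = {"A": 5.12, "B": 6.10, "C": 4.60}    # hypothetical group means
for a, b in [("A", "B"), ("A", "C"), ("B", "C")]:
    diff = abs(means[a] - means[b])
    print(a, b, diff > hsd)   # True means the pair differs significantly
```

Because all three pairwise comparisons share the single threshold `hsd`, the family-wise error rate is controlled at $\alpha$ without adjusting each test separately.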
Example: Plant Growth Under Different Fertilizers
Suppose an experiment measures the dry weight of plants grown under three fertilizers. ANOVA yields a significant F statistic, indicating not all mean weights are equal. Tukey's HSD is then used to determine which pairs of fertilizers differ significantly.
Recap Table
| Keyword/Concept | Definition |
|---|---|
| Post-hoc test | A procedure for comparing pairs of group means after an ANOVA indicates that not all means are equal, while controlling the family-wise error rate. |
| Tukey's HSD | Uses the studentized range distribution to calculate simultaneous confidence intervals for all pairwise differences; exact for equal group sizes and less conservative than Bonferroni. |
Summary
ANOVA is used to test for differences among three or more group means.
Unlike a series of pairwise t-tests, it controls the overall Type I error rate.
Post-hoc tests, such as Tukey's HSD, are used to identify which means differ after a significant ANOVA result.