Analysis of Variance (ANOVA): One-Way and Two-Way Methods
Analysis of Variance (ANOVA)
Introduction
Analysis of Variance (ANOVA) is a statistical method used to compare means across multiple groups to determine if at least one group mean is significantly different from the others. ANOVA is widely used in business statistics to analyze experimental data and to test hypotheses about group differences.
General ANOVA Setting
Experimental Design and Factors
Factor: A variable controlled by the investigator, which can have two or more levels (categories or numerical values).
Levels: Different values or categories of a factor, each producing a distinct group.
Each group is considered a sample from a different population.
Dependent Variable: The outcome measured to observe the effect of the factor(s).
Experimental Design: The plan used to collect and assign data to groups.
Completely Randomized Design
One-Way ANOVA
Subjects (experimental units) are randomly assigned to groups; the subjects are assumed to be homogeneous.
Only one factor (independent variable) is considered, with two or more levels.
Analysis is performed using one-way ANOVA.
One-Way Analysis of Variance
Purpose and Assumptions
Evaluates differences among the means of three or more groups.
Assumptions:
Populations are normally distributed.
Populations have equal variances.
Samples are randomly and independently selected.
Example: Comparing the number of accidents for different work shifts or expected mileage for different brands of tires.
Hypotheses of One-Way ANOVA
Formulation
Null Hypothesis ($H_0$): All population means are equal, $\mu_1 = \mu_2 = \cdots = \mu_c$.
Alternative Hypothesis ($H_1$): Not all of the population means are equal (at least one population mean is different).
Rejecting $H_0$ indicates a factor effect, but does not specify which means differ.
Partitioning the Variation
Components of Variation
Total variation in the data is split into two parts: among-group variation and within-group variation.
Formula: $SST = SSA + SSW$
SST (Total Sum of Squares): Aggregate variation of all data values.
SSA (Sum of Squares Among Groups): Variation among group means.
SSW (Sum of Squares Within Groups): Variation within each group.
Calculating Sums of Squares
Total Sum of Squares
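Formula: $SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2$
$c$ = number of groups
$n_j$ = number of values in group $j$
$X_{ij}$ = $i$th observation from group $j$
$\bar{\bar{X}}$ = grand mean (mean of all data values)
Among-Group Variation
Formula: $SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$
$\bar{X}_j$ = mean of group $j$
Mean Square Among: $MSA = SSA / (c - 1)$
Within-Group Variation
Formula: $SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$
Mean Square Within: $MSW = SSW / (n - c)$, where $n = n_1 + n_2 + \cdots + n_c$ is the total number of observations
Obtaining the Mean Squares
Mean Square Among (MSA): $MSA = SSA / (c - 1)$
Mean Square Within (MSW): $MSW = SSW / (n - c)$
Mean Square Total (MST): $MST = SST / (n - 1)$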
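The sums of squares and mean squares can be computed directly from their definitions. The following is a minimal sketch using only the Python standard library; the function names and the small data set are hypothetical and made up for illustration.

```python
from statistics import mean

def one_way_sums_of_squares(groups):
    """Return (SST, SSA, SSW) for a list of groups of numeric observations."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # SST: squared deviations of every observation from the grand mean
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # SSA: squared deviations of each group mean from the grand mean,
    # weighted by the group size
    ssa = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # SSW: squared deviations of each observation from its own group mean
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return sst, ssa, ssw

def mean_squares(groups):
    """Divide each sum of squares by its degrees of freedom."""
    c = len(groups)                   # number of groups
    n = sum(len(g) for g in groups)   # total number of observations
    sst, ssa, ssw = one_way_sums_of_squares(groups)
    return {"MSA": ssa / (c - 1), "MSW": ssw / (n - c), "MST": sst / (n - 1)}

# Hypothetical data, not taken from the notes
groups = [[1, 2, 3], [2, 4, 6], [5, 6, 7]]
sst, ssa, ssw = one_way_sums_of_squares(groups)
ms = mean_squares(groups)
print(sst, ssa, ssw)  # SSA + SSW should equal SST
print(ms)
```

For this data the among-group and within-group parts add up to the total, which is a quick check that the partition has been computed correctly.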
One-Way ANOVA Table
Summary Table
| Source of Variation | Degrees of Freedom | Sum of Squares | Mean Square (Variance) | F |
|---|---|---|---|---|
| Among Groups | c - 1 | SSA | MSA = SSA / (c - 1) | F_STAT = MSA / MSW |
| Within Groups | n - c | SSW | MSW = SSW / (n - c) | |
| Total | n - 1 | SST | | |
One-Way ANOVA F Test Statistic
Test Statistic and Degrees of Freedom
Test Statistic: $F_{STAT} = \frac{MSA}{MSW}$
Degrees of Freedom:
$df_1 = c - 1$ (numerator, among groups)
$df_2 = n - c$ (denominator, within groups)
Interpreting the F Statistic
The F statistic is the ratio of the among-group variance estimate to the within-group variance estimate.
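The ratio can never be negative, because both mean squares are built from sums of squared deviations; the test is therefore upper-tailed.
Decision Rule: Reject $H_0$ if $F_{STAT} > F_\alpha$, where $F_\alpha$ is the upper-tail critical value from the F-distribution table with $c - 1$ and $n - c$ degrees of freedom.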
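Instead of reading the critical value from a table, it can be looked up in software. The sketch below assumes SciPy is available; the helper name `f_decision`, the default significance level of 0.05, and the input values are illustrative assumptions, not part of the notes.

```python
from scipy.stats import f

def f_decision(f_stat, c, n, alpha=0.05):
    """Apply the upper-tail decision rule for a one-way ANOVA F test.

    f_stat: observed F statistic (MSA / MSW)
    c:      number of groups
    n:      total number of observations
    alpha:  significance level (0.05 is an assumed default)
    """
    df_num, df_den = c - 1, n - c
    f_crit = f.ppf(1 - alpha, df_num, df_den)  # upper-tail critical value
    p_value = f.sf(f_stat, df_num, df_den)     # P(F > f_stat) under H0
    return {"f_crit": f_crit, "p_value": p_value, "reject_h0": f_stat > f_crit}

# Hypothetical F statistic for 3 groups and 15 observations in total
result = f_decision(f_stat=4.2, c=3, n=15)
print(result)
```

Comparing the F statistic with the critical value and comparing the p-value with the significance level lead to the same decision.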
One-Way ANOVA Example
Golf Club Distance Example
Three golf clubs tested for mean distance using five measurements each.
Data:
| Club 1 | Club 2 | Club 3 |
|---|---|---|
| 254 | 234 | 200 |
| 263 | 218 | 222 |
| 241 | 235 | 197 |
| 237 | 227 | 206 |
| 251 | 216 | 204 |
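Means: $\bar{X}_1 = 249.2$, $\bar{X}_2 = 226.0$, $\bar{X}_3 = 205.8$, $\bar{\bar{X}} = 227.0$
From the data, $SSA = 4716.4$, $SSW = 1119.6$, $MSA = 2358.2$, $MSW = 93.3$, and $F_{STAT} = 2358.2 / 93.3 \approx 25.275$ (calculated as shown in the slides).
Decision: Since $F_{STAT} = 25.275 > F_{0.05} = 3.89$ (with 2 and 12 degrees of freedom), reject $H_0$ at $\alpha = 0.05$.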
Conclusion: There is evidence that at least one mean differs from the rest.
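The golf club calculation can be cross-checked in software. This is a short sketch that assumes SciPy is installed and uses its `f_oneway` function on the data from the table above; the variable names are illustrative.

```python
from scipy.stats import f_oneway

# Distances for the three clubs, copied from the data table
club1 = [254, 263, 241, 237, 251]
club2 = [234, 218, 235, 227, 216]
club3 = [200, 222, 197, 206, 204]

result = f_oneway(club1, club2, club3)
f_stat, p_value = result.statistic, result.pvalue
print(f"F = {f_stat:.3f}, p-value = {p_value:.6f}")
```

The F statistic should come out at about 25.275, with a p-value far below common significance levels, which agrees with the conclusion that at least one mean differs.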
Assumptions for ANOVA F Test
Randomness and Independence: Samples must be randomly and independently selected.
Normality: Populations should be normally distributed. The F test is robust to moderate departures from normality, especially with large samples.
Homogeneity of Variance: Group variances should be equal. This can be tested using Levene's Test.
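As an illustration of the homogeneity check, the sketch below runs Levene's test on the golf club data. It assumes SciPy is available and uses `center="median"` (absolute deviations from each group median); the 0.05 cutoff is an assumed significance level.

```python
from scipy.stats import levene

club1 = [254, 263, 241, 237, 251]
club2 = [234, 218, 235, 227, 216]
club3 = [200, 222, 197, 206, 204]

# H0: all group variances are equal
stat, p_value = levene(club1, club2, club3, center="median")
alpha = 0.05  # assumed significance level
print(f"Levene statistic = {stat:.4f}, p-value = {p_value:.4f}")
print("Reject equal variances" if p_value < alpha else "No evidence of unequal variances")
```

A large p-value here means the data give no reason to doubt the equal-variance assumption, so the ANOVA F test can proceed.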
Post-Hoc Analysis: Tukey-Kramer Procedure
Purpose and Steps
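Used after rejecting $H_0$ in ANOVA to determine which means are significantly different.
Allows paired comparisons by comparing absolute mean differences to a critical range.
Critical Range Formula: Critical Range $= Q_\alpha \sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)}$, where $Q_\alpha$ is the upper-tail critical value from the Studentized range distribution with $c$ and $n - c$ degrees of freedom, and $n_j$ and $n_{j'}$ are the sample sizes of the two groups being compared. Additional info: When group sample sizes are unequal, the critical range differs from pair to pair; $Q_\alpha$ depends on the number of groups.
If the absolute mean difference $|\bar{X}_j - \bar{X}_{j'}|$ exceeds the critical range, the two means are significantly different.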
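The pairwise comparison can be sketched for the golf club data as follows. This assumes a SciPy version that provides `scipy.stats.studentized_range` (1.7 or later) and an assumed significance level of 0.05; the dictionary layout and variable names are illustrative.

```python
from itertools import combinations
from math import sqrt
from statistics import mean
from scipy.stats import studentized_range

groups = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}
alpha = 0.05                              # assumed significance level
c = len(groups)                           # number of groups
n = sum(len(v) for v in groups.values())  # total number of observations

# MSW: pooled within-group variance from the one-way ANOVA
ssw = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
msw = ssw / (n - c)

# Upper-tail critical value of the Studentized range distribution
q_crit = studentized_range.ppf(1 - alpha, c, n - c)

results = {}
for (name_a, a), (name_b, b) in combinations(groups.items(), 2):
    critical_range = q_crit * sqrt((msw / 2) * (1 / len(a) + 1 / len(b)))
    diff = abs(mean(a) - mean(b))
    results[(name_a, name_b)] = (diff, critical_range, diff > critical_range)
    print(f"{name_a} vs {name_b}: |diff| = {diff:.1f}, "
          f"critical range = {critical_range:.2f}, significant = {diff > critical_range}")
```

Because every group has five observations here, the critical range is the same for all three pairs; with unequal group sizes it would be recomputed for each pair.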
Two-Way ANOVA
Introduction and Assumptions
Used when there are two factors of interest, each with multiple levels.
Assumptions:
Populations are normally distributed.
Populations have equal variances.
Independent random samples are selected.
Sources of Variation in Two-Way ANOVA
Factor A Variation (SSA): Variation due to levels of factor A.
Factor B Variation (SSB): Variation due to levels of factor B.
Interaction Variation (SSAB): Variation due to interaction between factors A and B.
Error Variation (SSE): Random variation within cells.
Total Variation (SST): Sum of all sources of variation.
Two-Way ANOVA Equations
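Notation: $a$ = number of levels of factor A, $b$ = number of levels of factor B, $n'$ = number of replications in each cell, $X_{ijk}$ = $k$th observation for level $i$ of factor A and level $j$ of factor B, $\bar{X}_{i..}$ = mean of level $i$ of factor A, $\bar{X}_{.j.}$ = mean of level $j$ of factor B, $\bar{X}_{ij.}$ = mean of cell $ij$, $\bar{\bar{X}}$ = grand mean.
Total Variation: $SST = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n'} (X_{ijk} - \bar{\bar{X}})^2 = SSA + SSB + SSAB + SSE$
Factor A Variation: $SSA = b\,n' \sum_{i=1}^{a} (\bar{X}_{i..} - \bar{\bar{X}})^2$
Factor B Variation: $SSB = a\,n' \sum_{j=1}^{b} (\bar{X}_{.j.} - \bar{\bar{X}})^2$
Interaction Variation: $SSAB = n' \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{X}_{ij.} - \bar{X}_{i..} - \bar{X}_{.j.} + \bar{\bar{X}})^2$
Error Variation: $SSE = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n'} (X_{ijk} - \bar{X}_{ij.})^2$
Mean Square Calculations
$MSA = SSA / (a - 1)$
$MSB = SSB / (b - 1)$
$MSAB = SSAB / [(a - 1)(b - 1)]$
$MSE = SSE / [ab(n' - 1)]$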
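The sources of variation listed above can be computed for a balanced design (the same number of replications in every cell) with a short routine. This is a minimal sketch using only the standard library; the function name `two_way_anova`, the nested-list layout `cells[i][j]`, and the data are hypothetical.

```python
from statistics import mean

def two_way_anova(cells):
    """Sums of squares, mean squares, and F statistics for a balanced design.

    cells[i][j] holds the replicate observations for level i of factor A
    and level j of factor B; every cell must have the same length.
    """
    a, b, n_rep = len(cells), len(cells[0]), len(cells[0][0])
    all_values = [x for row in cells for cell in row for x in cell]
    grand = mean(all_values)
    a_means = [mean([x for cell in row for x in cell]) for row in cells]
    b_means = [mean([x for row in cells for x in row[j]]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    ss = {
        "SST": sum((x - grand) ** 2 for x in all_values),
        "SSA": b * n_rep * sum((m - grand) ** 2 for m in a_means),
        "SSB": a * n_rep * sum((m - grand) ** 2 for m in b_means),
        "SSAB": n_rep * sum(
            (cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
            for i in range(a) for j in range(b)
        ),
        "SSE": sum(
            (x - cell_means[i][j]) ** 2
            for i in range(a) for j in range(b) for x in cells[i][j]
        ),
    }
    ms = {
        "MSA": ss["SSA"] / (a - 1),
        "MSB": ss["SSB"] / (b - 1),
        "MSAB": ss["SSAB"] / ((a - 1) * (b - 1)),
        "MSE": ss["SSE"] / (a * b * (n_rep - 1)),
    }
    # Each effect is tested against the error mean square
    f_stats = {key: ms["MS" + key] / ms["MSE"] for key in ("A", "B", "AB")}
    return ss, ms, f_stats

# Hypothetical 2 x 2 design with 2 replications per cell
cells = [[[4, 6], [8, 10]],
         [[6, 8], [14, 16]]]
ss, ms, f_stats = two_way_anova(cells)
print(ss)
print(ms)
print(f_stats)
```

As in the one-way case, the component sums of squares should add up to SST, which gives a simple check on the arithmetic.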
Interpreting Two-Way ANOVA Results
First, test for statistical significance of the interaction effect.
If interaction is significant, focus analysis on the interaction.
If interaction is not significant, focus on main effects (Factor A and Factor B).
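The order of interpretation can be written as a small decision routine. The sketch below is illustrative only: the function name, the use of p-values as inputs, and the default significance level of 0.05 are assumptions.

```python
def interpret_two_way(p_interaction, p_a, p_b, alpha=0.05):
    """Return interpretation notes, checking the interaction effect first."""
    if p_interaction < alpha:
        # A significant interaction means the effect of one factor depends
        # on the level of the other, so main effects are not read alone.
        return ["Interaction is significant: focus the analysis on the interaction."]
    notes = ["Interaction is not significant: examine the main effects."]
    for name, p in (("Factor A", p_a), ("Factor B", p_b)):
        verdict = "significant" if p < alpha else "not significant"
        notes.append(f"{name} main effect is {verdict}.")
    return notes

# Hypothetical p-values
for line in interpret_two_way(p_interaction=0.40, p_a=0.01, p_b=0.20):
    print(line)
```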
Two-Way ANOVA Summary Table
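| Source of Variation | Degrees of Freedom | Sum of Squares | Mean Square | F Statistic |
|---|---|---|---|---|
| Factor A | a - 1 | SSA | MSA = SSA / (a - 1) | F_STAT = MSA / MSE |
| Factor B | b - 1 | SSB | MSB = SSB / (b - 1) | F_STAT = MSB / MSE |
| Interaction | (a-1)(b-1) | SSAB | MSAB = SSAB / [(a-1)(b-1)] | F_STAT = MSAB / MSE |
| Error | ab(n'-1) | SSE | MSE = SSE / [ab(n'-1)] | |
| Total | abn' - 1 | SST | | |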
Key Features and Summary
ANOVA is essential for comparing means across multiple groups in business statistics.
One-way ANOVA tests for differences among group means for a single factor; two-way ANOVA extends this to two factors and their interaction.
Assumptions of normality, equal variances, and random sampling are critical for valid inference.
Post-hoc tests like Tukey-Kramer help identify which means differ after a significant ANOVA result.