When conducting a one-way ANOVA test, rejecting the null hypothesis indicates that at least one group mean differs from the others. However, this result does not specify which means are different. To identify the specific pairs of means that differ, a follow-up procedure called a post hoc test is used. One common post hoc test is the Bonferroni test, which compares pairs of means individually while controlling for the increased risk of Type I error due to multiple comparisons.
The Bonferroni test involves breaking down the groups into all possible pairs and performing a series of two-sample t-tests. For example, if there are three groups (such as grades 10, 11, and 12), the pairs tested would be (10 vs. 11), (11 vs. 12), and (10 vs. 12). Each pair is tested with the null hypothesis that the two means are equal, and the alternative hypothesis that they are not equal, making it a two-tailed test.
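To make the pairing step concrete, the following minimal sketch lists every pair for the three-grade example and states the hypotheses for each. With \(k\) groups there are \(k(k-1)/2\) pairs. The group labels are illustrative placeholders only.

```python
from itertools import combinations
from math import comb

# Illustrative group labels taken from the three-grade example.
groups = ["grade 10", "grade 11", "grade 12"]

# Every unordered pair of groups gets its own two-sample t-test.
pairs = list(combinations(groups, 2))

for a, b in pairs:
    print(f"H0: mean({a}) = mean({b})   vs   H1: mean({a}) != mean({b})")

# The number of pairs is k(k-1)/2, i.e. "k choose 2".
n_pairs = comb(len(groups), 2)
print("number of comparisons:", n_pairs)
```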
Key values needed for the Bonferroni test come from the ANOVA output, including the Mean Square Error (MSE), which represents the variance within groups. This MSE is used as the estimate of variance in the t-test calculations. Additionally, the total sample size (\(N\)), the number of groups (\(k\)), and the degrees of freedom for error (\(df = N - k\)) are essential for determining the test statistics and p-values.
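When the ANOVA table is not at hand, these quantities can be recomputed from the raw observations. The sketch below assumes the data are available as one list per group. The function name `anova_error_terms` and the sample values are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean

def anova_error_terms(samples):
    """Return (MSE, df_error) for a dict mapping group name -> observations."""
    total_n = sum(len(values) for values in samples.values())  # N
    k = len(samples)                                           # number of groups
    # Sum of squared deviations of each observation from its own group mean.
    sse = 0.0
    for values in samples.values():
        group_mean = mean(values)
        sse += sum((x - group_mean) ** 2 for x in values)
    df_error = total_n - k
    return sse / df_error, df_error

# Hypothetical data, invented purely for illustration.
samples = {
    "grade 10": [1, 2, 3],
    "grade 11": [2, 3, 4],
    "grade 12": [4, 5, 6],
}
mse, df_error = anova_error_terms(samples)
print("MSE =", mse, " df_error =", df_error)
```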
The test statistic for each pair is calculated using the formula:
\[t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{MSE \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}\]
where \(\bar{x}_1\) and \(\bar{x}_2\) are the sample means of the two groups, \(n_1\) and \(n_2\) are their respective sample sizes, and MSE is the mean square error from the ANOVA.
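The formula translates directly into a few lines of code. This is a minimal sketch; the function name `pairwise_t` and the numbers passed to it are hypothetical.

```python
from math import sqrt

def pairwise_t(mean1, mean2, n1, n2, mse):
    """t statistic for one pair, using the pooled MSE from the ANOVA."""
    standard_error = sqrt(mse * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error

# Hypothetical inputs: two group means, 3 observations each, MSE = 1.0.
t_value = pairwise_t(mean1=2.0, mean2=5.0, n1=3, n2=3, mse=1.0)
print(f"t = {t_value:.4f}")
```

The sign of the result only reflects which group is listed first. The two-tailed test depends on the absolute value.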
After calculating the t-value, the corresponding p-value is found from the t-distribution with the error degrees of freedom from the ANOVA (\(df = N - k\)). Since the test is two-tailed, the tail area beyond \(|t|\) is doubled to give the p-value. However, because multiple pairwise comparisons increase the chance of falsely detecting a difference (Type I error), the Bonferroni correction adjusts each p-value by multiplying it by the number of comparisons (pairs); an adjusted value above 1 is reported as 1. Alternatively, the significance level \(\alpha\) can be divided by the number of pairs and compared with the unadjusted p-values. Both methods lead to the same decision.
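The p-value step can be sketched with the standard library alone. The version below integrates the t density numerically with Simpson's rule, which is an illustrative shortcut rather than a recommended method. In practice a statistics library routine (for example `scipy.stats.t.sf`) would normally be used. All function names here are hypothetical, and capping the adjusted p-value at 1 follows the usual convention.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t_value, df, steps=2000):
    """Approximate P(|T| >= |t|) via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t_value)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3          # P(0 <= T <= |t|)
    return max(0.0, 1 - 2 * central_area) # both tails together

def bonferroni_adjust(p_value, n_comparisons):
    """Multiply by the number of comparisons, capped at 1."""
    return min(1.0, p_value * n_comparisons)

# Hypothetical inputs: a t-value with 6 error degrees of freedom, 3 pairs.
p_raw = two_tailed_p(-3.674, df=6)
p_adj = bonferroni_adjust(p_raw, n_comparisons=3)
print(f"raw p = {p_raw:.4f}, Bonferroni-adjusted p = {p_adj:.4f}")
```

Because the p-value is obtained as one minus an integral, this sketch loses precision for very large \(|t|\); it is meant to show the logic, not to replace a library implementation.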
For example, with three groups, there are three pairs, so each p-value is multiplied by 3. If the adjusted p-value is less than the original significance level (commonly 0.05), the null hypothesis for that pair is rejected, indicating a significant difference between those two means.
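The decision rule for the three-pair case can be written out as follows. The raw p-values are invented for illustration and do not come from any real data set. The sketch applies both forms of the rule (adjusting the p-value, or adjusting \(\alpha\)) to show that they agree.

```python
alpha = 0.05

# Hypothetical unadjusted two-tailed p-values, one per pair.
raw_p = {
    ("grade 10", "grade 11"): 0.010,
    ("grade 11", "grade 12"): 0.030,
    ("grade 10", "grade 12"): 0.200,
}
m = len(raw_p)  # number of comparisons (3 here)

decisions = {}
for pair, p in raw_p.items():
    adjusted_p = min(1.0, p * m)            # rule 1: adjust the p-value
    reject_by_p = adjusted_p < alpha
    reject_by_alpha = p < alpha / m         # rule 2: adjust alpha instead
    decisions[pair] = (reject_by_p, reject_by_alpha)
    verdict = "reject H0" if reject_by_p else "fail to reject H0"
    print(f"{pair[0]} vs {pair[1]}: adjusted p = {adjusted_p:.3f} -> {verdict}")
```

With these made-up values only the first pair is declared significantly different, and both forms of the rule give the same answer for every pair.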
In practice, this method can be tedious by hand because every pair requires its own calculation, but it provides a rigorous way to pinpoint which specific group means differ after an overall ANOVA indicates a difference exists. The calculations are simplest when sample sizes are equal, because the standard error \(\sqrt{2\,MSE/n}\) is then the same for every pair, but the test remains applicable with unequal sample sizes as well.
