When an ANOVA test rejects the null hypothesis, it indicates that at least one group mean differs from the others. It does not, however, identify which means differ, so a post hoc test is needed. One such test is the Tukey-Kramer test, which makes pairwise comparisons between group means to identify the specific differences.
The Tukey-Kramer test compares every possible pair of means. With three groups, for instance, there are three pairs to evaluate. Although this may seem like a lot of work, the first few steps are shared by all comparisons, and each individual comparison resembles the t-tests covered earlier.
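The count of pairs follows the usual combinations rule: k groups give k(k - 1)/2 comparisons. A minimal Python sketch of that enumeration is shown here; the group labels are placeholders chosen for illustration.

```python
from itertools import combinations

# Placeholder group labels, used only to illustrate the pairing.
groups = ["grade 10", "grade 11", "grade 12"]

# Every unordered pair of groups gets one Tukey-Kramer comparison.
pairs = list(combinations(groups, 2))

for first, second in pairs:
    print(f"{first} vs {second}")

# k groups always yield k * (k - 1) / 2 pairs.
k = len(groups)
pair_count = k * (k - 1) // 2
```

With four groups the same code would list six pairs, and with five groups ten, which is why the shared setup steps matter.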
To begin, ensure that the null hypothesis from the ANOVA test has been rejected, indicating that there is a difference among the means. Set the significance level (alpha) for the Tukey-Kramer test, typically 0.05. The critical value needed for the comparisons is obtained from the studentized range distribution table, also known as the q table. This table is read much like the F table used in ANOVA, but it is indexed by the number of groups being compared and by the error degrees of freedom, calculated as the total number of observations minus the number of groups. For example, with 30 observations across three groups, the degrees of freedom would be 30 - 3 = 27. Using these values, the critical value can be determined, which in this example is 3.05.
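The degrees-of-freedom step can be sketched in a few lines of Python. The function name and the even split of 30 observations into three groups of 10 are assumptions made for illustration; only the totals come from the example.

```python
def error_degrees_of_freedom(group_sizes):
    """Return N - k: total observations minus the number of groups."""
    total_observations = sum(group_sizes)
    number_of_groups = len(group_sizes)
    return total_observations - number_of_groups

# Assumed split: 30 observations spread evenly over three groups.
group_sizes = [10, 10, 10]
df_error = error_degrees_of_freedom(group_sizes)
print(df_error)  # 27
```

The result depends only on the totals, so an uneven split such as 12, 9, and 9 gives the same 27 degrees of freedom.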
In the Tukey-Kramer test, each pair's test statistic, known as the q statistic, is compared against the critical value. If the absolute value of the q statistic exceeds the critical value, the null hypothesis for that pair (that the two means are equal) is rejected, indicating a significant difference between the means. Otherwise, the null hypothesis is not rejected, and the data do not show a significant difference for that pair.
The q statistic is calculated using the formula:
\[q = \frac{\left|\bar{X}_1 - \bar{X}_2\right|}{\sqrt{\frac{MSE}{2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}\]
where \(\bar{X}_1\) and \(\bar{X}_2\) are the means of the two groups being compared, \(MSE\) is the mean square error (the within-group mean square) from the ANOVA output, and \(n_1\) and \(n_2\) are the sample sizes of the respective groups.
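As a rough illustration, the calculation can be written as a small Python function. This sketch uses the standard Tukey-Kramer standard error, \(\sqrt{\frac{MSE}{2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}\), and takes the absolute difference of the means so the result can be compared directly with a positive critical value. The function name and the input numbers are made up for illustration and are not taken from the study-time example.

```python
import math

def q_statistic(mean_1, mean_2, mse, n_1, n_2):
    """Tukey-Kramer q statistic for one pair of groups."""
    # Standard error of the pair: MSE is halved, then scaled by 1/n_1 + 1/n_2.
    standard_error = math.sqrt((mse / 2.0) * (1.0 / n_1 + 1.0 / n_2))
    return abs(mean_1 - mean_2) / standard_error

# Made-up inputs: two groups of 8 with means 12 and 9, and MSE = 8.
q = q_statistic(mean_1=12.0, mean_2=9.0, mse=8.0, n_1=8, n_2=8)
print(q)  # 3.0
```

When the two groups have the same size n, the standard error reduces to \(\sqrt{MSE / n}\), which is the equal-sample-size form of Tukey's test.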
For example, when comparing the average study times of grades 10 and 11, if the calculated q statistic is 1.949, which is less than the critical value of 3.05, we fail to reject the null hypothesis: there is not enough evidence to conclude that the average study times differ. In contrast, when comparing grades 10 and 12, if the q statistic is 4.498, which exceeds the critical value, we reject the null hypothesis, indicating a significant difference in average study times.
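The decision rule for these two comparisons can be sketched as follows. The q statistics and the critical value are the ones quoted above; the function name and the dictionary layout are illustrative choices.

```python
def rejects_null(q_stat, q_critical):
    """Return True when the pair's null hypothesis is rejected."""
    return abs(q_stat) > q_critical

q_critical = 3.05  # critical value quoted in the text

# q statistics quoted in the text for two of the pairs.
comparisons = {
    ("grade 10", "grade 11"): 1.949,
    ("grade 10", "grade 12"): 4.498,
}

results = {pair: rejects_null(q, q_critical) for pair, q in comparisons.items()}

for pair, reject in results.items():
    verdict = "reject H0" if reject else "fail to reject H0"
    print(f"{pair[0]} vs {pair[1]}: {verdict}")
```

A third entry for grades 11 and 12 would be handled the same way once its q statistic is computed.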
By systematically applying the Tukey-Kramer test to each pair of means, one can determine which specific means differ, clarifying the result of the initial ANOVA.