Two-Sample Tests and One-Way ANOVA: Comparing Means, Proportions, and Variances
Two-Sample Tests and One-Way ANOVA
Introduction
This chapter covers statistical methods for comparing the means, proportions, and variances of two or more populations. These methods are essential for business decision-making, allowing analysts to determine if observed differences are statistically significant or due to random variation.
Two-Sample Tests
Overview of Two-Sample Tests
Population Means, Independent Samples: Compare means from two unrelated groups (e.g., Group 1 vs. Group 2).
Population Means, Related Samples: Compare means from the same group before and after treatment (paired or matched samples).
Population Proportions: Compare proportions from two groups (e.g., Proportion 1 vs. Proportion 2).
Population Variances: Compare variances from two groups (e.g., Variance 1 vs. Variance 2).
Difference Between Two Means
Independent Samples
To test hypotheses or form confidence intervals for the difference between two population means ($\mu_1 - \mu_2$), use:
Pooled-Variance t Test: When population variances are unknown but assumed equal.
Separate-Variance t Test: When population variances are unknown and not assumed equal.
The point estimate for the difference is $\bar{X}_1 - \bar{X}_2$.
Assumptions for Independent Samples
Samples are randomly and independently drawn.
Populations are normally distributed or both sample sizes are at least 30.
For pooled-variance t test: Population variances are assumed equal.
For separate-variance t test: Population variances are not assumed equal.
Hypothesis Tests for Two Population Means
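Lower-tail test: $H_0: \mu_1 - \mu_2 \ge 0$ vs. $H_1: \mu_1 - \mu_2 < 0$
Upper-tail test: $H_0: \mu_1 - \mu_2 \le 0$ vs. $H_1: \mu_1 - \mu_2 > 0$
Two-tail test: $H_0: \mu_1 - \mu_2 = 0$ vs. $H_1: \mu_1 - \mu_2 \ne 0$
Reject $H_0$ if the test statistic falls in the critical region determined by the significance level $\alpha$.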
Pooled-Variance t Test
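Pooled Variance: $S_p^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$
Test Statistic: $t_{STAT} = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
Degrees of freedom: $n_1 + n_2 - 2$
Confidence Interval for $\mu_1 - \mu_2$ (Pooled Variance)
$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$
Where $t_{\alpha/2}$ is the critical value from the t-distribution with $n_1 + n_2 - 2$ degrees of freedom.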
Example: Pooled-Variance t Test
Sales Location | Sample Mean ($\bar{X}$) | Sample Variance ($S^2$) | n |
|---|---|---|---|
Special Front | 246.4 | 42.5420 | 10 |
In-Aisle | 202.3 | 32.5271 | 10 |
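Test statistic: $t_{STAT}$ from the pooled-variance formula, with $n_1 + n_2 - 2 = 10 + 10 - 2 = 18$ degrees of freedom.
Critical value at significance level $\alpha$: $\pm t_{\alpha/2}$ from the t-distribution with 18 degrees of freedom (two-tail test).
Decision: Reject $H_0$; there is evidence of a difference in means.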
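The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the third table column is used as variances, exactly as the table labels it, the significance level is assumed to be 0.05 (two-tail), and 2.1009 is the tabled t value for 18 degrees of freedom. The function name `pooled_t` is made up for this sketch.

```python
import math

def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Return (pooled variance, t statistic, df) for H0: mu1 - mu2 = 0."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return sp2, (mean1 - mean2) / se, df

# Summary statistics from the table (third column treated as variances).
sp2, t_stat, df = pooled_t(246.4, 42.5420, 10, 202.3, 32.5271, 10)

# Assumption: alpha = 0.05, two-tail; tabled critical t for 18 df.
T_CRIT = 2.1009
reject = abs(t_stat) > T_CRIT

# Confidence interval for mu1 - mu2 at the same assumed level.
half_width = T_CRIT * math.sqrt(sp2 * (1 / 10 + 1 / 10))
diff = 246.4 - 202.3
ci = (diff - half_width, diff + half_width)

print(f"Sp^2 = {sp2:.4f}, t = {t_stat:.4f}, df = {df}, reject H0: {reject}")
print(f"CI for mu1 - mu2: ({ci[0]:.2f}, {ci[1]:.2f})")
```

If the third column actually holds standard deviations rather than variances, the inputs would need to be squared before calling `pooled_t`.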
Separate-Variance t Test
Used when population variances are unknown and not assumed equal.
Test statistic and degrees of freedom are calculated using software.
Example: Comparing dividend yields between NYSE and NASDAQ stocks.
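For readers curious what the software computes, here is a minimal sketch of the separate-variance statistic, using the Welch-Satterthwaite approximation for the degrees of freedom. The dividend yields below are hypothetical values invented for illustration; they are not the NYSE/NASDAQ data from the example, and `separate_variance_t` is a made-up helper name.

```python
import math
import statistics

def separate_variance_t(sample1, sample2):
    """Return (t statistic, approximate df) without assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    v1 = statistics.variance(sample1) / n1  # S1^2 / n1
    v2 = statistics.variance(sample2) / n2  # S2^2 / n2
    t_stat = (statistics.mean(sample1) - statistics.mean(sample2)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation; usually not a whole number.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df

# Hypothetical dividend yields (percent), for illustration only.
nyse = [2.1, 3.4, 1.8, 2.9, 3.7, 2.5]
nasdaq = [1.2, 0.9, 2.8, 1.5, 0.4]

t_stat, df = separate_variance_t(nyse, nasdaq)
print(f"t = {t_stat:.4f}, approximate df = {df:.2f}")
```

The approximate degrees of freedom always fall between the smaller of $n_1 - 1$ and $n_2 - 1$ and the pooled value $n_1 + n_2 - 2$.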
Related Populations: The Paired Difference Test
Paired Samples
Used for matched or paired samples (e.g., before/after measurements).
Eliminates variation among subjects by focusing on differences within pairs.
Assumptions: Differences are normally distributed or sample size is large.
Test Statistic for Paired Difference
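Let $D_i$ be the difference for pair $i$.
Sample mean of differences: $\bar{D} = \dfrac{\sum_{i=1}^{n} D_i}{n}$
Sample standard deviation: $S_D = \sqrt{\dfrac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$
Test statistic: $t_{STAT} = \dfrac{\bar{D} - \mu_D}{S_D / \sqrt{n}}$
Degrees of freedom: $n - 1$
Confidence Interval for Paired Difference
$\bar{D} \pm t_{\alpha/2} \dfrac{S_D}{\sqrt{n}}$, where $t_{\alpha/2}$ has $n - 1$ degrees of freedom.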
Example: Paired Difference Test
Item | Costco | Walmart | Difference (Costco - Walmart) |
|---|---|---|---|
Chicken Broth | 5.98 | 5.88 | 0.10 |
Ice Cream | 8.59 | 7.19 | 1.40 |
Dishwasher Detergent | 9.00 | 17.00 | -8.00 |
Laundry Detergent | 11.00 | 12.00 | -1.00 |
Paper Towels | 1.47 | 2.09 | -0.62 |
Toilet Paper | 12.00 | 27.00 | -15.00 |
Facial Tissues | 1.23 | 1.12 | 0.11 |
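The paired-difference calculation for the seven items in the table can be sketched as follows. The prices come straight from the table; the hypothesized mean difference of 0, the significance level of 0.05 (two-tail), and the tabled critical value 2.4469 for 6 degrees of freedom are assumptions added for illustration.

```python
import math
import statistics

costco = [5.98, 8.59, 9.00, 11.00, 1.47, 12.00, 1.23]
walmart = [5.88, 7.19, 17.00, 12.00, 2.09, 27.00, 1.12]

# Work with within-pair differences only (Costco minus Walmart).
diffs = [c - w for c, w in zip(costco, walmart)]
n = len(diffs)
d_bar = statistics.mean(diffs)   # sample mean of differences
s_d = statistics.stdev(diffs)    # sample standard deviation (n - 1 divisor)
t_stat = (d_bar - 0) / (s_d / math.sqrt(n))  # H0: mu_D = 0 (assumed)
df = n - 1

# Assumption: alpha = 0.05, two-tail; tabled critical t for 6 df.
T_CRIT = 2.4469
reject = abs(t_stat) > T_CRIT

print(f"D-bar = {d_bar:.4f}, S_D = {s_d:.4f}, t = {t_stat:.4f}, df = {df}")
print(f"Reject H0 at the assumed level: {reject}")
```

Two large negative differences (dishwasher detergent and toilet paper) dominate the standard deviation, which is worth keeping in mind given the normality assumption and the small sample.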
Two Population Proportions
Testing the Difference Between Proportions
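Goal: Test a hypothesis or form a confidence interval for the difference between two population proportions, $\pi_1 - \pi_2$.
Assumptions: $n_1\pi_1 \ge 5$, $n_1(1 - \pi_1) \ge 5$, $n_2\pi_2 \ge 5$, $n_2(1 - \pi_2) \ge 5$
Pooled estimate for overall proportion: $\bar{p} = \dfrac{X_1 + X_2}{n_1 + n_2}$, where $X_1$ and $X_2$ are the numbers of items of interest in samples 1 and 2.
Test statistic: $Z_{STAT} = \dfrac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$, where $p_1 = X_1/n_1$ and $p_2 = X_2/n_2$.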
Hypothesis Tests for Two Proportions
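Lower-tail test: $H_0: \pi_1 - \pi_2 \ge 0$ vs. $H_1: \pi_1 - \pi_2 < 0$
Upper-tail test: $H_0: \pi_1 - \pi_2 \le 0$ vs. $H_1: \pi_1 - \pi_2 > 0$
Two-tail test: $H_0: \pi_1 - \pi_2 = 0$ vs. $H_1: \pi_1 - \pi_2 \ne 0$
Confidence Interval for Two Proportions
$(p_1 - p_2) \pm Z_{\alpha/2}\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$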
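A short sketch of both calculations: the Z test uses the pooled proportion in its standard error, while the confidence interval uses the two sample proportions separately. The counts are hypothetical, and the function names are invented for this illustration; only Python's standard library is used.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Z statistic and two-tail p-value for H0: pi1 - pi2 = 0 (pooled SE)."""
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for pi1 - pi2 (unpooled SE)."""
    p1, p2 = x1 / n1, x2 / n2
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - z_crit * se, (p1 - p2) + z_crit * se

# Hypothetical counts: 60 of 100 in sample 1, 45 of 100 in sample 2.
z, p_value = two_proportion_z(60, 100, 45, 100)
low, high = two_proportion_ci(60, 100, 45, 100)
print(f"Z = {z:.4f}, p-value = {p_value:.4f}")
print(f"95% CI for pi1 - pi2: ({low:.4f}, {high:.4f})")
```

The pooled estimate is appropriate for the test because the null hypothesis asserts a single common proportion; the interval makes no such assumption.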
Comparing Two Population Variances
F Test for Equality of Variances
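Hypotheses: $H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_1: \sigma_1^2 \ne \sigma_2^2$
Test statistic: $F_{STAT} = \dfrac{S_1^2}{S_2^2}$ (larger variance in numerator)
Degrees of freedom: $n_1 - 1$ (numerator), $n_2 - 1$ (denominator)
Compare the calculated $F_{STAT}$ to the critical value from the F-distribution table.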
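A minimal sketch of the variance-ratio calculation, written so that the larger sample variance always ends up in the numerator and the degrees of freedom follow it. The two samples are hypothetical, and `variance_ratio_f` is a made-up helper name; the critical value would still come from an F table or software.

```python
import statistics

def variance_ratio_f(sample_a, sample_b):
    """Return (F statistic, numerator df, denominator df), larger variance on top."""
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

# Hypothetical samples, for illustration only.
group_1 = [11, 12, 13, 14, 15, 16]
group_2 = [10, 12, 14, 16, 18]

f_stat, df_num, df_den = variance_ratio_f(group_1, group_2)
print(f"F = {f_stat:.4f} with ({df_num}, {df_den}) degrees of freedom")
```

Because the larger variance is forced into the numerator, only the upper-tail critical value is needed.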
One-Way Analysis of Variance (ANOVA)
Purpose and Design
Used to compare means of three or more groups.
Assumptions: Populations are normally distributed, have equal variances, and samples are randomly and independently selected.
Completely randomized design: Subjects are randomly assigned to groups.
Hypotheses for One-Way ANOVA
Null hypothesis ($H_0$): All population means are equal ($\mu_1 = \mu_2 = \cdots = \mu_c$).
Alternative hypothesis ($H_1$): At least one population mean is different.
Partitioning the Variation
Total Sum of Squares (SST): Total variation among all data values.
Sum of Squares Among Groups (SSA): Variation among group means.
Sum of Squares Within Groups (SSW): Variation within each group.
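Relationship: $SST = SSA + SSW$
Formulas
$SST = \sum_{j=1}^{c}\sum_{i=1}^{n_j}(X_{ij} - \bar{\bar{X}})^2$, where $\bar{\bar{X}}$ is the grand mean of all $n$ values and $c$ is the number of groups.
$SSA = \sum_{j=1}^{c} n_j(\bar{X}_j - \bar{\bar{X}})^2$, where $\bar{X}_j$ and $n_j$ are the sample mean and sample size of group $j$.
$SSW = \sum_{j=1}^{c}\sum_{i=1}^{n_j}(X_{ij} - \bar{X}_j)^2$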
Mean Squares and F Statistic
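Mean Square Among: $MSA = \dfrac{SSA}{c - 1}$
Mean Square Within: $MSW = \dfrac{SSW}{n - c}$
F Statistic: $F_{STAT} = \dfrac{MSA}{MSW}$
Degrees of freedom: $c - 1$ (numerator), $n - c$ (denominator)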
Interpreting the F Statistic
If $F_{STAT}$ is greater than the critical value $F_{\alpha}$ from the F-distribution, reject $H_0$.
Conclusion when $H_0$ is rejected: At least one group mean is different.
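The partition of variation and the F statistic can be sketched directly from the definitions of SSA, SSW, and SST. The three groups below are hypothetical numbers chosen so the arithmetic is easy to follow by hand, and `one_way_anova` is an illustrative helper, not a library function.

```python
import statistics

def one_way_anova(groups):
    """Return the one-way ANOVA table entries for a list of samples."""
    c = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n

    # Among-group, within-group, and total sums of squares.
    ssa = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ssw = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    sst = sum((x - grand_mean) ** 2 for g in groups for x in g)

    msa = ssa / (c - 1)
    msw = ssw / (n - c)
    return {"SSA": ssa, "SSW": ssw, "SST": sst,
            "MSA": msa, "MSW": msw, "F": msa / msw,
            "df": (c - 1, n - c)}

# Hypothetical data: three groups of three observations each.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
for name, value in table.items():
    print(name, value)
```

For these numbers the group means are 2, 3, and 6, so most of the total variation sits among the groups rather than within them, and the F statistic is correspondingly large.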
Assumptions for ANOVA
Randomness and independence of samples.
Normality of populations.
Homogeneity of variances (can be tested with Levene's Test).
When Assumptions Are Violated
If only normality is violated: Use Kruskal-Wallis rank test.
If only equal variance is violated: Use separate-variance procedures.
If both are violated: Data transformation is needed.
Levene's Test for Homogeneity of Variance
Tests whether group variances are equal.
Null hypothesis: All group variances are equal.
Procedure: Compute absolute deviations from group medians and perform ANOVA on these values.
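The procedure just described can be sketched in two steps: replace each value by its absolute deviation from its group median, then run an ordinary one-way ANOVA F calculation on those deviations. The data are hypothetical and the helper names are invented for this sketch.

```python
import statistics

def anova_f(groups):
    """One-way ANOVA F statistic (MSA / MSW) for a list of samples."""
    c = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ssa = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ssw = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    return (ssa / (c - 1)) / (ssw / (n - c))

def levene_f(groups):
    """Levene statistic: ANOVA on absolute deviations from group medians."""
    abs_dev = [[abs(x - statistics.median(g)) for x in g] for g in groups]
    return anova_f(abs_dev)

# Hypothetical data with visibly different spreads.
groups = [[1, 2, 3], [2, 4, 6], [1, 5, 9]]
f_levene = levene_f(groups)
print(f"Levene F = {f_levene:.4f} with ({len(groups) - 1}, "
      f"{sum(len(g) for g in groups) - len(groups)}) degrees of freedom")
```

A large F here would be evidence against equal variances; a small one gives no reason to doubt the equal-variance assumption.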
Post-Hoc Comparisons: Tukey-Kramer Procedure
Used after a significant ANOVA F test to determine which means differ.
Compares absolute mean differences to a critical range based on the studentized range distribution.
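A sketch of the pairwise comparison, assuming the usual critical range $Q_{\alpha}\sqrt{\dfrac{MSW}{2}\left(\dfrac{1}{n_j} + \dfrac{1}{n_{j'}}\right)}$, where $Q_{\alpha}$ is the upper-tail critical value of the studentized range distribution with $c$ and $n - c$ degrees of freedom. The data are hypothetical, `tukey_kramer` is a made-up helper, and the value 4.34 is a tabled $Q$ for $c = 3$, $n - c = 6$ at an assumed $\alpha = 0.05$; it is passed in rather than computed.

```python
import math
import statistics
from itertools import combinations

def tukey_kramer(groups, q_crit):
    """Compare every pair of group means against its critical range."""
    c = len(groups)
    n = sum(len(g) for g in groups)
    msw = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g) / (n - c)

    results = []
    for (j, gj), (k, gk) in combinations(enumerate(groups), 2):
        abs_diff = abs(statistics.mean(gj) - statistics.mean(gk))
        crit_range = q_crit * math.sqrt(msw / 2 * (1 / len(gj) + 1 / len(gk)))
        results.append((j, k, abs_diff, crit_range, abs_diff > crit_range))
    return results

# Hypothetical data; q_crit taken from a studentized range table (assumed).
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
results = tukey_kramer(groups, q_crit=4.34)
for j, k, abs_diff, crit_range, differs in results:
    print(f"groups {j} vs {k}: |diff| = {abs_diff:.2f}, "
          f"critical range = {crit_range:.4f}, different: {differs}")
```

Passing the $Q$ value in as a parameter keeps the sketch honest about where it comes from: a table lookup or statistical software, not this code.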
Summary Table: One-Way ANOVA
Source | Sum of Squares | Degrees of Freedom | Mean Square | F |
|---|---|---|---|---|
Among Groups | SSA | c-1 | MSA | MSA/MSW |
Within Groups | SSW | n-c | MSW | |
Total | SST | n-1 | | |
Chapter Summary
Compared means and proportions of two independent populations.
Compared means of two related populations.
Compared variances of two independent populations.
Compared means of more than two populations using one-way ANOVA, and their variances using Levene's test.