Two-Sample Tests and One-Way ANOVA: Comparing Means, Proportions, and Variances

Introduction

This chapter covers statistical methods for comparing the means, proportions, and variances of two or more populations. These methods are essential for business decision-making, allowing analysts to determine if observed differences are statistically significant or due to random variation.

Two-Sample Tests

Overview of Two-Sample Tests

Population Means, Independent Samples: Compare means from two unrelated groups (e.g., Group 1 vs. Group 2).
Population Means, Related Samples: Compare means from the same group before and after treatment (paired or matched samples).
Population Proportions: Compare proportions from two groups (e.g., Proportion 1 vs. Proportion 2).
Population Variances: Compare variances from two groups (e.g., Variance 1 vs. Variance 2).

Difference Between Two Means

Independent Samples

To test hypotheses or form confidence intervals for the difference between two population means ($\mu_1 - \mu_2$), use:

Pooled-Variance t Test: When population variances are unknown but assumed equal.
Separate-Variance t Test: When population variances are unknown and not assumed equal.

The point estimate for the difference is $\bar{X}_1 - \bar{X}_2$.

Assumptions for Independent Samples

Samples are randomly and independently drawn.
Populations are normally distributed or both sample sizes are at least 30.
For pooled-variance t test: Population variances are assumed equal.
For separate-variance t test: Population variances are not assumed equal.

Hypothesis Tests for Two Population Means

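Lower-tail test: $H_0: \mu_1 - \mu_2 \ge 0$ vs. $H_1: \mu_1 - \mu_2 < 0$
Upper-tail test: $H_0: \mu_1 - \mu_2 \le 0$ vs. $H_1: \mu_1 - \mu_2 > 0$
Two-tail test: $H_0: \mu_1 - \mu_2 = 0$ vs. $H_1: \mu_1 - \mu_2 \ne 0$

Reject $H_0$ if the test statistic falls in the critical region determined by the significance level $\alpha$.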

Pooled-Variance t Test

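Pooled Variance: $S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{(n_1 - 1) + (n_2 - 1)}$
Test Statistic: $t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
Degrees of freedom: $n_1 + n_2 - 2$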

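Confidence Interval for $\mu_1 - \mu_2$ (Pooled Variance)

$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2} \sqrt{S_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$

Where $t_{\alpha/2}$ is the critical value from the t-distribution with $n_1 + n_2 - 2$ degrees of freedom.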

Example: Pooled-Variance t Test

Sales Location	Sample Mean ($\bar{X}$)	Sample Variance ($S^2$)	n
Special Front	246.4	42.5420	10
In-Aisle	202.3	32.5271	10

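Test statistic: $t_{STAT} = \frac{(246.4 - 202.3) - 0}{\sqrt{S_p^2 \left(\frac{1}{10} + \frac{1}{10}\right)}}$
Critical value at significance level $\alpha$: $\pm t_{\alpha/2}$ with $10 + 10 - 2 = 18$ degrees of freedom (two-tail test)
Decision: Reject $H_0$; there is evidence of a difference in means.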
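
The pooled-variance calculation for this example can be sketched in a few lines of Python. This is a minimal illustration, not the source's own computation: the function name `pooled_t` is hypothetical, and because the magnitudes in the third column could be read either as variances (as the header labels them) or as standard deviations, the sketch evaluates both readings.

```python
import math

def pooled_t(mean1, var1, n1, mean2, var2, n2, hypothesized_diff=0.0):
    """Return (pooled variance, t statistic, degrees of freedom)."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    # Standard error of the difference in sample means
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t_stat = ((mean1 - mean2) - hypothesized_diff) / se
    return sp2, t_stat, df

# Reading 1: third column holds variances, as the table header says.
sp2_var, t_var, df = pooled_t(246.4, 42.5420, 10, 202.3, 32.5271, 10)

# Reading 2 (an assumption): third column holds standard deviations,
# so each value is squared before pooling.
sp2_sd, t_sd, _ = pooled_t(246.4, 42.5420 ** 2, 10, 202.3, 32.5271 ** 2, 10)

print(f"degrees of freedom = {df}")
print(f"values read as variances:           t = {t_var:.4f}")
print(f"values read as standard deviations: t = {t_sd:.4f}")
```

If the significance level is 0.05 (an assumption; the notes do not state it), the two-tail critical value with 18 degrees of freedom is about 2.1009, and the statistic exceeds it under either reading, which agrees with the decision to reject.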

Separate-Variance t Test

Used when population variances are unknown and not assumed equal.
Test statistic and degrees of freedom are calculated using software.
Example: Comparing dividend yields between NYSE and NASDAQ stocks.
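
Since the notes leave the separate-variance calculation to software, the following is a minimal sketch of what such software typically computes: the unpooled test statistic and the Welch-Satterthwaite approximation for the degrees of freedom. The function name and the dividend-yield figures are hypothetical, chosen only for illustration.

```python
import math
from statistics import mean, variance

def separate_variance_t(sample1, sample2, hypothesized_diff=0.0):
    """Return (t statistic, approximate degrees of freedom)."""
    n1, n2 = len(sample1), len(sample2)
    a = variance(sample1) / n1  # S1^2 / n1
    b = variance(sample2) / n2  # S2^2 / n2
    t_stat = ((mean(sample1) - mean(sample2)) - hypothesized_diff) / math.sqrt(a + b)
    # Welch-Satterthwaite degrees of freedom (generally not an integer)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t_stat, df

# Hypothetical dividend yields (percent); not data from the notes.
nyse_yields = [3.1, 2.8, 3.6, 2.9, 3.3]
nasdaq_yields = [1.9, 2.7, 1.2, 3.4, 2.1, 1.5]

t_stat, welch_df = separate_variance_t(nyse_yields, nasdaq_yields)
print(f"t = {t_stat:.4f}, df = {welch_df:.2f}")
```

The approximate degrees of freedom always fall between the smaller of $n_1 - 1$ and $n_2 - 1$ and the pooled value $n_1 + n_2 - 2$.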

Related Populations: The Paired Difference Test

Paired Samples

Used for matched or paired samples (e.g., before/after measurements).
Eliminates variation among subjects by focusing on differences within pairs.
Assumptions: Differences are normally distributed or sample size is large.

Test Statistic for Paired Difference

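Let $D_i$ be the difference for pair $i$.
Sample mean of differences: $\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}$
Sample standard deviation: $S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$
Test statistic: $t_{STAT} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}}$
Degrees of freedom: $n - 1$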

Confidence Interval for Paired Difference

$\bar{D} \pm t_{\alpha/2} \frac{S_D}{\sqrt{n}}$

Example: Paired Difference Test

Item	Costco	Walmart	Difference
Chicken Broth	5.98	5.88	0.10
Ice Cream	8.59	7.19	1.40
Dishwasher Detergent	9.00	17.00	-8.00
Laundry Detergent	11.00	12.00	-1.00
Paper Towels	1.47	2.09	-0.62
Toilet Paper	12.00	27.00	-15.00
Facial Tissues	1.23	1.12	0.11
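
The paired-difference statistic for the table above can be computed directly from the two price columns. The sketch below uses only the prices listed in the table; testing against a hypothesized mean difference of zero is an assumption, since the notes do not state the hypotheses for this example.

```python
import math
from statistics import mean, stdev

# Prices copied from the table, in row order.
costco = [5.98, 8.59, 9.00, 11.00, 1.47, 12.00, 1.23]
walmart = [5.88, 7.19, 17.00, 12.00, 2.09, 27.00, 1.12]

# Difference for each pair: Costco price minus Walmart price
diffs = [c - w for c, w in zip(costco, walmart)]

n = len(diffs)
d_bar = mean(diffs)   # sample mean of the differences
s_d = stdev(diffs)    # sample standard deviation (n - 1 in the denominator)
t_stat = (d_bar - 0) / (s_d / math.sqrt(n))  # assumed hypothesized mean difference of 0
df = n - 1

print(f"mean difference = {d_bar:.4f}")
print(f"std dev of differences = {s_d:.4f}")
print(f"t = {t_stat:.4f} with {df} degrees of freedom")
```

At an assumed 0.05 significance level, the two-tail critical value with 6 degrees of freedom is about 2.4469, so a statistic of roughly -1.45 would not lead to rejecting a null hypothesis of no mean price difference in this small sample.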

Two Population Proportions

Testing the Difference Between Proportions

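Goal: Test a hypothesis or form a confidence interval for the difference between two population proportions, $\pi_1 - \pi_2$.
Assumptions: Samples are independent, and each sample is large enough for the normal approximation ($n_1 \pi_1 \ge 5$, $n_1 (1 - \pi_1) \ge 5$, $n_2 \pi_2 \ge 5$, $n_2 (1 - \pi_2) \ge 5$).
Pooled estimate for overall proportion: $\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}$, where $X_1$ and $X_2$ are the numbers of items of interest in the two samples.
Test statistic: $Z_{STAT} = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p} (1 - \bar{p}) \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, where $p_1 = X_1 / n_1$ and $p_2 = X_2 / n_2$.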

Hypothesis Tests for Two Proportions

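Lower-tail test: $H_0: \pi_1 - \pi_2 \ge 0$ vs. $H_1: \pi_1 - \pi_2 < 0$
Upper-tail test: $H_0: \pi_1 - \pi_2 \le 0$ vs. $H_1: \pi_1 - \pi_2 > 0$
Two-tail test: $H_0: \pi_1 - \pi_2 = 0$ vs. $H_1: \pi_1 - \pi_2 \ne 0$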

Confidence Interval for Two Proportions

$(p_1 - p_2) \pm Z_{\alpha/2} \sqrt{\frac{p_1 (1 - p_1)}{n_1} + \frac{p_2 (1 - p_2)}{n_2}}$
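
A short sketch of the Z test and confidence interval for two proportions follows. The counts are hypothetical (45 of 100 versus 30 of 100), and the function name is illustrative. Note that the test uses the pooled proportion in its standard error, while the interval uses the two sample proportions separately.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2, alpha=0.05):
    """Return (Z statistic, two-tail p-value, (lower, upper) confidence limits)."""
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # pooled estimate of the overall proportion
    se_pooled = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    z_stat = (p1 - p2) / se_pooled  # hypothesized difference of 0
    p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
    # The confidence interval uses the unpooled standard error
    se_unpooled = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_crit * se_unpooled
    return z_stat, p_value, ((p1 - p2) - margin, (p1 - p2) + margin)

# Hypothetical counts; not data from the notes.
z_stat, p_value, (lower, upper) = two_proportion_z(45, 100, 30, 100)
print(f"Z = {z_stat:.4f}, two-tail p-value = {p_value:.4f}")
print(f"95% confidence interval for the difference: ({lower:.4f}, {upper:.4f})")
```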

Comparing Two Population Variances

F Test for Equality of Variances

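Hypotheses: $H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_1: \sigma_1^2 \ne \sigma_2^2$
Test statistic: $F_{STAT} = \frac{S_1^2}{S_2^2}$ (larger variance in numerator)
Degrees of freedom: $n_1 - 1$ (numerator), $n_2 - 1$ (denominator)
Compare the calculated $F_{STAT}$ to the upper-tail critical value $F_{\alpha/2}$ from the F-distribution table.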
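
The bookkeeping for the F test (which sample goes in the numerator, and which degrees of freedom go with it) is easy to get wrong by hand. The sketch below handles that ordering; the function name and the two samples are hypothetical, and the critical value is still looked up in an F table.

```python
from statistics import variance

def f_test_statistic(sample_a, sample_b):
    """Return (F statistic, numerator df, denominator df), larger variance on top."""
    var_a, var_b = variance(sample_a), variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    # Swap so that the larger sample variance is in the numerator
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

# Hypothetical samples; not data from the notes.
group_1 = [12.1, 11.8, 12.4, 12.0, 11.9]
group_2 = [10.5, 13.2, 11.1, 14.0, 9.8, 12.6]

f_stat, df_num, df_den = f_test_statistic(group_1, group_2)
print(f"F = {f_stat:.4f} with ({df_num}, {df_den}) degrees of freedom")
```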

One-Way Analysis of Variance (ANOVA)

Purpose and Design

Used to compare means of three or more groups.
Assumptions: Populations are normally distributed, have equal variances, and samples are randomly and independently selected.
Completely randomized design: Subjects are randomly assigned to groups.

Hypotheses for One-Way ANOVA

Null hypothesis ($H_0$): All population means are equal ($\mu_1 = \mu_2 = \cdots = \mu_c$).
Alternative hypothesis ($H_1$): At least one population mean is different.

Partitioning the Variation

Total Sum of Squares (SST): Total variation among all data values.
Sum of Squares Among Groups (SSA): Variation among group means.
Sum of Squares Within Groups (SSW): Variation within each group.
Relationship: $SST = SSA + SSW$

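Formulas

Let $X_{ij}$ be the $i$th value in group $j$, $\bar{X}_j$ the mean of group $j$, $\bar{\bar{X}}$ the grand mean, $n_j$ the size of group $j$, $c$ the number of groups, and $n$ the total number of values.

$SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2$
$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$
$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$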

Mean Squares and F Statistic

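Mean Square Among: $MSA = \frac{SSA}{c - 1}$
Mean Square Within: $MSW = \frac{SSW}{n - c}$
F Statistic: $F_{STAT} = \frac{MSA}{MSW}$
Degrees of freedom: $c - 1$ (numerator), $n - c$ (denominator)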
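
The partition of variation and the F statistic can be sketched as follows. The function name and the three small groups are hypothetical; the code simply computes SSA, SSW, and SST from their definitions and confirms that the first two add up to the third.

```python
from statistics import mean

def one_way_anova(groups):
    """Return the sums of squares, mean squares, F statistic, and degrees of freedom."""
    all_values = [x for g in groups for x in g]
    n, c = len(all_values), len(groups)
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]

    # Variation among group means, weighted by group size
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of each value around its own group mean
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    # Total variation around the grand mean
    sst = sum((x - grand_mean) ** 2 for x in all_values)

    msa = ssa / (c - 1)
    msw = ssw / (n - c)
    return {"SSA": ssa, "SSW": ssw, "SST": sst, "MSA": msa, "MSW": msw,
            "F": msa / msw, "df_among": c - 1, "df_within": n - c}

# Hypothetical data: three groups of three observations each.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
for name, value in result.items():
    print(f"{name} = {value}")
```

For these made-up groups, SSA is 26 and SSW is 6, so SST is 32 and the F statistic is 13 with 2 and 6 degrees of freedom.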

Interpreting the F Statistic

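If $F_{STAT}$ is greater than the upper-tail critical value $F_{\alpha}$ from the F-distribution with $c - 1$ and $n - c$ degrees of freedom, reject $H_0$.
Conclusion when $H_0$ is rejected: At least one group mean is different.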

Assumptions for ANOVA

Randomness and independence of samples.
Normality of populations.
Homogeneity of variances (can be tested with Levene's Test).

When Assumptions Are Violated

If only normality is violated: Use Kruskal-Wallis rank test.
If only equal variance is violated: Use separate-variance procedures.
If both are violated: Data transformation is needed.

Levene's Test for Homogeneity of Variance

Tests whether group variances are equal.
Null hypothesis: All group variances are equal.
Procedure: Compute absolute deviations from group medians and perform ANOVA on these values.
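
The procedure described above (absolute deviations from group medians, followed by a one-way ANOVA on those deviations) can be sketched as below. The function names and data are hypothetical, and the resulting F statistic is compared with an F critical value with $c - 1$ and $n - c$ degrees of freedom.

```python
from statistics import mean, median

def anova_f(groups):
    """One-way ANOVA F statistic with its degrees of freedom."""
    all_values = [x for g in groups for x in g]
    n, c = len(all_values), len(groups)
    grand_mean = mean(all_values)
    ssa = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ssa / (c - 1)) / (ssw / (n - c)), c - 1, n - c

def levene_f(groups):
    """Levene's test statistic using absolute deviations from group medians."""
    abs_devs = [[abs(x - median(g)) for x in g] for g in groups]
    return anova_f(abs_devs)

# Hypothetical data: the third group is visibly more spread out.
f_stat, df_num, df_den = levene_f([[1, 2, 3], [2, 4, 6], [0, 5, 10]])
print(f"Levene F = {f_stat:.4f} with ({df_num}, {df_den}) degrees of freedom")
```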

Post-Hoc Comparisons: Tukey-Kramer Procedure

Used after a significant ANOVA F test to determine which means differ.
Compares absolute mean differences to a critical range based on the studentized range distribution.
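
A minimal sketch of the pairwise comparison follows, assuming the usual Tukey-Kramer critical range $Q_{\alpha} \sqrt{\frac{MSW}{2} \left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)}$. The function name is hypothetical, the group means and MSW are made up, and `q_crit` is a placeholder standing in for a value looked up in a studentized range table for $c$ groups and $n - c$ degrees of freedom.

```python
import math
from itertools import combinations

def tukey_kramer(group_means, group_sizes, msw, q_crit):
    """Compare every pair of group means against its critical range."""
    results = []
    for j, k in combinations(range(len(group_means)), 2):
        abs_diff = abs(group_means[j] - group_means[k])
        critical_range = q_crit * math.sqrt(
            (msw / 2) * (1 / group_sizes[j] + 1 / group_sizes[k])
        )
        # A pair differs significantly when the gap exceeds the critical range
        results.append((j, k, abs_diff, critical_range, abs_diff > critical_range))
    return results

# Hypothetical inputs; q_crit is a placeholder for a table lookup.
comparisons = tukey_kramer(group_means=[2.0, 3.0, 6.0],
                           group_sizes=[3, 3, 3],
                           msw=1.0,
                           q_crit=4.34)
for j, k, abs_diff, crit, significant in comparisons:
    print(f"groups {j + 1} and {k + 1}: |difference| = {abs_diff:.2f}, "
          f"critical range = {crit:.4f}, different = {significant}")
```

With equal group sizes every pair shares the same critical range; with unequal sizes the range is recomputed for each pair, which is what distinguishes the Tukey-Kramer procedure.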

Summary Table: One-Way ANOVA

Source	Sum of Squares	Degrees of Freedom	Mean Square	F
Among Groups	SSA	c-1	MSA	MSA/MSW
Within Groups	SSW	n-c	MSW
Total	SST	n-1

Chapter Summary

Compared means and proportions of two independent populations.
Compared means of two related populations.
Compared variances of two independent populations.
Compared means and variances of more than two populations using ANOVA.