Statistical Inference: Comparing Two Means (Independent and Paired Samples)

Introduction

Statistical inference allows us to draw conclusions about population parameters based on sample data. When comparing two means, we use different methods depending on whether the samples are independent or paired. This guide covers confidence intervals and hypothesis tests for two independent sample means, as well as paired sample analysis.

Key Concepts

Two independent sample means
Confidence interval for two independent sample means
Hypothesis test for two independent sample means
Paired data and paired t-tests
Conditions and assumptions for inference

Types of Data and Appropriate Tests

Choosing the Right Test

Before conducting statistical inference, identify the type of data and the research question. The following table summarizes the main options:

Type of Data	Confidence Interval	Hypothesis Test (Yes/No Question)
Proportions/Percentages	One-proportion z-interval; two-proportion z-interval	One-proportion z-test; two-proportion z-test
Means/Averages	One-sample t-interval; two-sample t-interval	One-sample t-test; two-sample t-test; paired t-test

Comparing Two Independent Means

Example: Turtle Mass

Suppose we want to know if adult male and female turtles have different average masses. Sample data:

Mean mass of 29 female turtles: 1397 g (SD = 240 g)
Mean mass of 25 male turtles: 1548 g (SD = 285 g)

Questions:

What can we conclude about the difference in mean mass between males and females?
Is this strong evidence that the average mass of males and females is different?

Confidence Interval for the Difference of Means

A confidence interval gives a range of plausible values for the true difference in population means, computed from the sample data.

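Formula:

$(\bar{x}_1 - \bar{x}_2) \pm t^* \cdot SE(\bar{x}_1 - \bar{x}_2)$

$\bar{x}_1, \bar{x}_2$ = sample means
$t^*$ = critical value from t-distribution
$SE(\bar{x}_1 - \bar{x}_2)$ = standard error of the difference

$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

$s_1, s_2$ = sample standard deviations
$n_1, n_2$ = sample sizes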
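
As a worked illustration with the turtle data above, the sketch below computes the standard error of the difference, $\sqrt{s_1^2/n_1 + s_2^2/n_2}$, and a 95% interval for the male-minus-female difference in mean mass. The degrees of freedom are an assumption: these notes do not state a rule, so the sketch uses the conservative choice min(n1, n2) - 1 = 24, for which a t-table gives a critical value of about 2.064. Statistical software typically uses a different (Welch) approximation and will report a slightly narrower interval. The function name is illustrative only.

```python
import math

def two_sample_t_interval(mean1, sd1, n1, mean2, sd2, n2, t_star):
    """Return (difference, standard error, lower, upper) for mean1 - mean2."""
    diff = mean1 - mean2
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    margin = t_star * se
    return diff, se, diff - margin, diff + margin

# Turtle data from the example: males first, so the difference is male - female.
# t_star = 2.064 is the 95% table value for df = 24 (assumed conservative df).
diff, se, lower, upper = two_sample_t_interval(
    1548, 285, 25, 1397, 240, 29, t_star=2.064
)
print(f"difference = {diff} g, SE = {se:.1f} g")
print(f"95% CI: ({lower:.1f} g, {upper:.1f} g)")
```

Under these assumptions the interval runs from roughly 2 g to 300 g. It lies entirely above zero, which suggests males are heavier on average, but the lower end is very close to no difference.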

Assumptions and Conditions for Two-Sample t-Interval

Independence Assumption: The two groups must be independent. No repeat measurements on the same individuals/objects/locations.
Randomization Condition: Data should come from a randomized experiment or SRS (Simple Random Sample).
10% Condition: Each sample should be less than 10% of the population.
Normality Assumption: The distribution of each group should be approximately normal. Mild skewness is acceptable, but watch for outliers or multiple modes.

Hypothesis Test for Two Independent Means

Steps in Hypothesis Testing

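State hypotheses:
- Null hypothesis: $H_0: \mu_1 - \mu_2 = 0$ (no difference)
- Alternative hypothesis: $H_A: \mu_1 - \mu_2 \neq 0$ (two-tailed), or $H_A: \mu_1 - \mu_2 > 0$ or $H_A: \mu_1 - \mu_2 < 0$ (one-tailed)
Calculate test statistic: $t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
Find p-value: The probability, assuming $H_0$ is true, of observing a test statistic as extreme as, or more extreme than, the observed value.
Draw conclusion: Compare the p-value to the significance level ($\alpha$), typically 0.05. If the p-value is below $\alpha$, reject $H_0$; otherwise, fail to reject $H_0$.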
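
The four steps can be traced on the turtle data with the minimal sketch below, which uses only the Python standard library. It is a sketch under stated assumptions, not a reference implementation: the degrees of freedom use the conservative choice min(n1, n2) - 1 (these notes do not specify a rule), and the p-value is approximated by numerically integrating the t density with Simpson's rule. The helper names `t_pdf` and `two_tailed_p_value` are made up for this illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p_value(t_stat, df, steps=2000):
    """Approximate P(|T| >= |t_stat|) with Simpson's rule on [0, |t_stat|]."""
    b = abs(t_stat)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 <= T <= |t_stat|)
    return 2 * (0.5 - central_area)

# Step 1: H0: mu_1 - mu_2 = 0 versus HA: mu_1 - mu_2 != 0 (male - female)
n_m, mean_m, sd_m = 25, 1548, 285
n_f, mean_f, sd_f = 29, 1397, 240

# Step 2: test statistic
se = math.sqrt(sd_m**2 / n_m + sd_f**2 / n_f)
t_stat = (mean_m - mean_f - 0) / se

# Step 3: two-tailed p-value (assumed conservative df)
df = min(n_m, n_f) - 1
p_value = two_tailed_p_value(t_stat, df)

# Step 4: compare with the significance level
alpha = 0.05
print(f"t = {t_stat:.2f}, df = {df}, p = {p_value:.3f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

With these assumptions the p-value comes out a little under 0.05, so the evidence for a difference is borderline rather than strong. If SciPy is installed, `scipy.stats.ttest_ind_from_stats` with `equal_var=False` runs the Welch version of this test from summary statistics and makes a useful cross-check.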

Paired Data and Paired t-Test

When to Use Paired t-Test

Use a paired t-test when data are collected in pairs, such as before-and-after measurements on the same subjects.

Examples: Weight before and after a diet, test scores before and after tutoring, measurements on the same turtles in two different years.

Paired t-Test Procedure

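Calculate the difference for each pair: $d_i = x_{1i} - x_{2i}$
Compute the mean and standard deviation of the differences: $\bar{d}$ and $s_d$
Construct a confidence interval for the mean difference: $\bar{d} \pm t^* \cdot \frac{s_d}{\sqrt{n}}$

$n$ = number of pairs; $t^*$ = critical value from the t-distribution with $n - 1$ degrees of freedom

Hypothesis test:

Null hypothesis: $H_0: \mu_d = 0$
Test statistic: $t = \frac{\bar{d} - 0}{s_d / \sqrt{n}}$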
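
The paired procedure can be sketched in a few lines of Python. The before-and-after weights below are hypothetical values invented for illustration, not data from these notes, and the critical value 2.776 is the 95% t-table entry for 4 degrees of freedom (5 pairs minus 1).

```python
import math
import statistics

# Hypothetical weights (kg) for five subjects before and after a diet.
before = [82.0, 75.5, 90.2, 68.4, 77.1]
after = [80.1, 74.0, 88.5, 68.9, 75.0]

# Step 1: one difference per pair (after - before)
diffs = [a - b for a, b in zip(after, before)]

# Step 2: mean and sample standard deviation of the differences
n = len(diffs)
d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)
se = s_d / math.sqrt(n)

# Step 3: 95% interval for the mean difference (df = n - 1 = 4)
t_star = 2.776
lower, upper = d_bar - t_star * se, d_bar + t_star * se

# Test statistic for H0: mu_d = 0
t_stat = (d_bar - 0) / se

print(f"mean difference = {d_bar:.2f} kg, s_d = {s_d:.2f} kg")
print(f"95% CI: ({lower:.2f}, {upper:.2f}), t = {t_stat:.2f}")
```

Note that the analysis runs on the single list of differences, not on the two original lists, which is what distinguishes it from the two-sample procedure.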

Assumptions for Paired t-Test

Pairs must be independent of each other
Differences should be approximately normally distributed
Data should come from a randomized experiment or SRS

Summary Table: Choosing the Right Test

Scenario	Test/Interval	Data Structure
Compare two independent means	Two-sample t-test / t-interval	Two separate groups
Compare two related means (paired)	Paired t-test / paired t-interval	Pairs of measurements on same subjects

Additional info:

Always check assumptions before performing inference.
Use histograms to visually assess normality and outliers.
Excel and statistical calculators can be used to compute test statistics and confidence intervals.
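
For the histogram check mentioned above, a quick text histogram is often enough to spot strong skew or an outlier before running a t-procedure. The sketch below uses hypothetical turtle masses invented for illustration, and `text_histogram` is a made-up helper that assumes integer data and an integer bin width.

```python
def text_histogram(values, bin_width):
    """Return {bin_left_edge: count} and printable rows for integer data."""
    start = min(values) // bin_width * bin_width
    counts = {}
    for v in values:
        left = start + (v - start) // bin_width * bin_width
        counts[left] = counts.get(left, 0) + 1
    rows = []
    for left in range(start, max(values) + 1, bin_width):
        stars = "*" * counts.get(left, 0)
        rows.append(f"{left:>5}-{left + bin_width - 1:<5} {stars}")
    return counts, rows

# Hypothetical masses in grams; the last value is a deliberate outlier.
masses = [1120, 1250, 1310, 1340, 1380, 1395, 1410,
          1420, 1450, 1490, 1530, 1610, 2150]

counts, rows = text_histogram(masses, 200)
print("\n".join(rows))
```

An empty bin followed by an isolated value, as with the 2150 g turtle here, is the kind of gap that signals a possible outlier worth investigating.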