Statistical Inference: Comparing Two Means (Independent and Paired Samples)
Introduction
Statistical inference allows us to draw conclusions about population parameters based on sample data. When comparing two means, we use different methods depending on whether the samples are independent or paired. This guide covers confidence intervals and hypothesis tests for two independent sample means, as well as paired sample analysis.
Key Concepts
Two independent sample means
Confidence interval for two independent sample means
Hypothesis test for two independent sample means
Paired data and paired t-tests
Conditions and assumptions for inference
Types of Data and Appropriate Tests
Choosing the Right Test
Before conducting statistical inference, identify the type of data and the research question. The following table summarizes the main options:
| Type of Data | Confidence Interval | Hypothesis Test (Yes/No Question) |
|---|---|---|
| Proportions/Percentages | One-proportion z-interval; two-proportion z-interval | One-proportion z-test; two-proportion z-test |
| Means/Averages | One-sample t-interval; two-sample t-interval | One-sample t-test; two-sample t-test; paired t-test |
Comparing Two Independent Means
Example: Turtle Mass
Suppose we want to know if adult male and female turtles have different average masses. Sample data:
Mean mass of 29 female turtles: 1397 g (SD = 240 g)
Mean mass of 25 male turtles: 1548 g (SD = 285 g)
Questions:
What can we conclude about the difference in mean mass between males and females?
Is this strong evidence that the average mass of males and females is different?
Confidence Interval for the Difference of Means
A confidence interval gives a range of plausible values for the true difference in population means, computed from sample data at a stated confidence level (for example, 95%).
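Formula: $(\bar{x}_1 - \bar{x}_2) \pm t^* \cdot SE(\bar{x}_1 - \bar{x}_2)$, where $SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
$\bar{x}_1, \bar{x}_2$ = sample means
$t^*$ = critical value from t-distribution
$SE(\bar{x}_1 - \bar{x}_2)$ = standard error of the difference
$s_1, s_2$ = sample standard deviations
$n_1, n_2$ = sample sizes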
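As a rough illustration, the interval can be computed for the turtle example in a few lines of Python. This is a sketch under stated assumptions, not the guide's own calculation: it uses the unpooled standard error $\sqrt{s_1^2/n_1 + s_2^2/n_2}$, takes the difference as male minus female, and uses the conservative degrees of freedom $\min(n_1, n_2) - 1 = 24$ with a 95% critical value of about 2.064 read from a t-table. The function names are hypothetical.

```python
import math

# Summary statistics from the turtle example.
n_f, mean_f, sd_f = 29, 1397.0, 240.0   # females
n_m, mean_m, sd_m = 25, 1548.0, 285.0   # males


def two_sample_se(s1, n1, s2, n2):
    """Unpooled standard error of the difference between two sample means."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximation to the degrees of freedom."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


diff = mean_m - mean_f                      # male minus female, in grams
se = two_sample_se(sd_m, n_m, sd_f, n_f)

# Assumption: conservative df = min(n1, n2) - 1 = 24; the 95% critical
# value for 24 degrees of freedom from a t-table is about 2.064.
T_STAR_95_DF24 = 2.064
margin = T_STAR_95_DF24 * se
ci_low, ci_high = diff - margin, diff + margin

print(f"difference = {diff:.1f} g, SE = {se:.2f} g")
print(f"95% CI (df = 24): ({ci_low:.1f}, {ci_high:.1f}) g")
print(f"Welch df for comparison: {welch_df(sd_m, n_m, sd_f, n_f):.1f}")
```

Statistical software typically uses the Welch degrees of freedom (about 47 for these data) rather than the conservative value, which gives a slightly smaller critical value and a slightly narrower interval.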
Assumptions and Conditions for Two-Sample t-Interval
Independence Assumption: The two groups must be independent. No repeat measurements on the same individuals/objects/locations.
Randomization Condition: Data should come from a randomized experiment or SRS (Simple Random Sample).
10% Condition: Each sample should be less than 10% of the population.
Normality Assumption: The distribution of each group should be approximately normal. Mild skewness is acceptable, but watch for outliers or multiple modes.
Hypothesis Test for Two Independent Means
Steps in Hypothesis Testing
State hypotheses:
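Null hypothesis: $H_0: \mu_1 - \mu_2 = 0$ (no difference)
Alternative hypothesis: $H_A: \mu_1 - \mu_2 \neq 0$ (two-tailed), or $H_A: \mu_1 - \mu_2 > 0$ or $H_A: \mu_1 - \mu_2 < 0$ (one-tailed)
Calculate test statistic: $t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{SE(\bar{x}_1 - \bar{x}_2)}$
Find p-value: The probability of observing a test statistic as extreme as, or more extreme than, the observed value under $H_0$.
Draw conclusion: Compare p-value to significance level ($\alpha$), typically 0.05. Reject $H_0$ if the p-value is below $\alpha$; otherwise, fail to reject $H_0$.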
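The four steps can be sketched for the turtle example as follows. Several choices here are assumptions rather than part of the guide: the difference is taken as male minus female, the standard error is unpooled, the degrees of freedom are the conservative $\min(n_1, n_2) - 1 = 24$, and the p-value is obtained by numerically integrating the t density with Simpson's rule (a stand-in for a t-table or statistical software). The helper names `t_pdf` and `two_sided_p` are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value via Simpson's rule (steps must be even)."""
    b = abs(t_stat)
    if b == 0:
        return 1.0
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3          # approximates P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central_half)


# Step 1: H0: mu_male - mu_female = 0, HA: mu_male - mu_female != 0.
n_f, mean_f, sd_f = 29, 1397.0, 240.0
n_m, mean_m, sd_m = 25, 1548.0, 285.0

# Step 2: test statistic with the unpooled standard error.
se = math.sqrt(sd_m ** 2 / n_m + sd_f ** 2 / n_f)
t_stat = (mean_m - mean_f - 0.0) / se

# Step 3: two-sided p-value, assuming conservative df = min(n1, n2) - 1.
df = min(n_f, n_m) - 1
p_value = two_sided_p(t_stat, df)

# Step 4: compare with the significance level.
alpha = 0.05
reject_null = p_value < alpha
print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.4f}, reject H0: {reject_null}")
```

Under these assumptions the p-value falls just below 0.05, so the decision is sensitive to the chosen significance level and to how the degrees of freedom are computed.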
Paired Data and Paired t-Test
When to Use Paired t-Test
Use a paired t-test when data are collected in pairs, such as before-and-after measurements on the same subjects.
Examples: Weight before and after a diet, test scores before and after tutoring, measurements on the same turtles in two different years.
Paired t-Test Procedure
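Calculate the difference for each pair: $d_i = x_{1i} - x_{2i}$
Compute the mean and standard deviation of the differences: $\bar{d}$ and $s_d$
Construct a confidence interval for the mean difference: $\bar{d} \pm t^*_{n-1} \cdot \frac{s_d}{\sqrt{n}}$
$n$ = number of pairs
Hypothesis test:
Null hypothesis: $H_0: \mu_d = 0$
Test statistic: $t = \frac{\bar{d} - 0}{s_d / \sqrt{n}}$, with $n - 1$ degrees of freedom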
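The paired procedure can be sketched in Python as below. The before-and-after weights are made-up numbers for six hypothetical subjects, included only to show the arithmetic; they do not come from the guide. The sketch also assumes differences are taken as after minus before and uses a 95% critical value of about 2.571 for $n - 1 = 5$ degrees of freedom, read from a t-table.

```python
import math
import statistics

# Hypothetical data (kg): weights of six subjects before and after a diet.
before = [82.0, 75.5, 90.2, 68.4, 77.1, 85.3]
after = [80.1, 74.0, 88.0, 68.9, 75.2, 83.0]

# Step 1: one difference per pair (assumed direction: after minus before).
diffs = [a - b for a, b in zip(after, before)]

# Step 2: mean and sample standard deviation of the differences.
n = len(diffs)
d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)       # uses the n - 1 denominator
se = s_d / math.sqrt(n)

# Step 3: 95% confidence interval for the mean difference.
# Assumption: t* for 5 degrees of freedom from a t-table is about 2.571.
T_STAR_95_DF5 = 2.571
ci_low = d_bar - T_STAR_95_DF5 * se
ci_high = d_bar + T_STAR_95_DF5 * se

# Step 4: test statistic for H0: mu_d = 0.
t_stat = (d_bar - 0.0) / se

print(f"mean difference = {d_bar:.2f} kg, s_d = {s_d:.3f} kg")
print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f}) kg, t = {t_stat:.2f}, df = {n - 1}")
```

Because the analysis works on the single list of differences, it reduces to a one-sample t procedure; the two original columns are never treated as independent groups.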
Assumptions for Paired t-Test
Pairs must be independent of each other
Differences should be approximately normally distributed
Data should come from a randomized experiment or SRS
Summary Table: Choosing the Right Test
| Scenario | Test/Interval | Data Structure |
|---|---|---|
| Compare two independent means | Two-sample t-test / t-interval | Two separate groups |
| Compare two related means (paired) | Paired t-test / paired t-interval | Pairs of measurements on same subjects |
Additional info:
Always check assumptions before performing inference.
Use histograms to visually assess normality and outliers.
Excel and statistical calculators can be used to compute test statistics and confidence intervals.