
Regression and One-way ANOVA: Concepts, Interpretation, and Application

Study Guide - Smart Notes

Regression Analysis and Interpretation

Simple Linear Regression

Simple linear regression models the relationship between a quantitative response variable and a single predictor variable. The goal is to estimate how changes in the predictor are associated with changes in the response.

  • Key Terms: Predictor (independent variable), Response (dependent variable), Residuals (differences between observed and predicted values).

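  • Model Equation: $y = \beta_0 + \beta_1 x + \varepsilon$, where $\beta_0$ is the intercept, $\beta_1$ is the slope, and $\varepsilon$ is the random error.

  • Interpretation: The coefficient $\beta_1$ represents the expected change in $y$ for a one-unit increase in $x$.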

  • Example: Regression of glia-neuron ratio on log(brain mass) for primates.
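
To make the slope interpretation concrete, here is a minimal Python sketch of the least-squares estimates of the intercept and slope. The data values and the helper name `fit_line` are made up for illustration and are not taken from the primate data set.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line for paired data."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope = sum of cross-deviations divided by sum of squared x-deviations.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    # The fitted line always passes through the point (x_bar, y_bar).
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data, chosen only so the arithmetic is easy to follow.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = fit_line(x, y)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```

In this toy data set the slope is close to 2, so each one-unit increase in the predictor is associated with an expected increase of about 2 units in the response.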

Assessing Model Fit

Model fit is evaluated using residual plots and summary statistics such as $R^2$.

  • Residual Plots: Used to check assumptions of normality, constant variance, and independence.

  • Normal Probability Plot: Assesses whether residuals are approximately normally distributed.

  • Histogram of Residuals: Visualizes the distribution of residuals.

  • Residuals vs Fitted Values: Checks for patterns indicating non-constant variance.

  • Residuals vs Order: Checks for independence of residuals.

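  • Coefficient of Determination ($R^2$): Proportion of variance in the response explained by the predictor.

  • Formula: $R^2 = 1 - \frac{SSE}{SST}$, where $SSE$ is the residual sum of squares and $SST$ is the total sum of squares.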
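
The residuals and the coefficient of determination can be computed directly from observed and fitted values. The sketch below assumes a hypothetical fitted line (intercept 0.05, slope 1.99) and made-up data. The function name `r_squared` is illustrative only.

```python
def r_squared(observed, fitted):
    """Return (R^2, residuals) given observed responses and fitted values."""
    y_bar = sum(observed) / len(observed)
    # Residual = observed value minus the value predicted by the model.
    residuals = [y - f for y, f in zip(observed, fitted)]
    sse = sum(r ** 2 for r in residuals)             # residual variation
    sst = sum((y - y_bar) ** 2 for y in observed)    # total variation
    return 1.0 - sse / sst, residuals


# Hypothetical data and a hypothetical least-squares line y = 0.05 + 1.99 x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fitted = [0.05 + 1.99 * xi for xi in x]
r2, residuals = r_squared(y, fitted)
print(f"R^2 = {r2:.4f}")
print("residuals:", [round(r, 2) for r in residuals])
```

The list of residuals is what the residual plots above would display, plotted against fitted values, observation order, or normal quantiles.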

Confidence Intervals in Regression

Confidence intervals quantify uncertainty in estimated regression parameters or predicted values.

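  • Mean Response CI: Interval for the mean predicted value at a given $x$.

  • Individual Response CI (prediction interval): Interval for a single new observation at a given $x$, accounting for both uncertainty in the fitted line and residual variance, so it is wider than the mean response CI.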

  • Example: Calculating a 95% CI for the predicted glia-neuron ratio for human brain size.
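
For reference, the two intervals are usually written as below in introductory textbooks. This is a sketch under the usual simple linear regression assumptions. The symbols $n$ (sample size), $x_0$ (the predictor value of interest), $b_0$ and $b_1$ (estimated intercept and slope), $s$ (residual standard error), $S_{xx}$, and $t^{*}_{n-2}$ (the critical value from a t-distribution with $n - 2$ degrees of freedom) are notation introduced here, not taken from the original notes.

```latex
\begin{align*}
\hat{y}_0 &= b_0 + b_1 x_0, \qquad
s = \sqrt{\frac{SSE}{n-2}}, \qquad
S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 \\[4pt]
\text{Mean response CI:} \quad
&\hat{y}_0 \pm t^{*}_{n-2}\, s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}} \\[4pt]
\text{Individual response (prediction) interval:} \quad
&\hat{y}_0 \pm t^{*}_{n-2}\, s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}
\end{align*}
```

The extra 1 under the square root is the residual variance of a single observation, which is why the individual response interval is always the wider of the two.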

From Regression to ANOVA

Conceptual Link

Regression and ANOVA are both methods for explaining variation in a response variable. Regression uses continuous predictors, while ANOVA compares means across categorical groups.

  • Mean-only Model: Assumes all observations have the same mean.

  • Regression Model: Models mean as a function of predictor.

  • ANOVA Model: Models mean as a function of group membership.

  • Sum of Squares (SS): Quantifies total, explained, and residual variation.

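  • Formulas: $SST = \sum_i (y_i - \bar{y})^2$ (total), $SSE = \sum_i (y_i - \hat{y}_i)^2$ (residual), and $SST = SS_{\text{explained}} + SSE$, where $\hat{y}_i$ is the fitted value (the regression prediction, or the group mean in ANOVA).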

One-way ANOVA

Purpose and Hypotheses

One-way ANOVA tests whether the means of three or more groups are equal.

  • Null Hypothesis ($H_0$): All group means are equal ($\mu_1 = \mu_2 = \dots = \mu_g$).

  • Alternative Hypothesis ($H_A$): At least one group mean is different.

ANOVA Table and Calculations

The ANOVA table summarizes sources of variation, degrees of freedom, sum of squares, mean squares, and the F-statistic.

| Source | DF | SS | MS | F |
| --- | --- | --- | --- | --- |
| Treatment (Between) | g - 1 | SSG | MSG = SSG/(g-1) | MSG/MSE |
| Error (Within) | N - g | SSE | MSE = SSE/(N-g) | |
| Total | N - 1 | SST | | |

Formula for F-statistic: $F = \frac{MSG}{MSE} = \frac{SSG/(g-1)}{SSE/(N-g)}$
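
The entries of the ANOVA table can be computed from raw group data in a few lines. The following Python sketch uses made-up groups. The function name `one_way_anova` and the dictionary keys are illustrative choices, not part of the original notes.

```python
def one_way_anova(groups):
    """Compute one-way ANOVA table entries from a list of groups of numbers."""
    all_values = [v for grp in groups for v in grp]
    n_total = len(all_values)          # N: total number of observations
    num_groups = len(groups)           # g: number of groups
    grand_mean = sum(all_values) / n_total
    group_means = [sum(grp) / len(grp) for grp in groups]

    # Between-group variation: group means around the grand mean.
    ssg = sum(len(grp) * (m - grand_mean) ** 2
              for grp, m in zip(groups, group_means))
    # Within-group variation: observations around their own group mean.
    sse = sum((v - m) ** 2
              for grp, m in zip(groups, group_means) for v in grp)
    sst = sum((v - grand_mean) ** 2 for v in all_values)

    msg = ssg / (num_groups - 1)
    mse = sse / (n_total - num_groups)
    return {
        "df_between": num_groups - 1, "df_within": n_total - num_groups,
        "SSG": ssg, "SSE": sse, "SST": sst,
        "MSG": msg, "MSE": mse, "F": msg / mse,
    }


# Hypothetical data: three groups of three observations each.
groups = [[4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [1.0, 2.0, 3.0]]
table = one_way_anova(groups)
for name, value in table.items():
    print(f"{name}: {value}")
```

For these toy groups SSG and SSE add up to SST, which is the same decomposition of total variation used in regression.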

Effect Size in ANOVA

Effect size quantifies the proportion of variance explained by group differences.

| Type of Effect | One-way ANOVA | Simple Linear Regression |
| --- | --- | --- |
| Small | 0.01 | 0.01 |
| Medium | 0.09 | 0.09 |
| Large | 0.25 | 0.25 |

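Formula: $\eta^2 = \frac{SSG}{SST}$ for one-way ANOVA, which plays the same role as $R^2$ in simple linear regression.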

Assumptions of ANOVA

Standard ANOVA requires several assumptions for valid inference.

| Assumption | Complications & Remedies |
| --- | --- |
| Nearly normal distribution within groups | Skewed: use transformation |
| Equal variance between groups | Unequal: use transformation |
| No outliers | Many outliers: use non-parametric test (Kruskal-Wallis) |
| Same sample size in groups | Unbalanced: loss of robustness |
| Independent samples | Dependent: use repeated measures ANOVA |

F-Distribution

The F-distribution is used to determine the significance of the ANOVA F-statistic.

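  • Always non-negative and right-skewed; shape depends on numerator and denominator degrees of freedom.

  • Formula: Under $H_0$, $F = \frac{MSG}{MSE}$ follows an F-distribution with $g - 1$ numerator and $N - g$ denominator degrees of freedom.

  • p-value: Probability of observing an F as large or larger under $H_0$.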
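
As a rough illustration of the p-value as an upper-tail probability: when the numerator degrees of freedom equal 2 (three groups), the upper tail of the F-distribution has the closed form $P(F \ge f) = (1 + 2f/d_2)^{-d_2/2}$, where $d_2$ is the denominator degrees of freedom. The Python sketch below applies it to the cone-size example reported later in this guide (F = 50.09 on 2 and 13 degrees of freedom). The function name is hypothetical, and for any other numerator degrees of freedom a statistics library would be needed instead.

```python
def f_upper_tail_df1_2(f_stat, df_denominator):
    """Upper-tail probability P(F >= f_stat) for an F-distribution with
    2 numerator degrees of freedom.

    Assumption: this closed form is valid ONLY when the numerator df is 2.
    """
    if f_stat < 0:
        raise ValueError("an F statistic cannot be negative")
    return (1.0 + 2.0 * f_stat / df_denominator) ** (-df_denominator / 2.0)


# F statistic and degrees of freedom from the cone-size ANOVA table.
p_value = f_upper_tail_df1_2(50.09, 13)
print(f"p = {p_value:.2e}")
```

The resulting probability is far below 0.001, which agrees with the p-value shown as 0.000 in the cone-size table.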

Non-parametric Alternative: Kruskal-Wallis Test

If ANOVA assumptions are violated, the Kruskal-Wallis test can be used to compare medians across groups.

  • Null Hypothesis: All group medians are equal.

  • Test Statistic: Based on ranks rather than means.
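
To show what "based on ranks" means, here is a minimal Python sketch of the usual Kruskal-Wallis statistic, $H = \frac{12}{N(N+1)} \sum_j \frac{R_j^2}{n_j} - 3(N+1)$, where $R_j$ is the sum of ranks in group $j$ and $n_j$ is its size. The data are made up, the function name is hypothetical, and the sketch omits the correction for ties that statistical software normally applies.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic, without the tie correction."""
    # Pool all observations, remembering which group each one came from.
    pooled = sorted((value, idx)
                    for idx, grp in enumerate(groups) for value in grp)
    n_total = len(pooled)

    # Rank the pooled values 1..N; tied values share the average rank.
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n_total:
        j = i
        while j + 1 < n_total and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of ranks i+1 through j+1
        for k in range(i, j + 1):
            rank_sums[pooled[k][1]] += avg_rank
        i = j + 1

    weighted = sum(r ** 2 / len(grp) for r, grp in zip(rank_sums, groups))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)


# Hypothetical data: three completely separated groups.
groups = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
h = kruskal_wallis_h(groups)
print(f"H = {h:.2f}")
```

For moderately large samples, H is typically compared with a chi-square distribution with $g - 1$ degrees of freedom to obtain a p-value.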

Post-hoc Pairwise Comparisons

After a significant ANOVA, post-hoc tests identify which group means differ.

  • Confidence Intervals: If the CI for the difference between two group means does not include zero, those two groups differ significantly.

  • Adjusted p-values: Control for multiple comparisons.
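
One common way to adjust p-values is the Bonferroni correction, which multiplies each raw p-value by the number of comparisons and caps the result at 1. The notes do not say which adjustment was used, so the Python sketch below is only an illustration of the idea. The p-values and comparison labels are made up.

```python
def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values from three pairwise comparisons.
raw = {"A vs B": 0.004, "A vs C": 0.030, "B vs C": 0.400}
adjusted = dict(zip(raw, bonferroni_adjust(list(raw.values()))))
for pair, p_adj in adjusted.items():
    verdict = "differ" if p_adj < 0.05 else "no clear difference"
    print(f"{pair}: adjusted p = {p_adj:.3f} ({verdict})")
```

In this made-up example, the A vs C comparison would look significant before adjustment (0.030) but not after it, which is the kind of false positive the adjustment is meant to guard against.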

Reporting and Interpreting Results

Three Things to Consider and Report

  • Direction of Effect: Which means are higher or lower?

  • Size of Effect: Proportion of variance explained ($R^2$ or $\eta^2$), or standardized effect size.

  • Statistical Significance: Is the difference unlikely to be due to chance alone? (Overall F-test, pairwise comparisons)

Examples and Applications

Regression Example: Glia-Neuron Ratio

  • Regression Model: glia-neuron ratio $= \beta_0 + \beta_1 \log(\text{brain mass}) + \varepsilon$

  • Interpretation: Log-transformed brain mass explains a significant portion of the variation in glia-neuron ratio among primates.

  • Confidence Interval: Used to assess whether the human brain is unusual compared to other primates.

ANOVA Example: Mean Cone Size vs Environment

  • One-way ANOVA Table:

| Source | DF | SS | MS | F | p-value |
| --- | --- | --- | --- | --- | --- |
| Environment | 2 | 29.464 | 14.732 | 50.09 | 0.000 |
| Error | 13 | 3.816 | 0.294 | | |
| Total | 15 | 33.280 | | | |

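  • Interpretation: There is a statistically significant difference in mean cone size among environments (F = 50.09 on 2 and 13 degrees of freedom, p < 0.001).

  • Effect Size: $\eta^2 = SSG/SST = 29.464/33.280 \approx 0.885$ (about 88.5% of variance explained by environment)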

ANOVA Assumptions & Remedies

| Assumption | Complication | Remedy |
| --- | --- | --- |
| Normality within groups | Skewed distributions | Transformation |
| Equal variance | Unequal variance | Transformation |
| No outliers | Many outliers | Non-parametric test |
| Same sample size | Unbalanced design | Loss of robustness |
| Independent samples | Dependent samples | Repeated measures ANOVA |

Summary

  • Regression and ANOVA are foundational tools for analyzing quantitative data.

  • Both methods rely on assumptions that must be checked using residual plots and summary statistics.

  • Effect size and statistical significance are key for interpreting results.

  • Non-parametric alternatives are available when assumptions are violated.

Additional info: Some context and definitions were expanded for clarity and completeness. Tables were reconstructed and formulas provided in LaTeX format as per instructions.
