Residuals and Residual Plots in Regression Analysis

Study Guide - Smart Notes


Residual Analysis

Residuals and Residual Plots

Residual analysis is a key step in regression modeling, used to check whether a linear regression model suits the data. A residual is the signed vertical distance between an observed data point and the regression line: positive when the point lies above the line, negative when it lies below. A residual plot displays these distances and helps determine whether the linear model is appropriate for the data.

  • Residual: The observed value minus the predicted value from the regression line.

  • Residual Plot: A scatterplot of residuals on the vertical axis and the independent variable (or predicted values) on the horizontal axis.

Formula for Residual:

$$\text{Residual} = y - \hat{y}$$

  • $y$ = observed value

  • $\hat{y}$ = predicted value from the regression line
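The sign convention in the formula can be checked with a short Python sketch. The function name `residual` and the numbers passed to it are illustrative assumptions, not values from these notes.

```python
def residual(observed, predicted):
    """Residual = observed value minus predicted value."""
    return observed - predicted

# Hypothetical values, chosen only to show the sign of the result.
above_line = residual(260, 250)   # observed is larger than predicted
below_line = residual(260, 275)   # observed is smaller than predicted

print(above_line)  # 10  -> the point lies above the regression line
print(below_line)  # -15 -> the point lies below the regression line
```

A positive residual means the model under-predicted that observation; a negative residual means it over-predicted.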

Example: Ice Cream Sales vs. High Temperature

Consider the following data set, where ice cream sales (in dollars) are recorded against daily high temperature (in °F):

High Temp (°F)    Sales ($)
62                180
65                200
68                220
70                260
72                300
75                340
78                360

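The least-squares regression equation for this data, computed from the table and rounded to two decimal places, is approximately:

$$\hat{y} = 12.37x - 599.88$$

where $x$ is the high temperature in °F and $\hat{y}$ is the predicted sales in dollars.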

To calculate residuals, substitute each $x$ value into the regression equation to get $\hat{y}$, then subtract $\hat{y}$ from the observed $y$.
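The calculation can be sketched in Python using the seven data points from the table. This is a minimal sketch that assumes the regression line is the ordinary least-squares fit; the helper name `fit_line` is hypothetical, and the fitted coefficients may differ slightly from a textbook equation that was rounded differently.

```python
# Data from the ice cream example: high temperature (°F) and sales ($).
temps = [62, 65, 68, 70, 72, 75, 78]
sales = [180, 200, 220, 260, 300, 340, 360]

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

slope, intercept = fit_line(temps, sales)

# Predicted value for each temperature, then residual = observed - predicted.
predicted = [intercept + slope * x for x in temps]
residuals = [y - y_hat for y, y_hat in zip(sales, predicted)]

print(f"y_hat = {slope:.2f}x + ({intercept:.2f})")  # y_hat = 12.37x + (-599.88)
for x, y, y_hat, e in zip(temps, sales, predicted, residuals):
    print(f"x={x}  observed={y}  predicted={y_hat:.2f}  residual={e:+.2f}")
```

For a least-squares line that includes an intercept, the residuals always sum to zero (up to rounding), which is a quick way to check the arithmetic.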

Interpreting Residual Plots

  • If residuals are randomly scattered around zero (no pattern), a linear model is appropriate for the data.

  • If residuals show a pattern, the linear model is in doubt: a curve suggests the relationship is not linear, and increasing or decreasing spread suggests non-constant variance.
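To see what "scatter" and "pattern" look like without graphing software, the sketch below draws a rough text version of a residual plot. It is turned a quarter-turn from the usual layout: each row is one data point, `|` marks zero, and `*` marks the residual, so residuals run left to right instead of bottom to top. The function name `text_residual_plot` and both residual lists are made up for illustration.

```python
def text_residual_plot(xs, residuals, half_width=20):
    """Return one text row per point; '|' marks zero and '*' marks the residual."""
    # Scale so the largest residual reaches the edge; fall back to 1.0 if all are zero.
    scale = max(abs(e) for e in residuals) or 1.0
    rows = []
    for x, e in zip(xs, residuals):
        offset = round(e / scale * half_width)
        cells = [" "] * (2 * half_width + 1)
        cells[half_width] = "|"
        cells[half_width + offset] = "*"   # a zero residual overwrites the '|'
        rows.append(f"{x:>6} {''.join(cells)}")
    return rows

xs = [1, 2, 3, 4, 5, 6, 7]
scattered = [2, -1, -3, 1, 3, -2, 0]   # hypothetical: no visible pattern
u_shaped = [5, 0, -3, -4, -3, 0, 5]    # hypothetical: curved pattern

print("Random scatter:")
print("\n".join(text_residual_plot(xs, scattered)))
print("Curved pattern:")
print("\n".join(text_residual_plot(xs, u_shaped)))
```

In the first plot the stars jump from side to side of the zero line with no order; in the second they trace a clear arc, which is the kind of pattern that calls the linear model into question.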

Examples of Residual Plots

  • Random scatter: Indicates appropriateness of linear regression.

  • Patterned residuals (e.g., U-shape, increasing/decreasing spread): Suggest non-linearity or heteroscedasticity (non-constant variance); linear regression may not be suitable.

Practice: Identifying Appropriate Models

Given several residual plots, the one with points randomly scattered around zero (no visible pattern) suggests that a linear regression model is appropriate.

Summary Table: Residual Plot Interpretation

Residual Plot Pattern              Model Appropriateness
Random scatter                     Linear model is appropriate
Curved pattern                     Linear model is not appropriate
Increasing or decreasing spread    Indicates non-constant variance; linear model may not be appropriate
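The patterns in the table are normally judged by eye, but a rough numeric screen can support the visual check. The sketch below is an informal heuristic, not a standard statistical test: the function names `pearson_r` and `screen_residuals`, and the residual lists, are assumptions made for illustration. It reports two correlations. One compares the residuals with the squared distance of $x$ from its mean (large in magnitude for a U-shape or inverted U centered near the mean). The other compares the size of the residuals with $x$ (large in magnitude when the spread grows or shrinks).

```python
from math import sqrt

def pearson_r(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(a)
    a_mean, b_mean = sum(a) / n, sum(b) / n
    saa = sum((u - a_mean) ** 2 for u in a)
    sbb = sum((v - b_mean) ** 2 for v in b)
    if saa == 0 or sbb == 0:
        return 0.0
    sab = sum((u - a_mean) * (v - b_mean) for u, v in zip(a, b))
    return sab / sqrt(saa * sbb)

def screen_residuals(xs, residuals):
    """Return (curve_signal, spread_signal), each between -1 and 1."""
    x_mean = sum(xs) / len(xs)
    squared_distance = [(x - x_mean) ** 2 for x in xs]
    # Near +1 or -1: residuals bend in a U or inverted-U shape.
    curve_signal = pearson_r(squared_distance, residuals)
    # Near +1 or -1: residual size grows or shrinks as x increases.
    spread_signal = pearson_r(xs, [abs(e) for e in residuals])
    return curve_signal, spread_signal

# Made-up residuals for illustration only.
u_shaped = screen_residuals([1, 2, 3, 4, 5, 6, 7], [5, 0, -3, -4, -3, 0, 5])
funnel = screen_residuals([1, 2, 3, 4, 5, 6], [1, -1, 2, -2, 3, -3])

print("U-shaped (curve, spread):", u_shaped)
print("Funnel   (curve, spread):", funnel)
```

Values near zero for both signals are consistent with random scatter, but with only a handful of points these numbers are unstable, so the plot itself remains the primary diagnostic.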

Key Points

  • Residuals help diagnose the fit of a regression model.

  • Random residuals support the use of a linear model.

  • Patterns in residuals suggest the need for a different model or transformation.

Example: If a residual plot for ice cream sales vs. temperature shows random scatter, the linear model is appropriate. If the plot shows a curve, a nonlinear model may be needed.
