Chapter 10: Correlation and Regression – Study Notes

Correlation and Regression

Scatterplots & Correlation

Scatterplots are graphical representations of paired numerical data, where one variable is considered independent (x) and the other dependent (y). They are used to visually assess the relationship between two variables.

  • Linear Correlation: If the data points form a straight-line pattern, the variables are said to have a linear correlation.

  • Types of Correlation:

    • Positive Correlation: As x increases, y increases.

    • Negative Correlation: As x increases, y decreases.

    • No Correlation: No discernible pattern between x and y.

  • Correlation vs. Causation: Correlation does not imply causation; two variables may be related without one causing the other.

Example: Test scores vs. time spent studying often show a positive correlation, while test scores vs. number of siblings may show no correlation.

Creating Scatterplots with a Calculator

To create scatterplots using a graphing calculator (e.g., TI-84):

  • Enter data into lists (L1 for x-values, L2 for y-values).

  • Turn on STATPLOT and select the scatterplot option.

  • Adjust window settings to fit the data range.

Correlation Coefficient

Definition and Interpretation

The correlation coefficient (r) measures the strength and direction of the linear relationship between two variables.

  • Range: $-1 \le r \le 1$

  • Interpretation:

    • r close to 1: Strong positive linear correlation

    • r close to -1: Strong negative linear correlation

    • r close to 0: Weak or no linear correlation

  • The sign of r matches the direction (slope) of the trend.

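Formula: $r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$, where n is the number of data pairs.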

Example: If r = 0.96, there is a strong positive correlation; if r = -0.92, there is a strong negative correlation.
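
As a rough illustration of how r can be computed by hand, the following Python sketch evaluates the standard sums-based formula for Pearson's r. The function name `correlation_coefficient` and the sample data are made up for this example and do not come from the notes.

```python
import math


def correlation_coefficient(xs, ys):
    """Pearson's r computed from the sums of x, y, xy, x^2 and y^2."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("r is undefined when x or y has no variation")
    return numerator / denominator


# Hypothetical data: hours studied (x) and quiz score (y).
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = correlation_coefficient(hours, scores)
print(round(r, 4))  # 0.7746
```

Here r is about 0.77, which points to a moderately strong positive linear correlation in this small sample.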

Finding the Correlation Coefficient with a Calculator

  • Enter data in L1 and L2.

  • Use the LinReg(ax+b) function in the CALC menu.

  • Read the value of r from the output. If r is not displayed, turn on Stat Diagnostics (DiagnosticOn) and run the command again.

Hypothesis Test for Correlation Coefficient

Testing the Significance of Correlation

A hypothesis test determines whether the sample correlation r provides enough evidence that the population correlation coefficient (ρ) differs from zero.

  • Null Hypothesis (H0): $\rho = 0$ (no linear correlation in the population)

  • Alternative Hypothesis (Ha): $\rho \ne 0$, $\rho < 0$, or $\rho > 0$ (depending on the research question)

  • Use the LinRegTTest function on a calculator to obtain the test statistic and p-value.

  • If p-value < significance level (α), reject H0.
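
For reference, the test statistic behind this test can be written out by hand. The block below is a sketch under the usual assumptions for this t-test (a random sample of paired data with approximately normal errors); the sample size n = 10 in the worked line is hypothetical, chosen only to make the arithmetic concrete.

```latex
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}, \qquad \text{df} = n - 2

% Worked example with r = 0.96 and an assumed n = 10:
t = \frac{0.96\sqrt{8}}{\sqrt{1 - 0.9216}} = \frac{2.7153}{0.28} \approx 9.70, \qquad \text{df} = 8
```

A t-value this large corresponds to a very small p-value, so H0 would be rejected at any common significance level.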

Linear Regression Using the Least Squares Method

Least Squares Regression Line

The least squares regression line is the line that minimizes the sum of the squared vertical distances (residuals) between the observed values and the line.

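  • Equation: $\hat{y} = ax + b$, where a is the slope and b is the y-intercept

  • Residual: $y - \hat{y}$ (observed value minus predicted value)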

How to Find:

  • Enter data in L1 (x) and L2 (y).

  • Use LinReg(ax+b) in the CALC menu.

  • Write down the slope (a) and intercept (b).
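
The slope and intercept that the calculator reports can also be reproduced directly. This is a minimal Python sketch of the usual least squares formulas, slope = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and intercept = ȳ − slope · x̄; the function name `least_squares_line` and the data are illustrative assumptions.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y-hat = a*x + b."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("the slope is undefined when all x-values are equal")
    a = sxy / sxx              # slope
    b = mean_y - a * mean_x    # intercept: the line passes through (x-bar, y-bar)
    return a, b


# Hypothetical data, playing the role of lists L1 (x) and L2 (y).
L1 = [1, 2, 3, 4, 5]
L2 = [2, 4, 5, 4, 5]
a, b = least_squares_line(L1, L2)
print(round(a, 4), round(b, 4))  # 0.6 2.2
```

For this data the fitted line is ŷ = 0.6x + 2.2.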

Predicting Values with the Regression Line

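  • If the correlation is strong (statistically significant) and the x-value is within the range of the data, substitute x into the regression equation to predict y.

  • If the correlation is weak (not significant) or x is outside the data range, the line is not a reliable predictor, so use the mean of y, $\bar{y}$, as the best estimate.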

Residuals Analysis

Residual Plots

Residuals are used to assess the fit of a regression model. A residual plot displays the residuals on the vertical axis and the independent variable on the horizontal axis.

  • If the residuals are randomly scattered around zero with no visible pattern, the linear model is appropriate.

  • If the residuals show a pattern (for example, a curve or a widening spread), the linear model may not be a good fit.

Formula: residual $= y - \hat{y}$ (observed y minus predicted y)
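
To make the idea concrete, here is a short Python sketch that computes each residual as the observed y minus the value predicted by a line. The slope 0.6 and intercept 2.2 are assumed to be the least squares fit for this made-up data set; the function name `residuals` is likewise only illustrative.

```python
def residuals(xs, ys, slope, intercept):
    """Residual for each point: observed y minus predicted y."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Hypothetical data and its assumed least squares line y-hat = 0.6x + 2.2.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
res = residuals(xs, ys, slope=0.6, intercept=2.2)

# Each (x, residual) pair is one point on the residual plot.
for x, e in zip(xs, res):
    print(x, round(e, 2))
```

Plotting these pairs with x on the horizontal axis gives the residual plot. For a least squares line the residuals always sum to zero (up to rounding), so what matters is whether they show a pattern.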

Variation and the Coefficient of Determination

Coefficient of Determination ($R^2$)

The coefficient of determination, $R^2$, measures the proportion of the variance in the dependent variable that is predictable from the independent variable.

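  • Formula: $R^2 = \dfrac{\text{explained variation}}{\text{total variation}} = \dfrac{\sum(\hat{y} - \bar{y})^2}{\sum(y - \bar{y})^2}$; for a linear regression with one independent variable, this equals $r^2$.

  • $R^2$ close to 1: Most variation is explained by the model.

  • $R^2$ close to 0: Little variation is explained by the model.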
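
The proportion of explained variation can be checked numerically. The Python sketch below fits a least squares line and then computes the coefficient of determination as 1 − (unexplained variation / total variation), which for a least squares line is the same as explained variation divided by total variation. The function name `r_squared` and the data are assumptions made for illustration.

```python
def r_squared(xs, ys):
    """Coefficient of determination for the least squares line through the data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    total = sum((y - mean_y) ** 2 for y in ys)          # total variation
    if sxx == 0 or total == 0:
        raise ValueError("x and y must each take at least two distinct values")
    a = sxy / sxx
    b = mean_y - a * mean_x
    unexplained = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    return 1 - unexplained / total


# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(round(r_squared(xs, ys), 4))  # 0.6
```

A value of 0.6 would be read as: about 60% of the variation in y is explained by the linear relationship with x.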

Inferences for Slope of Regression Line

Hypothesis Test for Slope

To test if the slope of the regression line is significantly different from zero:

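  • Null Hypothesis (H0): $\beta_1 = 0$, where $\beta_1$ is the slope of the population regression line

  • Alternative Hypothesis (Ha): $\beta_1 \ne 0$, $\beta_1 < 0$, or $\beta_1 > 0$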

  • Use LinRegTTest on a calculator to obtain the test statistic and p-value.
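
For readers who want to see what sits behind the calculator output, the test statistic for the slope can be sketched as follows. Here a is the sample slope from the fitted line ŷ = ax + b, and the symbols $s_e$ (standard error of estimate) and $s_a$ (standard error of the slope) are notation assumed for this illustration.

```latex
s_e = \sqrt{\frac{\sum (y - \hat{y})^{2}}{n - 2}}, \qquad
s_a = \frac{s_e}{\sqrt{\sum (x - \bar{x})^{2}}}, \qquad
t = \frac{a}{s_a}, \qquad \text{df} = n - 2
```

Note that the TI-84's LinRegTTest output is labeled y=a+bx, so on that screen the slope is reported as b rather than a.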

Confidence Interval for Slope

A confidence interval for the slope provides a range of plausible values for the population slope.

  • Use LinRegTInt on a calculator to compute the interval.

  • If the interval does not include 0, there is evidence of a linear relationship.
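
As a sketch of the underlying calculation, the interval is the sample slope plus or minus a margin of error equal to a critical t-value times the standard error of the slope. In the Python example below, the function name `slope_confidence_interval` and the data are hypothetical, and the critical value is passed in by the caller (3.182 is the t-table value for 95% confidence with n − 2 = 3 degrees of freedom).

```python
import math


def slope_confidence_interval(xs, ys, t_crit):
    """Confidence interval for the slope; t_crit is the critical t-value for df = n - 2."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least 3 data pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("the slope is undefined when all x-values are equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx                      # sample slope
    b = mean_y - a * mean_x            # sample intercept
    sse = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(sse / (n - 2))     # standard error of estimate
    se_slope = s_e / math.sqrt(sxx)    # standard error of the slope
    margin = t_crit * se_slope
    return a - margin, a + margin


# Hypothetical data with n = 5, so df = 3.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
low, high = slope_confidence_interval(xs, ys, t_crit=3.182)
print(round(low, 3), round(high, 3))  # -0.3 1.5
```

Because this interval contains 0, the small sample does not give convincing evidence of a linear relationship at the 95% confidence level, even though the sample slope is positive.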

Prediction Intervals

Definition and Calculation

A prediction interval estimates a range in which a single new observation is likely to fall, given a specified value of x.

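  • Formula for Margin of Error (E): $E = t_{\alpha/2}\, s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{n(x_0 - \bar{x})^2}{n\sum x^2 - (\sum x)^2}}$, where $x_0$ is the given value of x, $s_e$ is the standard error of estimate, and $t_{\alpha/2}$ has n − 2 degrees of freedom

  • Prediction interval: $\hat{y} - E < y < \hat{y} + E$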
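
The calculation can be sketched in Python as follows. This is an illustration under assumptions: the function name `prediction_interval` and the data are made up, the critical t-value is supplied by the caller (3.182 corresponds to 95% confidence with 3 degrees of freedom), and the term (x0 − x̄)² / Σ(x − x̄)² is used as an algebraically equivalent form of n(x0 − x̄)² / (nΣx² − (Σx)²).

```python
import math


def prediction_interval(xs, ys, x0, t_crit):
    """Prediction interval for a single new y at x = x0 (t_crit has df = n - 2)."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least 3 data pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("the line is undefined when all x-values are equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    sse = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(sse / (n - 2))          # standard error of estimate
    y_hat = a * x0 + b                      # point prediction at x0
    # Margin of error E; the last term grows as x0 moves away from x-bar.
    margin = t_crit * s_e * math.sqrt(1 + 1 / n + (x0 - mean_x) ** 2 / sxx)
    return y_hat - margin, y_hat + margin


# Hypothetical data with n = 5, so df = 3.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
low, high = prediction_interval(xs, ys, x0=3, t_crit=3.182)
print(round(low, 2), round(high, 2))  # 0.88 7.12
```

The interval is wide here because the sample is tiny; with more data, or a smaller standard error of estimate, it narrows.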

Quadratic Regression

Quadratic Regression Model

When data shows a curved (nonlinear) pattern, a quadratic regression model may be more appropriate. The general form is: $\hat{y} = ax^2 + bx + c$

  • Use QuadReg on a calculator to fit a quadratic model.

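  • Compare $R^2$ values for the linear and quadratic models to help choose the better fit. The quadratic model's $R^2$ is never lower than the linear model's, so prefer it only when the improvement is substantial and the residual plot supports it.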
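
As a sketch of what a quadratic fit involves, the Python example below finds the least squares coefficients of y = ax² + bx + c by solving the three normal equations with Gauss-Jordan elimination, then computes the proportion of variation explained. This is one standard way to obtain such a fit, not necessarily the calculator's internal method; the function names and data are hypothetical.

```python
def quadratic_regression(xs, ys):
    """Least squares fit of y = a*x^2 + b*x + c via the normal equations."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least 3 data pairs of equal length")
    n = len(xs)
    # Power sums of x and mixed sums with y.
    s1 = sum(xs)
    s2 = sum(x ** 2 for x in xs)
    s3 = sum(x ** 3 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    t0 = sum(ys)
    t1 = sum(x * y for x, y in zip(xs, ys))
    t2 = sum(x * x * y for x, y in zip(xs, ys))
    # Augmented matrix of the normal equations for the unknowns (a, b, c).
    m = [
        [s4, s3, s2, t2],
        [s3, s2, s1, t1],
        [s2, s1, n, t0],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(m[row][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("need at least three distinct x-values")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [v - factor * p for v, p in zip(m[row], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c


def r_squared(xs, ys, predict):
    """Proportion of variation in ys explained by the model `predict`."""
    mean_y = sum(ys) / len(ys)
    unexplained = sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - unexplained / total


# Hypothetical data that follows y = x^2 + 1 exactly.
xs = [-2, -1, 0, 1, 2]
ys = [5, 2, 1, 2, 5]
a, b, c = quadratic_regression(xs, ys)
print(round(a, 4), round(b, 4), round(c, 4))
print(round(r_squared(xs, ys, lambda x: a * x ** 2 + b * x + c), 4))
```

Passing a linear model's prediction function to the same `r_squared` helper gives the value to compare against.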

Applications

  • Population growth, cost analysis, and other phenomena may be better modeled with quadratic regression.
