Variation and the Coefficient of Determination in Regression Analysis

Variation and the Coefficient of Determination

Understanding the Coefficient of Determination ($r^2$)

The coefficient of determination, denoted $r^2$, is a key statistical measure in regression analysis. It quantifies how much of the variation in the dependent variable (y) is explained by the variation in the independent variable (x).

  • Definition: $r^2$ measures the proportion of the total variation in y that is explained by the regression model using x.

  • Interpretation: An $r^2$ value close to 1 indicates that almost all of the variation in y is explained by x. An $r^2$ value close to 0 means that almost none of the variation is explained by x; the data show little or no linear correlation.

Formula:

$$r^2 = \frac{\text{explained variation}}{\text{total variation}}$$

Alternatively, $r^2$ can be calculated as the square of the correlation coefficient ($r$):

$$r^2 = (r)^2$$

Explained vs. Unexplained Variation

In regression analysis, the total variation in the dependent variable (y) can be split into two components:

  • Explained Variation: The part of the variation in y that is accounted for by the regression model (i.e., by changes in x).

  • Unexplained Variation: The part of the variation in y that is not accounted for by the regression model; often due to random error or other variables not included in the model.

Example: Suppose you have data on test scores (y) versus hours studied (x). If $r^2 = 0.555$, then 55.5% of the variation in test scores is explained by hours studied, and 44.5% is unexplained.
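
The split into explained and unexplained variation can be checked numerically. The following is a minimal sketch in Python, assuming a least-squares line and a small made-up dataset; the `hours` and `scores` values and the function name `variation_breakdown` are hypothetical and not taken from this guide. It computes the three kinds of variation as sums of squared deviations and forms $r^2$ as explained over total.

```python
# Hypothetical data for illustration only (not from the study guide).
hours = [1, 2, 3, 4, 5, 6]
scores = [62, 70, 65, 78, 74, 85]


def variation_breakdown(x, y):
    """Return (total, explained, unexplained) variation for a least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Least-squares slope and intercept of the regression line.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    y_hat = [intercept + slope * xi for xi in x]

    # Total: actual y vs. mean. Explained: predicted y vs. mean.
    # Unexplained: actual y vs. predicted y.
    total = sum((yi - y_bar) ** 2 for yi in y)
    explained = sum((yh - y_bar) ** 2 for yh in y_hat)
    unexplained = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    return total, explained, unexplained


total, explained, unexplained = variation_breakdown(hours, scores)
r_squared = explained / total

print(f"total = {total:.2f}, explained = {explained:.2f}, unexplained = {unexplained:.2f}")
print(f"r^2 = {r_squared:.3f}")
```

For a least-squares line, the explained and unexplained parts add up to the total (apart from floating-point rounding), which is why $r^2$ always falls between 0 and 1.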

Application: Calculating from Data

Given a dataset, you can determine the value of the correlation coefficient ($r$) and then compute $r^2$ to assess the strength of the relationship between variables.

  • Step 1: Enter the data into lists (e.g., L1 and L2) on a calculator.

  • Step 2: Use the regression function to calculate the correlation coefficient ($r$).

  • Step 3: Square the correlation coefficient to obtain $r^2$.
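
The three steps above can also be sketched in software. This is only an illustration, assuming the standard Pearson formula for the correlation coefficient; the list names `L1` and `L2` mirror the calculator lists, while the data values and the helper `correlation` are hypothetical.

```python
import math


def correlation(x, y):
    """Pearson correlation coefficient r for two equal-length lists."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Step 1: enter the data into two lists (hypothetical values).
L1 = [1, 2, 3, 4, 5]
L2 = [2, 1, 4, 3, 5]

# Step 2: compute the correlation coefficient r.
r = correlation(L1, L2)

# Step 3: square r to obtain the coefficient of determination.
r_squared = r ** 2

print(round(r, 3), round(r_squared, 3))  # prints 0.8 0.64
```

Here 64% of the variation in the second list is explained by the first, even though $r$ itself is 0.8; squaring always gives a value no larger in magnitude than $r$.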

Calculator Instructions (TI-84):

  • Enter data in L1 and L2.

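  • Press STAT, arrow over to CALC, and select LinReg(ax+b).

  • Set Xlist: L1, Ylist: L2.

  • View output: $r$ = correlation coefficient, $r^2$ = coefficient of determination. If $r$ and $r^2$ are not displayed, turn diagnostics on (DiagnosticOn in the CATALOG) and run the regression again.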

Worked Example Table

The following table illustrates how to compute the coefficient of determination from a set of data:

| Hours Studied (x) | Test Score (y) |
| --- | --- |
| 2 | 65 |
| 4 | 70 |
| 6 | 75 |
| 8 | 80 |
| 10 | 85 |

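Suppose the correlation coefficient is $r \approx 0.745$. Then:

$$r^2 = (0.745)^2 \approx 0.555$$

This means about 55.5% of the variation in test scores is explained by hours studied. This value of $r$ is illustrative: the five points in the table above lie exactly on the line $y = 2.5x + 60$, so a regression on those points alone gives $r = 1$ and $r^2 = 1$.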

Additional Example: Retail Analysis

A retail analyst studies the relationship between the number of in-store promotional displays (x) and weekly sales revenue (y) at 12 store locations. The data is entered into a calculator to find the coefficient of determination.

| Displays (x) | Weekly Revenue (y) |
| --- | --- |
| 5 | 1440 |
| 6 | 1560 |
| 7 | 1680 |
| 8 | 1800 |
| 9 | 1920 |
| 10 | 2040 |
| 11 | 2160 |
| 12 | 2280 |

By following the calculator steps above, the analyst can determine $r^2$ and interpret how much of the variation in weekly revenue is explained by the number of displays.
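
As a cross-check on the calculator, the same quantities can be computed in Python. The sketch below assumes the usual least-squares formulas and uses only the eight (x, y) pairs listed in the table; the guide mentions 12 locations, so results for the full dataset may differ. The function name `linreg` is hypothetical, and its outputs `a` and `b` follow the calculator's y = ax + b convention.

```python
import math

# The eight rows listed in the retail table.
displays = [5, 6, 7, 8, 9, 10, 11, 12]
revenue = [1440, 1560, 1680, 1800, 1920, 2040, 2160, 2280]


def linreg(x, y):
    """Return (a, b, r, r_squared) for the least-squares line y = a*x + b."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    a = sxy / sxx                    # slope
    b = y_bar - a * x_bar            # intercept
    r = sxy / math.sqrt(sxx * syy)   # correlation coefficient
    return a, b, r, r ** 2


a, b, r, r_squared = linreg(displays, revenue)
print(f"a = {a:.1f}, b = {b:.1f}, r = {r:.3f}, r^2 = {r_squared:.3f}")
```

For these eight rows, revenue rises by exactly 120 per additional display, so the points fall on a single line and $r^2$ comes out as 1: all of the variation in the listed revenues is explained by the number of displays.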

Summary Table: $r^2$ Interpretation

| $r^2$ Value | Interpretation |
| --- | --- |
| Close to 1 | Nearly all variation in y is explained by x |
| Close to 0 | Almost none of the variation in y is explained by x |

Additional info: The coefficient of determination is a central concept in regression analysis, helping to assess the goodness-of-fit of a model and the strength of the relationship between variables.
