Scatterplots and Correlation: Describing the Relation Between Two Variables

Scatterplots and Correlation

Introduction to Scatterplots

Scatterplots are essential tools in statistics for visualizing the relationship between two numerical variables. Typically, one variable is considered independent (x-axis) and the other dependent (y-axis). By plotting paired data points, we can observe patterns, trends, and potential correlations between variables.

  • Scatterplot: A graph of paired numerical data, with each point representing a pair of values (x, y).

  • Independent Variable (x): The variable that is presumed to influence or predict the dependent variable.

  • Dependent Variable (y): The variable that is measured or observed as a response.

  • Correlation: The degree to which two variables move in relation to each other.

  • Linear Correlation: A relationship that can be well-approximated by a straight line.

Key Point: Two variables are correlated if their data points form a discernible pattern or trend on the scatterplot. A linear correlation means the trend follows a straight line.

Types of Correlation

Correlation describes both the direction and strength of the relationship between two variables.

  • Positive Correlation: As x increases, y also increases. The slope of the trend is positive.

  • Negative Correlation: As x increases, y decreases. The slope of the trend is negative.

  • No Correlation: No discernible pattern; changes in x do not predict changes in y.

  • Nonlinear Correlation: The relationship exists but does not follow a straight line.

Important Note: Correlation does NOT imply causation. Two variables may be correlated without one causing the other.

Examples of Correlation

  • Test Scores vs. Time Studying: Typically shows a positive correlation; more study time is associated with higher test scores.

  • Test Scores vs. Number of Pins on Backpack: May show no correlation; the number of pins is unlikely to affect test scores.

  • Test Scores vs. Time Sleeping: Could show a positive or negative correlation depending on the data.

  • Test Scores vs. Number of Siblings: Often shows no correlation.

Example Table: Test Scores vs. Time Studying

Time studying (min)    Test score
50                     86
60                     92
0                      67
40                     79
45                     83
30                     75
50                     96
10                     65
20                     73
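To see the pattern in the study-time data without graphing software, the pairs can be drawn as a rough text scatterplot. The following is only a minimal sketch: the helper name `ascii_scatter` and the grid size are illustrative assumptions, and the function assumes the x-values (and the y-values) are not all identical.

```python
def ascii_scatter(xs, ys, width=31, height=11):
    """Return a list of text rows showing the (x, y) pairs as '*' marks.

    Hypothetical helper for illustration; assumes max(xs) != min(xs)
    and max(ys) != min(ys).
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column/row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 is the top of the plot
    return ["".join(cells) for cells in grid]


# Data from the table: time studying (min) and test score.
minutes = [50, 60, 0, 40, 45, 30, 50, 10, 20]
scores = [86, 92, 67, 79, 83, 75, 96, 65, 73]

rows = ascii_scatter(minutes, scores)
for line in rows:
    print("|" + line)          # vertical axis: score
print("+" + "-" * len(rows[0]))  # horizontal axis: minutes
```

The marks rise from the lower left to the upper right, which is the visual signature of a positive correlation.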

Practice: Interpreting Scatterplots

Given a table of data, plot the points on a scatterplot and determine the type of correlation. For example, plotting mean driving speed against the number of speeding tickets can reveal whether faster drivers tend to receive more tickets.

Correlation Coefficient

Definition and Interpretation

The correlation coefficient (denoted as r) is a numerical measure of the direction and strength of a linear relationship between two variables.

  • Range: -1 ≤ r ≤ 1

  • Direction: The sign of r indicates the direction (positive or negative correlation).

  • Strength: The closer |r| is to 1, the stronger the linear relationship. The closer r is to 0, the weaker the linear relationship.

Key Properties:

  • r = 1: Perfect positive linear correlation

  • r = -1: Perfect negative linear correlation

  • r = 0: No linear correlation

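Important: The steepness of the best-fit line does not determine the value of r. The sign of r matches the sign of the slope, but its magnitude reflects only how tightly the points cluster around the line.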
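These properties can be checked numerically. The sketch below uses a small helper, `pearson_r` (an illustrative name), written in a form that is algebraically equivalent to the usual sample formula, and applies it to made-up data sets chosen only to illustrate each case.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Sample correlation coefficient of paired data (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


xs = [-2, -1, 0, 1, 2]  # made-up x-values for illustration

r_gentle = pearson_r(xs, [0.5 * x + 3 for x in xs])  # exact line, small slope
r_steep = pearson_r(xs, [10 * x + 3 for x in xs])    # exact line, large slope
r_down = pearson_r(xs, [-4 * x for x in xs])         # exact line, negative slope
r_curve = pearson_r(xs, [x ** 2 for x in xs])        # exact parabola, not a line

print(r_gentle, r_steep, r_down, r_curve)
```

Both upward-sloping lines give r = 1 regardless of steepness, the downward-sloping line gives r = -1, and the parabola gives r = 0 even though y is completely determined by x. The last case shows that r = 0 rules out a linear relationship, not every relationship.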

Examples of Correlation Coefficient Values

r Value    Interpretation
0.96       Strong positive correlation
0.59       Moderate positive correlation
-0.12      Very weak negative correlation
-0.86      Strong negative correlation
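The labels in the table can be reproduced with a small lookup. Note that the cutoffs used here (0.8, 0.5, and 0.3) are assumptions chosen to agree with the four rows above; textbooks differ on where to draw these lines, and the function name `describe_r` is hypothetical.

```python
def describe_r(r):
    """Label r using ASSUMED cutoffs (0.8, 0.5, 0.3); conventions vary."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "No linear correlation"
    size = abs(r)
    if size >= 0.8:
        strength = "Strong"
    elif size >= 0.5:
        strength = "Moderate"
    elif size >= 0.3:
        strength = "Weak"
    else:
        strength = "Very weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"


for value in (0.96, 0.59, -0.12, -0.86):
    print(value, "->", describe_r(value))
```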

Calculating the Correlation Coefficient

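The formula for the sample correlation coefficient is:

$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)$$

  • $n$ = number of data pairs

  • $x_i$, $y_i$ = individual data values

  • $\bar{x}$, $\bar{y}$ = means of x and y

  • $s_x$, $s_y$ = sample standard deviations of x and y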
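As a worked illustration, the sketch below computes r for the study-time data from the earlier table by standardizing each value (subtracting the mean and dividing by the sample standard deviation), multiplying the paired results, summing, and dividing by n - 1. The function name `correlation_from_z_scores` is an assumption made for this example.

```python
from statistics import mean, stdev


def correlation_from_z_scores(xs, ys):
    """Sample correlation: sum of products of standardized values over n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1 divisor)
    total = sum(
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    )
    return total / (n - 1)


# Time studying (min) and test score, from the example table.
minutes = [50, 60, 0, 40, 45, 30, 50, 10, 20]
scores = [86, 92, 67, 79, 83, 75, 96, 65, 73]

r = correlation_from_z_scores(minutes, scores)
print(round(r, 2))  # about 0.93
```

A value of about 0.93 indicates a strong positive linear correlation between study time and test score in this data set.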

Using a Calculator to Find r

Most graphing calculators can compute the correlation coefficient directly. The process typically involves entering the data into lists, then using the linear regression function.

  1. Enter data into L1 (x-values) and L2 (y-values).

  2. Access the statistics calculation menu (e.g., STAT > CALC).

  3. Select the linear regression function (LinReg(ax+b)).

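  4. Read the value of r from the output. On TI-83/84 models, r is displayed only when diagnostics are turned on (DiagnosticOn in the CATALOG menu).

[Figure: calculator illustration for finding the correlation coefficient]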

Example: Altitude vs. Speed of Sound

A scientist measures the speed of sound at different altitudes. By entering the data into a calculator and finding r, one can determine if there is a linear correlation between altitude and speed of sound.

Altitude (thousands of feet)    Speed (ft/sec)
0                               1120.2
5                               1094.7
10                              1076.9
15                              1058.1
20                              1034.5
25                              1015.4
30                              995.0
35                              968.2
40                              967.1
45                              966.5
50                              966.1
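For readers without a graphing calculator, the same quantities that a linear regression of the form y = ax + b reports can be computed directly. This is a sketch under the assumption that the calculator fits an ordinary least-squares line; the function name `lin_reg` is illustrative.

```python
from math import sqrt


def lin_reg(xs, ys):
    """Return (a, b, r) for the least-squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx                  # slope
    b = mean_y - a * mean_x        # intercept
    r = sxy / sqrt(sxx * syy)      # correlation coefficient
    return a, b, r


# Altitude (thousands of feet) and speed of sound (ft/sec), from the table.
altitude = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
speed = [1120.2, 1094.7, 1076.9, 1058.1, 1034.5, 1015.4,
         995.0, 968.2, 967.1, 966.5, 966.1]

a, b, r = lin_reg(altitude, speed)
print(f"a = {a:.3f}, b = {b:.1f}, r = {r:.3f}")
```

The result is a strongly negative r (roughly -0.97), so speed of sound tends to fall as altitude rises. The table also shows the speed leveling off above about 35 thousand feet, a reminder that a value of r near -1 does not guarantee the relationship is linear over the whole range; the scatterplot should still be inspected.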

Summary Table: Correlation Types and Interpretation

Type of Correlation    Scatterplot Pattern                                        r Value
Strong Positive        Points tightly clustered around an upward-sloping line     Close to +1
Strong Negative        Points tightly clustered around a downward-sloping line    Close to -1
Weak/No Correlation    Points scattered with no clear pattern                     Close to 0

Practice Problems

  • Given a data set, plot the points and determine the type of correlation.

  • Calculate the correlation coefficient using a calculator and interpret its meaning.

  • Explain why correlation does not imply causation using real-world examples.

Additional info: This guide covers the core concepts of scatterplots and correlation, including calculation and interpretation of the correlation coefficient, as outlined in Chapter 4 of a typical college statistics course.