Scatterplots and Correlation: Describing the Relation Between Two Variables
Study Guide - Smart Notes
Scatterplots and Correlation
Introduction to Scatterplots
Scatterplots are essential tools in statistics for visualizing the relationship between two numerical variables. Typically, one variable is considered independent (x-axis) and the other dependent (y-axis). By plotting paired data points, we can observe patterns, trends, and potential correlations between variables.
Scatterplot: A graph of paired numerical data, with each point representing a pair of values (x, y).
Independent Variable (x): The variable that is presumed to influence or predict the dependent variable.
Dependent Variable (y): The variable that is measured or observed as a response.
Correlation: The degree to which two variables move in relation to each other.
Linear Correlation: A relationship that can be well-approximated by a straight line.
Key Point: Two variables are correlated if their data points form a discernible pattern or trend on the scatterplot. A linear correlation means the trend follows a straight line.
Types of Correlation
Correlation describes both the direction and strength of the relationship between two variables.
Positive Correlation: As x increases, y also increases. The slope of the trend is positive.
Negative Correlation: As x increases, y decreases. The slope of the trend is negative.
No Correlation: No discernible pattern; changes in x do not predict changes in y.
Nonlinear Correlation: The relationship exists but does not follow a straight line.
Important Note: Correlation does NOT imply causation. Two variables may be correlated without one causing the other.
Examples of Correlation
Test Scores vs. Time Studying: Typically shows a positive correlation; more study time is associated with higher test scores.
Test Scores vs. Number of Pins on Backpack: May show no correlation; the number of pins is unlikely to affect test scores.
Test Scores vs. Time Sleeping: Could show a positive or negative correlation depending on the data.
Test Scores vs. Number of Siblings: Often shows no correlation.
Example Table: (Test Scores vs. Time Studying)
| Time (min) | Score |
|---|---|
| 50 | 86 |
| 60 | 92 |
| 0 | 67 |
| 40 | 79 |
| 45 | 83 |
| 30 | 75 |
| 50 | 96 |
| 10 | 65 |
| 20 | 73 |
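To see the pattern in the table above without graphing software, the pairs can be drawn on a coarse character grid. This is only a rough sketch: the function name `ascii_scatter` and its `width` and `height` parameters are hypothetical, and a real scatterplot would normally be drawn on graph paper, a calculator, or with a plotting library.

```python
# Study-time data copied from the example table.
minutes = [50, 60, 0, 40, 45, 30, 50, 10, 20]
scores = [86, 92, 67, 79, 83, 75, 96, 65, 73]

def ascii_scatter(xs, ys, width=40, height=12):
    """Draw paired data as '*' marks on a text grid (illustrative helper).

    Assumes the x-values are not all equal and the y-values are not all equal.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column (left to right) and a row (bottom to top).
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"
    return "\n".join("".join(line) for line in grid)

plot = ascii_scatter(minutes, scores)
print(plot)
```

In the printed grid the marks rise from the lower left to the upper right, which is the visual signature of a positive correlation.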
Practice: Interpreting Scatterplots
Given a table of data, plot the points on a scatterplot and determine the type of correlation. For example, plotting mean driving speed against the number of speeding tickets can reveal whether faster drivers tend to receive more tickets.
Correlation Coefficient
Definition and Interpretation
The correlation coefficient (denoted as r) is a numerical measure of the direction and strength of a linear relationship between two variables.
Range: -1 ≤ r ≤ 1
Direction: The sign of r indicates the direction (positive or negative correlation).
Strength: The closer |r| is to 1, the stronger the linear relationship. The closer r is to 0, the weaker the relationship.
Key Properties:
r = 1: Perfect positive linear correlation
r = -1: Perfect negative linear correlation
r = 0: No linear correlation
Important: The steepness of the best-fit line does not affect the value of r. Only the sign of the slope matters: r has the same sign as the slope.
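These properties can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the helper name `pearson_r` and the small data sets are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient of paired data (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical x-values, symmetric around 0.
xs = [-2, -1, 0, 1, 2]

gentle = [0.5 * x + 1 for x in xs]   # exact line with a small positive slope
steep = [50 * x + 1 for x in xs]     # exact line with a large positive slope
falling = [-3 * x + 4 for x in xs]   # exact line with a negative slope
curved = [x ** 2 for x in xs]        # exact relationship, but not linear

r_gentle = pearson_r(xs, gentle)
r_steep = pearson_r(xs, steep)
r_falling = pearson_r(xs, falling)
r_curved = pearson_r(xs, curved)

for name, value in [("gentle", r_gentle), ("steep", r_steep),
                    ("falling", r_falling), ("curved", r_curved)]:
    print(f"{name:8s} r = {value:.3f}")
```

Both rising lines give r = 1 (up to floating-point rounding) regardless of steepness, the falling line gives r = -1, and the parabola gives r = 0 even though y is completely determined by x. An r of 0 therefore rules out a linear relationship only, not every relationship.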
Examples of Correlation Coefficient Values
| r Value | Interpretation |
|---|---|
| 0.96 | Strong positive correlation |
| 0.59 | Moderate positive correlation |
| -0.12 | Very weak negative correlation |
| -0.86 | Strong negative correlation |
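The labels in this table can be reproduced with a small lookup. The cutoffs used below (0.8, 0.5, 0.3) are an assumption chosen only so that the four example values receive the labels shown; the notes do not specify cutoffs, and textbooks differ on them. The function name `describe_r` is likewise hypothetical.

```python
def describe_r(r):
    """Turn a correlation coefficient into a verbal label.

    The cutoffs are illustrative assumptions, not a standard.
    """
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "No linear correlation"

    size = abs(r)
    if size >= 0.8:
        strength = "Strong"
    elif size >= 0.5:
        strength = "Moderate"
    elif size >= 0.3:
        strength = "Weak"
    else:
        strength = "Very weak"

    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"

for value in [0.96, 0.59, -0.12, -0.86]:
    print(value, "->", describe_r(value))
```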
Calculating the Correlation Coefficient
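The formula for the sample correlation coefficient is:
$$r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i-\bar{x}}{s_x}\right)\left(\frac{y_i-\bar{y}}{s_y}\right)$$
where
n = number of data pairs
$x_i$, $y_i$ = individual data values
$\bar{x}$, $\bar{y}$ = means of x and y
$s_x$, $s_y$ = sample standard deviations of x and y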
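As a worked illustration, the calculation can be carried out step by step in Python on the study-time data from the earlier example table. This sketch assumes the standardized-products form of the formula: each deviation from the mean is divided by the sample standard deviation, the paired results are multiplied and summed, and the total is divided by n - 1. The function names `sample_std` and `correlation` are chosen here for illustration.

```python
import math

# Study-time data copied from the example table.
minutes = [50, 60, 0, 40, 45, 30, 50, 10, 20]
scores = [86, 92, 67, 79, 83, 75, 96, 65, 73]

def sample_std(values):
    """Sample standard deviation (divides by n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

def correlation(xs, ys):
    """Sample correlation coefficient via standardized products."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_x, s_y = sample_std(xs), sample_std(ys)
    total = sum(((x - mean_x) / s_x) * ((y - mean_y) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)

r = correlation(minutes, scores)
print(round(r, 2))  # 0.93
```

A value near 0.93 indicates a strong positive linear correlation between study time and test score in this data set.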
Using a Calculator to Find r
Most graphing calculators can compute the correlation coefficient directly. The process typically involves entering the data into lists, then using the linear regression function.
Enter data into L1 (x-values) and L2 (y-values).
Access the statistics calculation menu (e.g., STAT > CALC).
Select the linear regression function (LinReg(ax+b)).
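Read the value of r from the output. If r is not displayed, the calculator's diagnostics setting may need to be turned on first (DiagnosticOn on TI-83/84 models).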

Example: Altitude vs. Speed of Sound
A scientist measures the speed of sound at different altitudes. By entering the data into a calculator and finding r, one can determine if there is a linear correlation between altitude and speed of sound.
| Altitude (thousands of feet) | Speed (ft/sec) |
|---|---|
| 0 | 1120.2 |
| 5 | 1094.7 |
| 10 | 1076.9 |
| 15 | 1058.1 |
| 20 | 1034.5 |
| 25 | 1015.4 |
| 30 | 995.0 |
| 35 | 968.2 |
| 40 | 967.1 |
| 45 | 966.5 |
| 50 | 966.1 |
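For readers without a graphing calculator, the same quantities that a LinReg(ax+b) command reports (slope a, intercept b, and r) can be computed directly. The sketch below is one way to do it in Python using the standard least-squares formulas; the function name `lin_reg` is hypothetical, and the output formatting is a choice made here, not the calculator's.

```python
import math

# Data copied from the altitude table.
altitude = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50]   # thousands of feet
speed = [1120.2, 1094.7, 1076.9, 1058.1, 1034.5, 1015.4,
         995.0, 968.2, 967.1, 966.5, 966.1]              # ft/sec

def lin_reg(xs, ys):
    """Return (a, b, r) for the least-squares line y = a*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    a = sxy / sxx                    # slope
    b = mean_y - a * mean_x          # intercept
    r = sxy / math.sqrt(sxx * syy)   # correlation coefficient
    return a, b, r

a, b, r = lin_reg(altitude, speed)
print(f"a = {a:.3f}, b = {b:.1f}, r = {r:.3f}")
```

The slope and r both come out negative, with r close to -1, so the data show a strong negative linear correlation: the speed of sound falls as altitude rises. The last four speeds in the table are nearly constant, though, so a scatterplot is still worth checking before treating the whole range as a single straight-line trend.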
Summary Table: Correlation Types and Interpretation
| Type of Correlation | Scatterplot Pattern | r Value |
|---|---|---|
| Strong Positive | Points tightly clustered around an upward-sloping line | Close to +1 |
| Strong Negative | Points tightly clustered around a downward-sloping line | Close to -1 |
| Weak/No Correlation | Points scattered with no clear pattern | Close to 0 |
Practice Problems
Given a data set, plot the points and determine the type of correlation.
Calculate the correlation coefficient using a calculator and interpret its meaning.
Explain why correlation does not imply causation using real-world examples.
Additional info: This guide covers the core concepts of scatterplots and correlation, including calculation and interpretation of the correlation coefficient, as outlined in Chapter 4 of a typical college statistics course.