Correlation in Statistics: Understanding Relationships Between Variables

Section 7.1: Seeking Correlation

Objectives

  • Define linear correlation.

  • Use scatterplots to investigate linear correlation.

  • Use the Linear Correlation Coefficient to estimate the strength of correlation.

  • Use a table of Critical Values to determine if the correlation coefficient is significant.

Linear Correlation

Definition and Interpretation

Linear correlation describes how closely the relationship between two quantitative variables follows a straight line, and whether increases in one variable tend to accompany increases or decreases in the other. Correlation does not imply causation.

  • Correlation: A statistical measure that expresses the extent to which two variables are linearly related.

  • Causation: Indicates that changes in one variable directly cause changes in another; correlation alone does not establish causation.

  • Example: Smoking and cancer rates are correlated, but the correlation by itself does not prove that smoking causes cancer; establishing causation requires further evidence.

Scatterplots

Visualizing Relationships

A scatterplot is a graphical representation of the relationship between two quantitative variables. Each point represents a pair of values.

  • Scatterplot: Plots data points on a Cartesian plane, with one variable on the horizontal (x) axis and the other on the vertical (y) axis.

  • Explanatory variable (independent variable): Plotted on the x-axis; explains or predicts changes in the response variable.

  • Response variable (dependent variable): Plotted on the y-axis; responds to changes in the explanatory variable.

  • Example: Club-head speed (x) and driving distance (y) in golf; plotting these values helps visualize their relationship.

Creating a Scatterplot

Steps for Construction

  • Enter the values of the explanatory variable in one list and the response variable in another (e.g., using a calculator or software).

  • Use graphing technology to plot the points and visualize the relationship.

  • Trace or move between points to analyze the data.
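
The steps above can be sketched in code. The following is a minimal text-based sketch, assuming no graphing calculator or plotting library is available; the function name `text_scatter` and the speed and distance values are hypothetical, chosen only for illustration.

```python
def text_scatter(xs, ys, width=40, height=10):
    """Render (x, y) pairs as a coarse text scatterplot; returns a list of rows."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and the same length")
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Avoid division by zero when all values on an axis are identical.
    x_span = (x_max - x_min) or 1.0
    y_span = (y_max - y_min) or 1.0
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        # Row 0 is the top of the plot, so flip the vertical axis.
        grid[height - 1 - row][col] = "*"
    return ["".join(r) for r in grid]


# Hypothetical data: explanatory variable in one list, response in another.
speed = [100, 102, 103, 101, 105, 100, 99, 105]          # club-head speed (x)
distance = [257, 264, 274, 266, 277, 263, 258, 275]      # driving distance (y)

rows = text_scatter(speed, distance)
for line in rows:
    print("|" + line)
print("+" + "-" * 40)
```

Points that drift from the lower left toward the upper right suggest a positive linear relationship; a real plotting tool would show the same pattern at finer resolution.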

Types of Correlation

Classification of Relationships

Scatterplots help distinguish between different types of correlation:

  • Positive correlation: Both variables tend to increase (or decrease) together.

  • Negative correlation: One variable increases as the other decreases.

  • No correlation: No discernible linear pattern; the points show no consistent upward or downward trend. (The variables may still be related in a nonlinear way.)

  • Linear correlation: The relationship follows a straight-line pattern.

  • Example: Height and weight typically show positive correlation; interest rates and car sales may show negative correlation.

Linear Correlation Coefficient (r)

Measuring Strength and Direction

The strength and direction of a linear correlation can be measured quantitatively using the correlation coefficient, denoted by r. The value of r ranges from -1 to 1.

  • r = 1: Perfect positive linear correlation (all points lie on a straight line with positive slope).

  • r = -1: Perfect negative linear correlation (all points lie on a straight line with negative slope).

  • r = 0: No linear correlation.

  • Values of r closer to 1 or -1 indicate stronger linear relationships; values near 0 indicate weak or no linear relationship.
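
These notes do not state a formula for r. For reference, the standard definition of the Pearson correlation coefficient for n data pairs is shown below; the symbols $x_i$, $y_i$, $\bar{x}$, and $\bar{y}$ (the individual values and the sample means) are notation introduced here, not taken from the notes.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator is positive when x and y tend to lie on the same side of their means together and negative when they tend to lie on opposite sides, which is why the sign of r gives the direction of the relationship.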

Examples of Scatterplots and r Values

  • Scatterplots with r = 0.8 or r = -0.8 show strong positive or negative linear relationships, respectively.

  • Scatterplots with r = 0.4 or r = -0.4 show moderate relationships.

  • Scatterplots with r = 0 show no linear relationship.

Computing the Correlation Coefficient

Steps Using Technology

  • Enter the explanatory and response variable data into lists.

  • Use statistical software or a calculator to compute r (often found under statistical tests or regression analysis).

  • Example: For Old Faithful geyser data, r = 0.979 indicates a strong positive linear correlation between eruption interval and duration.
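
As a rough illustration of what a calculator or statistics package does in this step, here is a minimal sketch assuming the standard Pearson formula. The function name `pearson_r` and the sample lists are hypothetical; they are not the Old Faithful data.

```python
import math


def pearson_r(xs, ys):
    """Compute the linear correlation coefficient r for paired lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data entered as two lists (explanatory, then response).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(round(pearson_r(x, y), 3))
```

On Python 3.10 or later, the standard library function `statistics.correlation(x, y)` should return the same value.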

Testing the Significance of Correlation

Using Critical Values

To determine if a correlation coefficient is statistically significant, compare the absolute value of the computed r to a critical value from a table based on sample size.

  • If |r| is greater than the critical value, the correlation is considered significant.

  • Example: For n = 25 data points, the critical value is 0.396. If r = 0.979, then |r| = 0.979 > 0.396, so the correlation is significant.
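
The comparison can be sketched as a table lookup. This is a minimal sketch, assuming the sample size appears in the table exactly; the names `CRITICAL_VALUES` and `is_significant` are hypothetical, and the values are copied from the critical values table in these notes.

```python
# Critical values for r, keyed by sample size n (copied from the notes' table).
CRITICAL_VALUES = {
    10: 0.632, 15: 0.514, 20: 0.444, 25: 0.396,
    30: 0.361, 40: 0.312, 50: 0.279, 100: 0.197,
}


def is_significant(r, n, table=CRITICAL_VALUES):
    """Return True if |r| exceeds the critical value listed for sample size n."""
    if n not in table:
        # Assumption: no interpolation; only sample sizes listed in the table are handled.
        raise KeyError(f"no critical value listed for n = {n}")
    return abs(r) > table[n]


print(is_significant(0.979, 25))   # the example from the notes
print(is_significant(-0.50, 10))   # a negative r is compared by absolute value
```

Taking the absolute value matters: a strongly negative r is just as significant as a strongly positive one of the same size.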

Critical Values Table

The table below lists critical values for the Pearson correlation coefficient at various sample sizes (n):

n      Critical Value
10     0.632
15     0.514
20     0.444
25     0.396
30     0.361
40     0.312
50     0.279
100    0.197

Additional info: Table continues for other sample sizes.

Key Points and Examples

  • Correlation does not imply causation; observational data may show correlation without direct cause-effect relationships.

  • Scatterplots and correlation coefficients are essential tools for exploring relationships between variables in statistics.

  • Critical values help determine the statistical significance of observed correlations.

Additional info:

  • These notes cover the core concepts of correlation, scatterplots, and the linear correlation coefficient, which are foundational for further study in regression and statistical inference.
