
Correlation Coefficient and Scatter Diagrams in Statistics


Correlation and Scatter Diagrams

Introduction

This section covers the concept of correlation in statistics, focusing on how to compute the correlation coefficient, interpret its value, and use scatter diagrams to visualize relationships between two variables. Understanding these concepts is essential for analyzing the strength and direction of linear relationships in data sets.

Scatter Diagrams

A scatter diagram (or scatter plot) is a graphical representation of the relationship between two quantitative variables. Each point on the plot represents a pair of values from the data set.

  • Purpose: To visually assess the type and strength of the relationship between variables x and y.

  • Interpretation:

    • If points trend upward from left to right, the relationship is positive.

    • If points trend downward, the relationship is negative.

    • If points are scattered randomly, there may be no linear relationship.

  • Example: In the provided question, the correct scatter diagram (option C) best represents the data set.
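
To make the reading rules above concrete, here is a minimal sketch that draws a rough text-only scatter diagram. The function name `ascii_scatter` and the sample data are made up for illustration and are not taken from the provided question.

```python
def ascii_scatter(xs, ys, width=21, height=9):
    """Render (x, y) pairs as a text grid; '*' marks a data point."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column/row index; "or 1" guards a zero range.
        col = round((x - x_min) / ((x_max - x_min) or 1) * (width - 1))
        row = round((y - y_min) / ((y_max - y_min) or 1) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 is the top of the plot
    return "\n".join("|" + "".join(line).rstrip() for line in grid)


# Illustrative data only: y tends to rise with x, so the points
# should climb from lower left to upper right (a positive relationship).
sample_x = [1, 2, 3, 4, 5, 6]
sample_y = [2, 3, 3, 5, 6, 6]
print(ascii_scatter(sample_x, sample_y))
```

Reversing `sample_y` should make the points fall from left to right instead, which is the pattern of a negative relationship.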

Correlation Coefficient

The correlation coefficient (often denoted as r) quantifies the strength and direction of a linear relationship between two variables.

  • Definition: The correlation coefficient ranges from -1 to 1.

  • Interpretation:

    • r > 0: Positive linear relationship

    • r < 0: Negative linear relationship

    • r = 0: No linear relationship

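  • Formula: One standard form is $r = \dfrac{\sum \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)}{n - 1}$, where $\bar{x}$ and $\bar{y}$ are the sample means, $s_x$ and $s_y$ are the sample standard deviations, and n is the number of data pairs.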

  • Example Calculation: For the given data set, the correlation coefficient r is reported rounded to two decimal places.

  • Interpretation of Value: Since r is negative, the relationship is negative, but the value is close to zero, indicating a very weak linear relationship.
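
One common way to compute r is to convert each x and y to a z-score (subtract the sample mean, divide by the sample standard deviation), multiply the paired z-scores, add the products, and divide by n - 1. The sketch below follows that approach using only Python's standard library. The function name `correlation_coefficient` and the sample data are illustrative assumptions, not part of the original question.

```python
from statistics import mean, stdev


def correlation_coefficient(xs, ys):
    """Sample linear correlation coefficient r for paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sample standard deviations (divisor n - 1). If either list is
    # constant, the deviation is 0 and the division below fails.
    s_x, s_y = stdev(xs), stdev(ys)
    z_products = [
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys)
    ]
    return sum(z_products) / (len(xs) - 1)


# Illustrative data only (not the data set from the question).
r = correlation_coefficient([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(round(r, 2))  # 0.8
```

A result of 0.8 is positive and fairly close to 1, so these made-up points show a fairly strong positive linear relationship.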

Critical Values for the Correlation Coefficient

To determine if the observed correlation is statistically significant, compare the absolute value of the correlation coefficient to a critical value from a table based on sample size (n) and significance level (usually 0.05).

  • Critical Value Table: The table lists critical values for different sample sizes. For n = 5, the critical value is 0.878.

  • Decision Rule: If |r| is greater than the critical value, the correlation is statistically significant.

  • Example: For the data set, with critical value 0.878, |r| is not greater than the critical value, so there is no significant linear relationship between x and y.
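
The decision rule can be sketched as a small lookup followed by a comparison. The names `CRITICAL_VALUES` and `is_significant` are hypothetical, and the dictionary holds only a few rows (n = 4 through 10) copied from the critical value table in this guide, which is typically used at the 0.05 significance level.

```python
# A few critical values from the guide's table, keyed by sample size n.
CRITICAL_VALUES = {
    4: 0.950,
    5: 0.878,
    6: 0.811,
    7: 0.754,
    8: 0.707,
    9: 0.666,
    10: 0.632,
}


def is_significant(r, n, table=CRITICAL_VALUES):
    """Return True when |r| exceeds the critical value for sample size n."""
    if n not in table:
        raise KeyError(f"no critical value listed for n = {n}")
    return abs(r) > table[n]


# Illustrative values of r, not results from the question.
print(is_significant(-0.12, 5))  # False: 0.12 is not greater than 0.878
print(is_significant(0.95, 5))   # True: 0.95 is greater than 0.878
```

Taking the absolute value first means a strong negative correlation is judged by the same threshold as a strong positive one.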

Summary Table: Critical Values for Correlation Coefficient

The following table summarizes critical values for different sample sizes (n):

n     Critical Value
3     0.997
4     0.950
5     0.878
6     0.811
7     0.754
8     0.707
9     0.666
10    0.632
15    0.514
20    0.444
25    0.396
30    0.361

Key Points

  • Scatter diagrams help visualize relationships between variables.

  • The correlation coefficient quantifies the strength and direction of a linear relationship.

  • Compare the absolute value of the correlation coefficient to the critical value to assess statistical significance.

  • A correlation coefficient close to zero indicates little to no linear relationship.

Example Application

  • Given a data set with x-values: 7, 9, 6, 7, 9 and corresponding y-values, plot the points on a scatter diagram.

  • Calculate the correlation coefficient using the formula above.

  • Compare |r| to the critical value for n = 5 (0.878) to determine significance.
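
The calculation and comparison steps above can be sketched end to end. The x-values are the ones listed in the example. The y-values are not given in this guide, so the ones below are HYPOTHETICAL and chosen only to make the script runnable. The script uses the sum-based shortcut formula for r, which is algebraically equivalent to the z-score form but may not be the form your course uses.

```python
from math import sqrt

x = [7, 9, 6, 7, 9]  # x-values from the example
y = [5, 5, 6, 3, 5]  # HYPOTHETICAL y-values, made up for illustration

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

# Shortcut formula: r = (n*Sxy - Sx*Sy) / sqrt((n*Sx2 - Sx^2)(n*Sy2 - Sy^2))
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

critical_value = 0.878  # table entry for n = 5
significant = abs(r) > critical_value

print(f"r = {r:.2f}, significant at 0.05: {significant}")
# prints: r = -0.07, significant at 0.05: False
```

With these made-up y-values, r is negative but very close to zero, and |r| falls well below 0.878, so the conclusion would be no significant linear relationship.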

Additional info: The critical value table is typically used for hypothesis testing of the correlation coefficient at a significance level of 0.05. The example provided demonstrates the process for a small data set, which is common in introductory statistics courses.
