Correlation in Statistics: Understanding Relationships Between Variables
Study Guide - Smart Notes
Section 7.1: Seeking Correlation
Objectives
Define linear correlation.
Use scatterplots to investigate linear correlation.
Use the Linear Correlation Coefficient to estimate the strength of correlation.
Use a table of Critical Values to determine if the correlation coefficient is significant.
Linear Correlation
Definition and Interpretation
Linear correlation describes the relationship between two quantitative variables, indicating whether increases in one variable tend to be associated with increases or decreases in another variable. It is important to note that correlation does not imply causation.
Correlation: A statistical measure that expresses the extent to which two variables are linearly related.
Causation: Indicates that changes in one variable directly cause changes in another; correlation alone does not establish causation.
Example: Smoking and cancer rates are correlated, but the correlation alone does not prove that smoking causes cancer; establishing causation requires further evidence.
Scatterplots
Visualizing Relationships
A scatterplot is a graphical representation of the relationship between two quantitative variables. Each point represents a pair of values.
Scatterplot: Plots data points on a Cartesian plane, with one variable on the horizontal (x) axis and the other on the vertical (y) axis.
Explanatory variable (independent variable): Plotted on the x-axis; explains or predicts changes in the response variable.
Response variable (dependent variable): Plotted on the y-axis; responds to changes in the explanatory variable.
Example: Club-head speed (x) and driving distance (y) in golf; plotting these values helps visualize their relationship.
Creating a Scatterplot
Steps for Construction
Enter the values of the explanatory variable in one list and the response variable in another (e.g., using a calculator or software).
Use graphing technology to plot the points and visualize the relationship.
Trace or move between points to analyze the data.
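The construction steps above can be sketched in code. The following is a minimal text-based sketch, assuming Python with no plotting library; the function name `ascii_scatter` and the speed and distance values are hypothetical and serve only to illustrate the golf example, they are not taken from these notes.

```python
def ascii_scatter(xs, ys, width=40, height=12):
    """Return a text scatterplot: x runs left to right, y runs bottom to top."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty lists of equal length")
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Avoid division by zero when all values on an axis are identical.
    x_span = (x_max - x_min) or 1
    y_span = (y_max - y_min) or 1
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 is the top of the plot
    return "\n".join("|" + "".join(line) for line in grid) + "\n+" + "-" * width


# Hypothetical club-head speeds (mph) and driving distances (yards), for illustration only.
speed = [100, 102, 103, 101, 105, 100, 99, 105]      # explanatory variable (x)
distance = [257, 264, 274, 266, 277, 263, 258, 275]  # response variable (y)
plot = ascii_scatter(speed, distance)
print(plot)
```

Points that drift from lower left to upper right, as they do for these illustrative values, suggest a positive linear relationship.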
Types of Correlation
Classification of Relationships
Scatterplots help distinguish between different types of correlation:
Positive correlation: Both variables tend to increase (or decrease) together.
Negative correlation: One variable increases as the other decreases.
No correlation: No discernible pattern; the points show no apparent relationship between the variables.
Linear correlation: The relationship follows a straight-line pattern.
Example: Height and weight typically show positive correlation; interest rates and car sales may show negative correlation.
Linear Correlation Coefficient (r)
Measuring Strength and Direction
The strength and direction of a linear correlation can be measured quantitatively using the correlation coefficient, denoted by r. The value of r ranges from -1 to 1.
r = 1: Perfect positive linear correlation (all points lie on a straight line with positive slope).
r = -1: Perfect negative linear correlation (all points lie on a straight line with negative slope).
r = 0: No linear correlation (a nonlinear relationship may still be present).
Values of r closer to 1 or -1 indicate stronger linear relationships; values near 0 indicate weak or no linear relationship.
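These notes do not state a formula for r. For reference, a standard definition of the Pearson linear correlation coefficient for n paired observations is sketched below; the symbols x_i, y_i, and the sample means are notation introduced here, not taken from the notes.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator is positive when x and y tend to lie on the same side of their means and negative when they tend to lie on opposite sides, which is why the sign of r matches the direction of the relationship.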
Examples of Scatterplots and r Values
Scatterplots with r = 0.8 or r = -0.8 show strong positive or negative linear relationships, respectively.
Scatterplots with r = 0.4 or r = -0.4 show moderate positive or negative linear relationships, respectively.
Scatterplots with r = 0 show no linear relationship.
Computing the Correlation Coefficient
Steps Using Technology
Enter the explanatory and response variable data into lists.
Use statistical software or a calculator to compute r (often found under statistical tests or regression analysis).
Example: For Old Faithful geyser data, r = 0.979 indicates a strong positive linear correlation between eruption interval and duration.
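To show what the technology computes, here is a minimal sketch, assuming Python and the standard Pearson formula. The function name `pearson_r` and the data values are hypothetical and chosen only for illustration; they are not the Old Faithful data.

```python
import math


def pearson_r(xs, ys):
    """Compute the linear correlation coefficient r for paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired data for illustration only.
x = [1, 2, 3, 4, 5]  # explanatory variable
y = [2, 4, 5, 4, 5]  # response variable
r = pearson_r(x, y)
print(f"r = {r:.3f}")  # prints "r = 0.775"
```

Data lying exactly on a line with positive slope, such as x = [1, 2, 3] and y = [2, 4, 6], yields r = 1.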
Testing the Significance of Correlation
Using Critical Values
To determine if a correlation coefficient is statistically significant, compare the computed r to a critical value from a table based on sample size.
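If |r| is greater than the critical value, the correlation is considered significant; otherwise, the data do not provide sufficient evidence of a linear correlation.
Example: For n = 25 data points, the critical value is 0.396. If r = 0.979, then |r| = 0.979 > 0.396, so the correlation is significant.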
Critical Values Table
The table below lists critical values for the Pearson correlation coefficient at various sample sizes (n):
| n | Critical Value |
|---|---|
| 10 | 0.632 |
| 15 | 0.514 |
| 20 | 0.444 |
| 25 | 0.396 |
| 30 | 0.361 |
| 40 | 0.312 |
| 50 | 0.279 |
| 100 | 0.197 |
Additional info: The table continues for other sample sizes.
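The comparison against the table can be sketched as a simple lookup. This is a minimal illustration, assuming Python; the names `CRITICAL_VALUES` and `is_significant` are hypothetical, and only the sample sizes listed in the table above are supported.

```python
# Critical values copied from the table above, keyed by sample size n.
CRITICAL_VALUES = {10: 0.632, 15: 0.514, 20: 0.444, 25: 0.396,
                   30: 0.361, 40: 0.312, 50: 0.279, 100: 0.197}


def is_significant(r, n):
    """Return True if |r| exceeds the tabulated critical value for sample size n."""
    if n not in CRITICAL_VALUES:
        raise KeyError(f"no critical value listed for n = {n}")
    return abs(r) > CRITICAL_VALUES[n]


print(is_significant(0.979, 25))  # example from the notes: prints True
print(is_significant(-0.30, 30))  # |r| = 0.30 is below 0.361: prints False
```

The absolute value matters: a strongly negative r is just as significant as a strongly positive one of the same magnitude.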
Key Points and Examples
Correlation does not imply causation; observational data may show correlation without direct cause-effect relationships.
Scatterplots and correlation coefficients are essential tools for exploring relationships between variables in statistics.
Critical values help determine the statistical significance of observed correlations.
Additional info: These notes cover the core concepts of correlation, scatterplots, and the linear correlation coefficient, which are foundational for further study in regression and statistical inference.