Correlation and Linear Regression: Study Notes for Business Statistics
Correlation and Linear Regression
Scatterplots & Correlation
Scatterplots are graphical representations of paired numerical data, where one variable is considered independent (x) and the other dependent (y). They are used to visually assess the relationship between two variables.
Correlation describes the degree to which two variables move together. If the data points form a straight-line pattern, the correlation is linear.
Positive Correlation: As x increases, y increases. The slope is positive.
Negative Correlation: As x increases, y decreases. The slope is negative.
No Correlation: No discernible pattern between x and y.
Nonlinear Correlation: The variables are related, but the pattern follows a curve rather than a straight line.
Correlation does not imply causation; two variables may appear related without one causing the other.
Example: Test scores vs. time spent studying often show a positive correlation, while test scores vs. number of siblings may show no correlation.
Correlation Coefficient (r)
The correlation coefficient (r) quantifies the direction and strength of a linear relationship between two variables.
Range: $-1 \le r \le 1$
r > 0: Positive correlation
r < 0: Negative correlation
r = 0: No linear correlation
Strength: Values of r close to 1 or -1 indicate strong correlation; values near 0 indicate weak correlation.
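The steepness of the best-fit line does not determine the value of r. The sign of r matches the sign of the slope, but its magnitude reflects how tightly the points cluster around the line, not how steep the line is.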
Example: If r is close to 1, the correlation is strong and positive; if r is close to -1, it is strong and negative; if r is close to 0, it is weak.
Calculating the Correlation Coefficient
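To calculate r using a TI-84 calculator:
Turn diagnostics on (only needed once): press 2nd > CATALOG, select DiagnosticOn, and press ENTER twice.
Press STAT > EDIT and enter data in L1 (x-values) and L2 (y-values).
Press STAT, then go to CALC > 4:LinReg(ax+b).
Read the value of r from the output.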
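The same value of r can be checked away from the calculator by applying the definition of the correlation coefficient directly. The sketch below is a minimal Python illustration; the helper name `pearson_r` and the data values are made up for this example and do not come from the course materials.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired samples xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data (an assumption, not from the notes).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)

# Multiplying y by 10 makes the best-fit line 10 times steeper
# but leaves r unchanged.
steeper_y = [10 * value for value in y]
r_steeper = pearson_r(x, steeper_y)

print(round(r, 4), round(r_steeper, 4))
```

Both printed values agree, which illustrates that r measures how closely the points follow a line rather than how steep that line is.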

Linear Regression Using the Least Squares Method
Linear regression models the relationship between two variables with a straight line, minimizing the sum of squared vertical distances (residuals) from the data points to the line.
Regression Equation: $\hat{y} = ax + b$, where a is the slope and b is the y-intercept
Residual: $e = y - \hat{y}$ (the difference between observed and predicted values)
How to Find the Regression Line on TI-84:
Press STAT > EDIT and enter data in L1 (x) and L2 (y).
Press STAT, then go to CALC > 4:LinReg(ax+b).
Write down the slope (a) and intercept (b).
Plot the regression line using the calculator.
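The least squares slope and intercept have closed-form expressions, so the calculator output can be cross-checked in code. The following is a minimal Python sketch; the helper name `least_squares_line` and the data are hypothetical and chosen only for illustration.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line y-hat = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up illustrative data (an assumption, not from the notes).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = least_squares_line(x, y)
print(f"y-hat = {a:.2f}x + {b:.2f}")
```

Here `a` and `b` play the same roles as the a and b reported by LinReg(ax+b).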

Predicting Values with the Regression Line
Use the regression equation to predict y-values for given x-values:
If correlation is strong and the x-value is within the data range, use the regression line for prediction.
If correlation is weak or the x-value is outside the data range, use the mean of y as the best estimate.
Example: Predict ice cream sales at a given temperature using the regression equation.
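The decision rule above can be sketched as a small function. This is an illustration under stated assumptions: the cutoff `min_abs_r` for what counts as a "strong" correlation is hypothetical (the notes give no numeric threshold), and the data are made up rather than real ice cream sales figures.

```python
import math

def predict_y(x0, xs, ys, min_abs_r=0.7):
    """Predict y at x0 using the rule from the notes.

    min_abs_r is an assumed cutoff for a "strong" correlation.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)

    strong = abs(r) >= min_abs_r
    in_range = min(xs) <= x0 <= max(xs)
    if strong and in_range:
        # Use the regression line y-hat = slope * x + intercept.
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        return slope * x0 + intercept
    # Otherwise fall back to the mean of y.
    return mean_y

# Made-up illustrative data (an assumption, not from the notes).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

inside = predict_y(3.5, x, y)    # x0 within the data range
outside = predict_y(10, x, y)    # x0 outside the data range
print(inside, outside)
```

For x0 = 10 the function returns the mean of y because extrapolating beyond the observed x-values is unreliable.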
Residuals Analysis
Residuals help assess the fit of a regression model:
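Random residuals: The residuals scatter around zero with no visible pattern, so the linear model is a reasonable fit.
Patterned residuals: The residuals follow a curve or fan out, so the linear model is not a good fit; consider a different model.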
Residual Plot: A graph of residuals versus x-values to check for randomness.
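To make the idea concrete, the residuals can be computed as observed minus predicted values and then listed or plotted against x. A minimal Python sketch with made-up data and a hypothetical helper name `residuals`:

```python
def residuals(xs, ys):
    """Return the residuals y - y-hat for the least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Made-up illustrative data (an assumption, not from the notes).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

res = residuals(x, y)
for xi, ei in zip(x, res):
    print(f"x = {xi}: residual = {ei:+.2f}")
```

Plotting each pair (x, residual) gives the residual plot. For a least squares line the residuals always sum to zero, so the thing to look for is a pattern, not the total.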
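Variation and the Coefficient of Determination ($r^2$)
The coefficient of determination ($r^2$) measures the proportion of variation in y explained by x through the regression model.
$r^2$ close to 1: Most variation is explained by the model.
$r^2$ close to 0: Little variation is explained by the model.
$r^2$ equals the square of the correlation coefficient r for simple linear regression.
Formula: $r^2 = \frac{\text{explained variation}}{\text{total variation}} = \frac{\sum (\hat{y} - \bar{y})^2}{\sum (y - \bar{y})^2}$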
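As a worked illustration, the coefficient of determination can be computed as explained variation divided by total variation. The sketch below assumes made-up data and a hypothetical helper name `coefficient_of_determination`.

```python
def coefficient_of_determination(xs, ys):
    """Explained variation divided by total variation for a linear fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Explained variation: predicted values versus the mean of y.
    explained = sum((slope * x + intercept - mean_y) ** 2 for x in xs)
    # Total variation: observed values versus the mean of y.
    total = sum((y - mean_y) ** 2 for y in ys)
    return explained / total

# Made-up illustrative data (an assumption, not from the notes).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_squared = coefficient_of_determination(x, y)
print(round(r_squared, 4))
```

For this data the result is 0.6, meaning about 60% of the variation in y is explained by the linear relationship with x.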

Inferences for Slope of Regression Line
Hypothesis Test for Slope
To test if there is a significant linear relationship between x and y:
Null Hypothesis ($H_0$): $\beta = 0$, where $\beta$ is the population slope (no linear relationship)
Alternative Hypothesis ($H_a$): $\beta \ne 0$, $\beta > 0$, or $\beta < 0$ (depending on the context)
Use the LinRegTTest function on a calculator to perform the test.
Compare the p-value to the significance level ($\alpha$) to decide whether to reject $H_0$.
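The test statistic behind this procedure is the estimated slope divided by its standard error, with n - 2 degrees of freedom. The sketch below is an illustration only: it uses made-up data, a hypothetical helper name `slope_t_statistic`, and a critical-value comparison in place of a p-value because Python's standard library has no t-distribution. The critical value 3.182 is the two-tailed table value for 3 degrees of freedom at a 0.05 significance level.

```python
import math

def slope_t_statistic(xs, ys):
    """Return (t, degrees_of_freedom) for testing that the slope is zero."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(sse / (n - 2))      # standard error of estimate
    se_slope = s_e / math.sqrt(sxx)     # standard error of the slope
    return slope / se_slope, n - 2

# Made-up illustrative data (an assumption, not from the notes).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

t_stat, df = slope_t_statistic(x, y)

# Two-tailed critical value for df = 3 at a 0.05 level, from a t-table.
T_CRITICAL = 3.182
reject_h0 = abs(t_stat) > T_CRITICAL
print(round(t_stat, 4), df, reject_h0)
```

With only five made-up points the test statistic falls short of the critical value, so this sample alone would not justify rejecting the null hypothesis.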


Confidence Interval for Slope
A confidence interval estimates the range of plausible values for the slope of the regression line.
Use the LinRegTInt function on a calculator.
If the interval does not include 0, there is evidence of a linear relationship.
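A confidence interval for the slope takes the form of the estimated slope plus or minus a critical t-value times the slope's standard error. The Python sketch below shows the arithmetic under stated assumptions: the data are made up, the helper name `slope_confidence_interval` is hypothetical, and the critical value 3.182 (95% confidence, 3 degrees of freedom) is read from a t-table and passed in by hand.

```python
import math

def slope_confidence_interval(xs, ys, t_critical):
    """Return (lower, upper) bounds for the slope of the regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    # Standard error of the slope, using n - 2 degrees of freedom.
    se_slope = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
    margin = t_critical * se_slope
    return slope - margin, slope + margin

# Made-up illustrative data (an assumption, not from the notes).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

lower, upper = slope_confidence_interval(x, y, t_critical=3.182)
contains_zero = lower <= 0 <= upper
print(round(lower, 3), round(upper, 3), contains_zero)
```

Because this interval contains 0, the made-up sample does not provide evidence of a linear relationship at the 95% confidence level.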



Prediction Intervals
A prediction interval gives a range for a single predicted y-value at a specific x, accounting for both the uncertainty in the estimated regression line and the variability of individual data points around it.
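Formula for margin of error (E): $E = t_c s_e \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum (x - \bar{x})^2}}$, where $t_c$ is the critical t-value with $n - 2$ degrees of freedom, $s_e$ is the standard error of estimate, and $x_0$ is the x-value used for the prediction
Prediction interval: $\hat{y} - E < y < \hat{y} + E$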
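As a hedged illustration, the sketch below computes a prediction interval using the standard margin of error for simple linear regression: the critical t-value times the standard error of estimate times the square root of 1 + 1/n + (x0 - mean of x)^2 divided by the sum of squared x-deviations. The data are made up, the helper name `prediction_interval` is hypothetical, and the critical value 3.182 (95% level, 3 degrees of freedom) is supplied from a t-table.

```python
import math

def prediction_interval(x0, xs, ys, t_critical):
    """Return (lower, upper) bounds for a single y-value at x0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(sse / (n - 2))  # standard error of estimate
    # Margin of error grows as x0 moves away from the mean of x.
    margin = t_critical * s_e * math.sqrt(
        1 + 1 / n + (x0 - mean_x) ** 2 / sxx
    )
    y_hat = slope * x0 + intercept
    return y_hat - margin, y_hat + margin

# Made-up illustrative data (an assumption, not from the notes).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

lower, upper = prediction_interval(3, x, y, t_critical=3.182)
edge_lower, edge_upper = prediction_interval(5, x, y, t_critical=3.182)
print(round(lower, 3), round(upper, 3))
```

The interval at x0 = 5 is wider than the one at x0 = 3 because predictions are least uncertain near the mean of x.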

Quadratic Regression
When data shows a curved (nonlinear) pattern, a quadratic regression model may be more appropriate:
Quadratic Regression Equation: $\hat{y} = ax^2 + bx + c$
Use technology (e.g., TI-84's QuadReg function) to fit the model.
Compare the coefficient of determination values ($r^2$ for the linear model, $R^2$ for the quadratic model) to determine the best fit.
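The comparison can be sketched in code by fitting both models to the same data and computing the proportion of variation each one explains. This is a minimal illustration, not the calculator's internal method: the quadratic fit solves the least squares normal equations by Gaussian elimination, all function names are hypothetical, and the made-up data follow y = x^2 + 1 exactly so the result is easy to verify.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with pivoting."""
    n = len(rhs)
    # Augmented copy so the inputs are not modified.
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution

def fit_quadratic(xs, ys):
    """Least squares coefficients (a, b, c) for y-hat = a*x^2 + b*x + c."""
    n = len(xs)
    sx = sum(xs)
    sx2 = sum(x ** 2 for x in xs)
    sx3 = sum(x ** 3 for x in xs)
    sx4 = sum(x ** 4 for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2y = sum(x ** 2 * y for x, y in zip(xs, ys))
    # Normal equations obtained by minimizing the sum of squared residuals.
    matrix = [[sx4, sx3, sx2], [sx3, sx2, sx], [sx2, sx, n]]
    return tuple(solve_linear_system(matrix, [sx2y, sxy, sy]))

def fit_line(xs, ys):
    """Least squares (slope, intercept) for y-hat = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(ys, predictions):
    """Proportion of variation in ys explained by the predictions."""
    mean_y = sum(ys) / len(ys)
    sse = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst

# Made-up data that follow y = x^2 + 1 exactly (an assumption).
x = [0, 1, 2, 3, 4]
y = [1, 2, 5, 10, 17]

slope, intercept = fit_line(x, y)
a, b, c = fit_quadratic(x, y)
linear_r2 = r_squared(y, [slope * xi + intercept for xi in x])
quadratic_r2 = r_squared(y, [a * xi ** 2 + b * xi + c for xi in x])
print(round(linear_r2, 4), round(quadratic_r2, 4))
```

The quadratic model explains essentially all of the variation here, while the linear model leaves a curved pattern in the residuals, which is the signal to prefer the quadratic fit.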

