Chapter 12: Simple Linear Regression – Study Notes
Simple Linear Regression
Introduction to Regression Analysis
Regression analysis is a fundamental statistical technique used in business to model and analyze the relationship between variables. In simple linear regression, the goal is to predict the value of a dependent variable (Y) based on the value of a single independent variable (X).
Regression analysis helps to:
Predict the value of a dependent variable from one or more independent variables.
Explain how changes in an independent variable affect the dependent variable.
Dependent variable (Y): The variable to be predicted or explained.
Independent variable (X): The variable used to predict or explain the dependent variable.
Preliminary Analysis – Scatter Plots
A scatter plot is a graphical tool used to visualize the relationship between two quantitative variables, X and Y. It is often the first step in regression analysis.
Helps identify the type of relationship (linear, curvilinear, or no relationship).
Suggests whether regression analysis is appropriate.
Types of Relationships
Linear Relationship: Data points form a straight line pattern.
Curvilinear Relationship: Data points form a curved pattern.
No Relationship: Data points are scattered with no discernible pattern.
Simple Linear Regression Model
Simple linear regression models the relationship between Y and X using a linear equation:
Only one independent variable (X).
Assumes changes in Y are linearly related to changes in X.
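Population regression equation: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
Where $\beta_0$ is the intercept, $\beta_1$ is the slope, and $\varepsilon_i$ is the random error term.
Estimated regression equation (prediction line): $\hat{Y}_i = b_0 + b_1 X_i$
$b_0$ and $b_1$ are sample estimates of $\beta_0$ and $\beta_1$.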
The Least Squares Method
The least squares method determines the best-fitting regression line by minimizing the sum of squared differences between observed and predicted values.
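Objective: minimize $\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} \left(Y_i - (b_0 + b_1 X_i)\right)^2$
Regression coefficients $b_0$ and $b_1$ are calculated to achieve this minimum.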
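As a brief sketch of why a unique minimum exists, the standard calculus argument is outlined below. The symbol $S$ for the sum of squared errors is introduced here for illustration only and does not appear in the slides.

```latex
\begin{aligned}
S(b_0, b_1) &= \sum_{i=1}^{n} \left(Y_i - b_0 - b_1 X_i\right)^2 \\
\frac{\partial S}{\partial b_0} &= -2 \sum_{i=1}^{n} \left(Y_i - b_0 - b_1 X_i\right) = 0 \\
\frac{\partial S}{\partial b_1} &= -2 \sum_{i=1}^{n} X_i \left(Y_i - b_0 - b_1 X_i\right) = 0 \\
\Rightarrow\quad b_1 &= \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad b_0 = \bar{Y} - b_1 \bar{X}
\end{aligned}
```

Setting both partial derivatives to zero gives two linear equations in $b_0$ and $b_1$, and solving them yields the closed-form slope and intercept in the last line.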
Interpretation of the Slope and Intercept
Intercept ($b_0$): Estimated mean value of Y when X = 0.
Slope ($b_1$): Estimated change in mean value of Y for a one-unit increase in X.
Worked Example: Home Prices and Square Footage
A real estate agent examines the relationship between house price (Y, in $1,000s) and house size (X, in square feet) using a sample of 10 houses.
House Price in $1000s (Y) | Square Feet (X) |
|---|---|
245 | 1400 |
312 | 1600 |
279 | 1700 |
308 | 1875 |
199 | 1100 |
219 | 1550 |
405 | 2350 |
324 | 2450 |
319 | 1425 |
255 | 1700 |
A scatter plot and a regression analysis are then used to estimate the relationship.
Regression Output and Interpretation
Regression equation (Excel output): house price = 98.24833 + 0.10977 × (square feet)
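Interpretation of $b_0$ = 98.24833: The estimated mean price (in $1,000s) when square feet = 0. This is not meaningful here, because no house has zero square feet and X = 0 lies far outside the observed data.
Interpretation of $b_1$ = 0.10977: For each additional square foot, the estimated mean house price increases by 0.10977 thousand dollars, or about $109.77.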
Making Predictions
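To predict the price of a house with 2,000 square feet, substitute X = 2000 into the prediction line (coefficients rounded to 98.25 and 0.1098):
$\hat{Y} = 98.25 + 0.1098(2000) = 317.85$ (in $1,000s), i.e. a predicted price of $317,850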
Only predict within the range of observed X values (do not extrapolate).
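The prediction step and the no-extrapolation rule can be sketched in a few lines of Python. This is a minimal illustration, not part of the slides: the function name `predict_price_thousands` and the constants `B0`, `B1`, `X_MIN`, `X_MAX` are hypothetical names chosen here. The coefficients come from the Excel output above, and the range limits are the smallest and largest house sizes in the sample.

```python
# Coefficients from the Excel output quoted in these notes (price in $1,000s).
B0 = 98.24833
B1 = 0.10977

# Smallest and largest house sizes (square feet) in the 10-house sample.
X_MIN, X_MAX = 1100, 2450


def predict_price_thousands(square_feet):
    """Return the predicted price in $1,000s, refusing to extrapolate."""
    if not X_MIN <= square_feet <= X_MAX:
        raise ValueError(
            f"{square_feet} sq ft is outside the observed range "
            f"[{X_MIN}, {X_MAX}]; the fitted line should not be used here."
        )
    return B0 + B1 * square_feet


predicted = predict_price_thousands(2000)
print(f"Predicted price: ${predicted * 1000:,.0f}")
```

With the five-decimal coefficients the prediction is about 317.79 (in $1,000s), slightly below the 317.85 quoted in the notes, which corresponds to coefficients rounded to 98.25 and 0.1098.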
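Computing $b_0$ and $b_1$
For small datasets, $b_0$ and $b_1$ can be calculated by hand: $b_1 = \dfrac{SSXY}{SSX}$ and $b_0 = \bar{Y} - b_1 \bar{X}$
Where:
$SSXY = \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$, $SSX = \sum_{i=1}^{n} (X_i - \bar{X})^2$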
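The hand calculation can be checked with a short Python sketch on the 10-house sample. The slope is the sum of cross-products of the X and Y deviations from their means divided by the sum of squared X deviations, and the intercept is the mean of Y minus the slope times the mean of X. The function name `fit_simple_regression` and the variable names are hypothetical, chosen only for this illustration.

```python
# Sample of 10 houses from the worked example (price in $1,000s).
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price_thousands = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]


def fit_simple_regression(x, y):
    """Return (b0, b1) for the least-squares line through (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products of deviations from the means.
    ssxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Sum of squared deviations of X from its mean.
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ssxy / ssx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_simple_regression(square_feet, price_thousands)
print(f"b0 = {b0:.5f}, b1 = {b1:.5f}")
```

The result agrees with the Excel coefficients reported earlier (98.24833 and 0.10977) to five decimal places.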
Measures of Variation
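Total variation in Y is partitioned into explained and unexplained components: $SST = SSR + SSE$
SST (Total Sum of Squares): $SST = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$
SSR (Regression Sum of Squares): $SSR = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$
SSE (Error Sum of Squares): $SSE = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$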
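To make the partition concrete, the sketch below computes the three sums of squares for the house-price sample and checks that the explained and unexplained parts add up to the total. It refits the least-squares line inside the block so that it is self-contained. All variable names are illustrative assumptions.

```python
# Sample of 10 houses from the worked example (price in $1,000s).
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price_thousands = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(price_thousands) / n

# Least-squares slope and intercept.
b1 = (
    sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, price_thousands))
    / sum((x - x_bar) ** 2 for x in square_feet)
)
b0 = y_bar - b1 * x_bar

# Predicted price for each house in the sample.
fitted = [b0 + b1 * x for x in square_feet]

sst = sum((y - y_bar) ** 2 for y in price_thousands)          # total variation
ssr = sum((y_hat - y_bar) ** 2 for y_hat in fitted)           # explained by X
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(price_thousands, fitted))  # unexplained

print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
print(f"SSR + SSE = {ssr + sse:.2f}")
```

The identity SST = SSR + SSE holds exactly only when the fitted values come from the least-squares line. With any other line the two parts generally do not sum to the total.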
Coefficient of Determination ($r^2$)
The coefficient of determination measures the proportion of total variation in Y explained by X: $r^2 = \dfrac{SSR}{SST}$
Ranges from 0 to 1.
$r^2 = 1$ indicates a perfect linear relationship.
$0 < r^2 < 1$ indicates a weaker relationship.
$r^2 = 0$ indicates no linear relationship.
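As an illustration, the coefficient of determination for the house-price example can be computed as the regression sum of squares divided by the total sum of squares. The sketch below uses the rounded Excel coefficients from these notes to generate predictions, so the result is approximate. The function name `coefficient_of_determination` is hypothetical.

```python
# Sample of 10 houses from the worked example (price in $1,000s).
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price_thousands = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]


def coefficient_of_determination(observed, predicted):
    """Return SSR / SST for observed values and their regression predictions."""
    y_bar = sum(observed) / len(observed)
    ssr = sum((y_hat - y_bar) ** 2 for y_hat in predicted)
    sst = sum((y - y_bar) ** 2 for y in observed)
    return ssr / sst


# Predictions from the Excel equation quoted in these notes (rounded coefficients).
predicted = [98.24833 + 0.10977 * x for x in square_feet]

r2 = coefficient_of_determination(price_thousands, predicted)
print(f"r^2 = {r2:.4f}")
```

The value comes out at roughly 0.58, meaning that a little under 60% of the variation in house prices in this sample is explained by variation in square footage.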
Summary Table: Key Regression Quantities
Symbol | Name | Formula | Interpretation |
|---|---|---|---|
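$b_0$ | Intercept | $b_0 = \bar{Y} - b_1 \bar{X}$ | Estimated mean Y when X = 0 |
$b_1$ | Slope | $b_1 = SSXY / SSX$ | Change in mean Y per unit X |
$SST$ | Total Sum of Squares | $\sum (Y_i - \bar{Y})^2$ | Total variation in Y |
$SSR$ | Regression Sum of Squares | $\sum (\hat{Y}_i - \bar{Y})^2$ | Variation explained by X |
$SSE$ | Error Sum of Squares | $\sum (Y_i - \hat{Y}_i)^2$ | Unexplained variation |
$r^2$ | Coefficient of Determination | $SSR / SST$ | Proportion of explained variation |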
Additional info:
These notes are based on textbook slides and cover the essential concepts, formulas, and interpretation for simple linear regression in business statistics.
Further topics such as hypothesis testing for regression coefficients, confidence intervals, and residual analysis are typically included in a full chapter but are not shown in these slides.