Comprehensive Review of Statistics and Regression Methods

Study Guide - Smart Notes

Review on Statistics

Summation Operator and Properties

The summation operator is a fundamental notation in statistics, used to denote the sum of a sequence of numbers. Several properties simplify calculations involving sums.

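  • Summation Operator: $\sum_{i=1}^{n} x_i = x_1 + x_2 + \dots + x_n$

  • Property 1 (Constant): $\sum_{i=1}^{n} c = nc$

  • Property 2 (Constant Multiple): $\sum_{i=1}^{n} c\,x_i = c \sum_{i=1}^{n} x_i$

  • Property 3 (Linearity): $\sum_{i=1}^{n} (a\,x_i + b\,y_i) = a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} y_i$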
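
A quick numerical check can make the three properties concrete. The sketch below assumes their standard forms (a constant c summed n times gives nc, a constant multiple factors out of the sum, and the sum of a linear combination is the linear combination of the sums). The sequences and the constants a, b, c are made up for illustration.

```python
# Check the three summation properties on small made-up sequences.
x = [2, 5, 7, 10]
y = [1, 3, 4, 8]
n = len(x)
a, b, c = 3, -2, 4  # arbitrary constants

# Property 1: summing the constant c over n terms gives n * c.
prop1 = sum(c for _ in range(n)) == n * c

# Property 2: a constant multiple factors out of the sum.
prop2 = sum(c * xi for xi in x) == c * sum(x)

# Property 3: the sum of a linear combination is the linear combination of sums.
prop3 = sum(a * xi + b * yi for xi, yi in zip(x, y)) == a * sum(x) + b * sum(y)

print(prop1, prop2, prop3)  # True True True
```

Integers are used so that the equalities hold exactly rather than up to rounding error.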

Sample Statistics

Sample statistics are used to summarize and describe data from a sample.

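  • Sample Average (Mean): $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

  • Sum of Deviations: $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$

  • Sum of Squared Deviations: $\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2$

  • Sum of Multiplied Deviations: $\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}$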
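
The deviation identities can be checked on a small example. The sketch below assumes the usual forms: deviations from the mean sum to zero, the sum of squared deviations equals the sum of squares minus n times the squared mean, and the sum of multiplied deviations equals the sum of products minus n times the product of the means. The data are made up, and exact fractions are used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction

# Made-up paired data, stored as exact fractions.
x = [Fraction(v) for v in (2, 4, 4, 6, 9)]
y = [Fraction(v) for v in (1, 3, 2, 7, 7)]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Deviations from the sample mean always sum to zero.
sum_dev = sum(xi - x_bar for xi in x)

# Sum of squared deviations, computed directly and via the shortcut.
ssd = sum((xi - x_bar) ** 2 for xi in x)
ssd_shortcut = sum(xi ** 2 for xi in x) - n * x_bar ** 2

# Sum of multiplied deviations, computed directly and via the shortcut.
smd = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
smd_shortcut = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar

print(sum_dev, ssd, smd)  # 0 28 27
```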

Sample Variance and Covariance

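  • Sample Variance: $s_x^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$

  • Sample Covariance: $s_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$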
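
As a minimal sketch, the sample variance and covariance can be computed from one helper, assuming the common n - 1 divisor. The function name `sample_cov` and the data are illustrative, not taken from the notes. The result is cross-checked against `statistics.variance` from the Python standard library, which also divides by n - 1.

```python
import statistics

x = [2.0, 4.0, 4.0, 6.0, 9.0]
y = [1.0, 3.0, 2.0, 7.0, 7.0]

def sample_cov(u, v):
    """Sample covariance of u and v, using the n - 1 divisor."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    return sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v)) / (n - 1)

s_xx = sample_cov(x, x)  # the covariance of x with itself is its sample variance
s_xy = sample_cov(x, y)

print(s_xx, s_xy)              # 7.0 6.75
print(statistics.variance(x))  # library value, should agree with s_xx
```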

Statistics and Econometrics

  • Statistics: The study of methods to draw useful information from data, including descriptive statistics (summarizing data) and inferential statistics (estimators, tests, confidence intervals).

  • Econometrics: Application of statistical methods to economic data to extract meaningful information. Microeconometrics focuses on individual or firm-level data, while macroeconometrics deals with aggregate data.

Random Experiments and Random Variables

  • Random Experiment: An experiment whose outcome cannot be predicted with certainty, but all possible outcomes can be described. The experiment can be repeated under the same conditions (e.g., tossing a coin).

  • Random Variable (RV): A variable whose value is determined by a random experiment. It is a function mapping outcomes to real numbers.

  • Discrete RV: Takes finitely many or countably infinitely many values.

  • Continuous RV: Takes values in an uncountably infinite set, such as an interval of real numbers.

Probability Distributions

  • Probability Distribution: Describes how probability is spread over the possible values of a random variable.

  • For discrete RVs: A list of the probabilities of each possible value (the probability mass function, pmf).

  • For continuous RVs: A probability density function (PDF); probabilities are areas under the PDF, so the probability at any single point is zero.

| Property | Discrete RV | Continuous RV |
| --- | --- | --- |
| Range | Countable | Uncountable |
| Description | pmf | pdf |
| Probability at a point | Has mass | No mass |

Expectation

Population Mean and Variance

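  • Population Mean: $\mu_X = E(X) = \sum_{x} x\,P(X = x)$ for a discrete RV, and $E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$ for a continuous RV with pdf $f$.

  • Laws of Expectation: for constants $a$ and $b$, $E(a) = a$, $E(aX + b) = a\,E(X) + b$, and $E(aX + bY) = a\,E(X) + b\,E(Y)$.

  • Population Variance: $\sigma_X^2 = \mathrm{Var}(X) = E[(X - \mu_X)^2]$

  • Shortcut: $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$

  • Laws of Variance: for constants $a$ and $b$, $\mathrm{Var}(a) = 0$ and $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$.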
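
A small discrete example may help here. The sketch below uses a fair six-sided die (an illustrative choice, not taken from the notes) to compute the population mean and variance from the pmf. It then checks the shortcut Var(X) = E(X^2) - [E(X)]^2 and the standard linear-transformation laws E(aX + b) = aE(X) + b and Var(aX + b) = a^2 Var(X). The helper `expect` and the constants a, b are hypothetical.

```python
from fractions import Fraction

# pmf of a fair six-sided die: each face has probability 1/6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expect(g, pmf):
    """E[g(X)] for a discrete random variable with the given pmf."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expect(lambda x: x, pmf)
var = expect(lambda x: (x - mean) ** 2, pmf)               # definition
var_shortcut = expect(lambda x: x ** 2, pmf) - mean ** 2   # shortcut formula

# Linear transformation aX + b with arbitrary constants.
a, b = 3, 5
mean_lin = expect(lambda x: a * x + b, pmf)
var_lin = expect(lambda x: (a * x + b - mean_lin) ** 2, pmf)

print(mean, var, var_shortcut)  # 7/2 35/12 35/12
print(mean_lin == a * mean + b, var_lin == a ** 2 * var)  # True True
```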

Covariance and Correlation

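  • Population Covariance: $\sigma_{XY} = \mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$

  • Shortcut: $\mathrm{Cov}(X, Y) = E(XY) - E(X)\,E(Y)$

  • Laws of Covariance:

    • If $X$ and $Y$ are independent, $\mathrm{Cov}(X, Y) = 0$ (the converse is not always true).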

    • If $E(X) = 0$ or $E(Y) = 0$, then $\mathrm{Cov}(X, Y) = E(XY)$.

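  • Population Coefficient of Correlation: $\rho_{XY} = \mathrm{Corr}(X, Y) = \dfrac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$, with $-1 \le \rho_{XY} \le 1$.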
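
The remark that zero covariance does not imply independence is easiest to see in a counterexample. In the sketch below, X is uniform on {-1, 0, 1} and Y = X^2 (a standard textbook construction, not taken from the notes). Y is fully determined by X, yet the covariance computed from the shortcut E(XY) - E(X)E(Y) is zero. The helper `expect` is hypothetical.

```python
from fractions import Fraction

# Joint pmf of (X, Y) with X uniform on {-1, 0, 1} and Y = X**2.
joint = {
    (-1, 1): Fraction(1, 3),
    (0, 0): Fraction(1, 3),
    (1, 1): Fraction(1, 3),
}

def expect(g, joint):
    """E[g(X, Y)] under a discrete joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

e_x = expect(lambda x, y: x, joint)
e_y = expect(lambda x, y: y, joint)
cov_xy = expect(lambda x, y: x * y, joint) - e_x * e_y  # shortcut formula

# Independence would require P(X=0, Y=0) = P(X=0) * P(Y=0).
p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)
factorizes = joint[(0, 0)] == p_x0 * p_y0

print(cov_xy, factorizes)  # 0 False
```

The joint probability at (0, 0) is 1/3 while the product of the marginals is 1/9, so X and Y are dependent even though they are uncorrelated.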

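Laws Regarding the Sum of Two Random Variables

  • $E(X + Y) = E(X) + E(Y)$

  • $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$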
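
The variance of a sum follows directly from the definitions of variance and covariance. The short derivation below is a sketch using the notation $\mu_X = E(X)$ and $\mu_Y = E(Y)$, together with the fact that $E(X + Y) = \mu_X + \mu_Y$.

```latex
\begin{aligned}
\operatorname{Var}(X + Y)
  &= E\big[(X + Y - \mu_X - \mu_Y)^2\big] \\
  &= E\big[(X - \mu_X)^2\big] + E\big[(Y - \mu_Y)^2\big]
     + 2\,E\big[(X - \mu_X)(Y - \mu_Y)\big] \\
  &= \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\,\operatorname{Cov}(X, Y).
\end{aligned}
```

When $X$ and $Y$ are uncorrelated the last term vanishes, and the variance of the sum is the sum of the variances.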

Conditional Expectation

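The conditional expectation of $Y$ given $X = x$ is the expected value of $Y$ in the subpopulation where $X = x$.

  • Notation: $E(Y \mid X = x)$

  • Conditional Probability: $P(Y = y \mid X = x) = \dfrac{P(X = x,\, Y = y)}{P(X = x)}$, so that for discrete $Y$, $E(Y \mid X = x) = \sum_{y} y\,P(Y = y \mid X = x)$.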

Laws of Conditional Expectation

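  • For any function $g$: $E[g(X)\,Y \mid X] = g(X)\,E(Y \mid X)$

  • If $X$ and $Y$ are independent: $E(Y \mid X) = E(Y)$

  • Law of Iterated Expectation: $E[E(Y \mid X)] = E(Y)$

  • More generally: $E[E(Y \mid X, Z) \mid X] = E(Y \mid X)$

  • If $E(Y \mid X) = E(Y)$, then $\mathrm{Cov}(X, Y) = 0$; any function of $X$ is uncorrelated with $Y$.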
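
The law of iterated expectation, E[E(Y | X)] = E(Y), can be verified on a small joint distribution. The sketch below uses a made-up joint pmf for two discrete variables. The helper names `marginal_x` and `cond_mean_y` are hypothetical.

```python
from fractions import Fraction

# Hypothetical joint pmf P(X = x, Y = y); the probabilities sum to 1.
joint = {
    (0, 1): Fraction(1, 10), (0, 2): Fraction(3, 10),
    (1, 1): Fraction(2, 5),  (1, 2): Fraction(1, 5),
}

def marginal_x(joint):
    """Marginal pmf of X, obtained by summing the joint pmf over y."""
    px = {}
    for (x, _), p in joint.items():
        px[x] = px.get(x, Fraction(0)) + p
    return px

def cond_mean_y(joint, x):
    """E(Y | X = x) = sum over y of y * P(Y = y | X = x)."""
    px = marginal_x(joint)[x]
    return sum(y * p / px for (xi, y), p in joint.items() if xi == x)

px = marginal_x(joint)
e_y = sum(y * p for (_, y), p in joint.items())                    # E(Y) directly
e_of_cond = sum(cond_mean_y(joint, x) * p for x, p in px.items())  # E[E(Y | X)]

print(cond_mean_y(joint, 0), cond_mean_y(joint, 1))  # 7/4 4/3
print(e_y, e_of_cond)                                # 3/2 3/2
```

The two conditional means differ, so Y is not mean-independent of X in this example, yet averaging them with the marginal probabilities of X recovers E(Y).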

Review on Linear Regression

Introduction to Regression

Regression analysis studies the conditional mean function of a response variable given explanatory variables. In economics it is widely used to investigate causal relationships, focusing on how the mean of the dependent variable responds to the regressors.

  • Regression: Analysis of $E(Y \mid X)$, the expected value of $Y$ given $X$.

  • Application: Investigate how changes in $X$ affect $Y$, holding other variables constant.

Linear Multiple Regression Model

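  • Model: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i$ for $i = 1, \dots, n$

  • Assumptions:

    • Random sampling: $(X_{1i}, \dots, X_{ki}, Y_i)$, $i = 1, \dots, n$, are i.i.d.

    • Conditional mean: $E(u_i \mid X_{1i}, \dots, X_{ki}) = 0$

    • Nonzero finite fourth moments (no large outliers)

    • No perfect multicollinearity (no exact linear relationship among regressors)

  • Partial Effect: $\beta_j$ measures the effect of $X_j$ on $Y$ (the change in the expected value of $Y$ per one-unit change in $X_j$), holding other variables constant.

  • Homoskedasticity: If $\mathrm{Var}(u_i \mid X_{1i}, \dots, X_{ki})$ does not depend on the regressors, errors are homoskedastic; otherwise, heteroskedastic.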

Ordinary Least Squares (OLS) Estimator

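  • OLS Estimator: $(\hat{\beta}_0, \hat{\beta}_1, \dots, \hat{\beta}_k) = \arg\min_{b_0, \dots, b_k} \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_{1i} - \dots - b_k X_{ki})^2$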

  • The OLS estimator finds the linear combination of regressors that minimizes the sum of squared residuals.
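
For intuition, the minimization has a simple closed form when there is a single regressor: the slope is the sum of multiplied deviations divided by the sum of squared deviations of the regressor, and the intercept makes the line pass through the sample means. The sketch below covers only this one-regressor special case (the general case with k regressors needs linear algebra). The function name `ols_simple` and the data are made up.

```python
def ols_simple(x, y):
    """OLS intercept and slope for y = b0 + b1 * x + u with one regressor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept: the line passes through the means
    return b0, b1

def ssr(x, y, b0, b1):
    """Sum of squared residuals for the candidate line b0 + b1 * x."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b0, b1 = ols_simple(x, y)

print(round(b0, 4), round(b1, 4))         # 2.2 0.6
print(round(ssr(x, y, b0, b1), 4))        # 2.4
print(round(ssr(x, y, b0, b1 + 0.1), 4))  # any other line gives a larger SSR
```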

Measures of Fit

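  • Standard Error of the Regression (SER): Measures the spread of $Y_i$ around the regression line. $SER = \sqrt{\dfrac{SSR}{n - k - 1}}$, where $SSR = \sum_{i=1}^{n} \hat{u}_i^2$ is the sum of squared residuals.

  • R-squared ($R^2$): Fraction of variation in $Y$ explained by regressors. $R^2 = \dfrac{ESS}{TSS} = 1 - \dfrac{SSR}{TSS}$, where $TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$.

  • Adjusted $R^2$: Adjusts for the number of regressors. $\bar{R}^2 = 1 - \dfrac{n - 1}{n - k - 1}\,\dfrac{SSR}{TSS}$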

    • Can be negative; penalizes adding unnecessary regressors.
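
The three measures can be computed from the observed and fitted values alone. The sketch below assumes the usual definitions with k regressors plus an intercept: SER = sqrt(SSR / (n - k - 1)), R-squared = 1 - SSR / TSS, and adjusted R-squared = 1 - [(n - 1) / (n - k - 1)] SSR / TSS. The function name `fit_measures` and the numbers are made up. The fitted values correspond to the line 2.2 + 0.6 x evaluated at x = 1, ..., 5.

```python
from math import sqrt

def fit_measures(y, y_hat, k):
    """Return (SER, R^2, adjusted R^2) for a fit with k regressors plus an intercept."""
    n = len(y)
    y_bar = sum(y) / n
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # sum of squared residuals
    tss = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
    ser = sqrt(ssr / (n - k - 1))
    r2 = 1 - ssr / tss
    adj_r2 = 1 - (n - 1) / (n - k - 1) * ssr / tss
    return ser, r2, adj_r2

y = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]  # fitted values from the line 2.2 + 0.6 * x

ser, r2, adj_r2 = fit_measures(y, y_hat, k=1)
print(round(ser, 4), round(r2, 4), round(adj_r2, 4))  # 0.8944 0.6 0.4667
```

The adjusted value is below the unadjusted one because the factor (n - 1) / (n - k - 1) exceeds one whenever k is at least one.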

Large Sample Distribution of OLS Estimator

  • Under standard assumptions, as $n$ increases, the OLS estimators are approximately jointly normally distributed.

  • Each $\hat{\beta}_j$ is approximately $N(\beta_j, \sigma^2_{\hat{\beta}_j})$.

Hypothesis Testing in Regression

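  • Testing a Single Coefficient: $H_0: \beta_j = \beta_{j,0}$ vs. $H_1: \beta_j \neq \beta_{j,0}$

  • t-statistic: $t = \dfrac{\hat{\beta}_j - \beta_{j,0}}{SE(\hat{\beta}_j)}$

  • Standard Error: $SE(\hat{\beta}_j) = \sqrt{\hat{\sigma}^2_{\hat{\beta}_j}}$, the estimated standard deviation of the sampling distribution of $\hat{\beta}_j$

  • p-value: $2\,\Phi(-|t^{act}|)$, where $\Phi$ is the standard normal CDF and $t^{act}$ is the value of the t-statistic actually computed

  • Reject $H_0$ if $|t^{act}| > 1.96$ or, equivalently, if the p-value is below $0.05$ (for 5% significance level)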
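
The testing steps can be sketched in a few lines, assuming the large-sample normal approximation: the t-statistic is (estimate - null value) / standard error, and the two-sided p-value is 2 Φ(-|t|). The estimate 0.6 and standard error 0.25 are made-up numbers, and the function names are hypothetical. The normal CDF is built from `math.erf` in the standard library.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF, Phi(z), expressed through the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def t_test(beta_hat, se, beta_null=0.0):
    """Large-sample two-sided test of H0: beta = beta_null."""
    t = (beta_hat - beta_null) / se
    p_value = 2.0 * normal_cdf(-abs(t))
    return t, p_value

# Made-up coefficient estimate and standard error.
t, p = t_test(beta_hat=0.6, se=0.25)
reject = abs(t) > 1.96  # 5% significance level

print(round(t, 3), round(p, 4), reject)
```

With these numbers the t-statistic is 2.4 and the p-value is roughly 0.016, so the null hypothesis is rejected at the 5% level under either rule.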

Confidence Intervals and Joint Hypotheses

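  • 95% Confidence Interval for $\beta_j$: $\hat{\beta}_j \pm 1.96\,SE(\hat{\beta}_j)$

  • Joint Hypothesis: To test restrictions on several coefficients at once (e.g., $H_0: \beta_1 = 0$ and $\beta_2 = 0$), use a heteroskedasticity-robust F-test rather than a sequence of individual t-tests.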
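
A 95% confidence interval for a single coefficient is short enough to compute by hand. The sketch below assumes the large-sample form, estimate ± 1.96 × standard error. The function name `conf_int_95` and the inputs are made up.

```python
def conf_int_95(beta_hat, se):
    """Large-sample 95% confidence interval: beta_hat +/- 1.96 * se."""
    half_width = 1.96 * se
    return beta_hat - half_width, beta_hat + half_width

# Made-up coefficient estimate and standard error.
lower, upper = conf_int_95(beta_hat=0.6, se=0.25)
contains_zero = lower <= 0.0 <= upper

print(round(lower, 2), round(upper, 2), contains_zero)  # 0.11 1.09 False
```

The interval excludes zero exactly when the two-sided test of a zero coefficient rejects at the 5% level, since both rules compare the ratio of estimate to standard error with 1.96.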

Consistency and Asymptotic Normality

  • Consistency: An estimator is consistent if it converges in probability to the true parameter value as the sample size increases.

  • Asymptotic Normality: As sample size grows, the distribution of the estimator approaches a normal distribution, allowing for inference using normal-based confidence intervals and hypothesis tests.
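
Consistency can be illustrated with a small Monte Carlo experiment. The sketch below simulates data from a one-regressor model with hypothetical true coefficients (intercept 1, slope 2) and standard normal regressor and error, then compares the spread of the OLS slope across repeated samples of size 25 and 400. The model, sample sizes, and number of replications are all illustrative assumptions.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

BETA0, BETA1 = 1.0, 2.0  # hypothetical true coefficients

def ols_slope(n):
    """Draw one sample of size n from Y = BETA0 + BETA1 * X + u and return the OLS slope."""
    x = [random.gauss(0.0, 1.0) for _ in range(n)]
    y = [BETA0 + BETA1 * xi + random.gauss(0.0, 1.0) for xi in x]
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

def simulate(n, reps=200):
    """Mean and standard deviation of the slope estimate across repeated samples."""
    draws = [ols_slope(n) for _ in range(reps)]
    return statistics.mean(draws), statistics.pstdev(draws)

mean_small, sd_small = simulate(25)
mean_large, sd_large = simulate(400)

print(f"n=25:  mean={mean_small:.3f}, sd={sd_small:.3f}")
print(f"n=400: mean={mean_large:.3f}, sd={sd_large:.3f}")
```

In both cases the estimates center near the true slope, and the spread shrinks as the sample size grows, which is what consistency predicts. A histogram of the draws at the larger sample size would look approximately bell-shaped.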
