Comprehensive Study Guide for Introductory Statistics Final Exam

Introduction to Data

Populations, Samples, Parameters, and Statistics

Understanding the basic elements of a statistical study is essential for interpreting results and designing experiments.

  • Population: The entire group of individuals or objects of interest.

  • Sample: A subset of the population selected for study.

  • Parameter: A numerical summary describing a characteristic of the population.

  • Statistic: A numerical summary describing a characteristic of the sample.

  • Observational Units: The individual entities on which measurements are taken.

  • Bias: Systematic error introduced by the sampling method, leading to non-representative results.

Picturing Variation with Graphs

Observational Studies vs. Experimental Design

Statistical studies can be classified as observational or experimental, each with distinct features and purposes.

  • Observational Study: Researchers observe subjects without intervention.

  • Experimental Study: Researchers manipulate variables to observe effects.

  • Key Components of Experimental Design:

    • Treatment: The intervention or condition applied.

    • Factor of Interest (Explanatory Variable): The variable manipulated by the researcher.

    • Response Variable (Dependent Variable): The outcome measured.

    • Nuisance Factors: Variables that may affect the response but are not of primary interest.

    • Random Assignment: Allocating subjects to treatments randomly to reduce bias.

    • Replication: Applying each treatment to multiple subjects (or repeating the experiment) so that results are reliable.

Numerical Summaries of Center and Variation

Describing Distributions

Statistical distributions are characterized by their shape, center, spread, and presence of outliers.

  • Shape: Symmetric, skewed, unimodal, bimodal, etc.

  • Center: Mean, median, mode.

  • Spread: Range, interquartile range (IQR), standard deviation.

  • Outliers: Extreme values that differ significantly from other observations.

Measures of Center

  • Mean: Arithmetic average.

  • Median: Middle value when data are ordered.

  • Mode: Most frequently occurring value.

Connection Between Distribution and Mean/Median

  • In symmetric distributions, mean and median are similar.

  • In skewed distributions, mean is pulled toward the tail.
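A small sketch of how skew separates the mean from the median. The `incomes` values are hypothetical and chosen only to produce a long right tail.

```python
from statistics import mean, median

# Hypothetical right-skewed data: most values are small, one is very large.
incomes = [30, 32, 35, 38, 40, 42, 250]

skewed_mean = mean(incomes)      # pulled upward by the value 250
skewed_median = median(incomes)  # the middle value, resistant to the tail

print("mean:", round(skewed_mean, 2))
print("median:", skewed_median)
```

Here the mean lands well above the median because a single large value drags it toward the right tail.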

Picturing Variation with Boxplots

Boxplots and Five-Number Summary

Boxplots visually summarize data using five key statistics.

  • Five-Number Summary: Minimum, Q1, Median, Q3, Maximum.

  • Interquartile Range (IQR): $IQR = Q_3 - Q_1$

  • Outliers: Values below $Q_1 - 1.5 \times IQR$ or above $Q_3 + 1.5 \times IQR$

  • Standard Deviation: Measures the typical distance of observations from the mean.
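The five-number summary and the 1.5 × IQR outlier rule can be sketched as follows. The data are hypothetical, and the quartile method is an assumption: textbooks and software use slightly different conventions, so Q1 and Q3 may differ a little from hand calculations.

```python
from statistics import quantiles

# Hypothetical data set with one unusually large value.
data = sorted([2, 4, 4, 5, 6, 7, 8, 9, 30])

# Quartiles using the "inclusive" method (one of several common conventions).
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

five_number = (min(data), q1, q2, q3, max(data))
iqr = q3 - q1

# Fences for the 1.5 * IQR rule.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("five-number summary:", five_number)
print("IQR:", iqr)
print("outliers:", outliers)
```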

[Image: table of statistical formulas, including z-score, t-score, standard deviation, standard error, and formulas for means and proportions]

Regression Analysis: Exploring Associations between Variables

Bivariate Data and Correlation

Regression analysis explores relationships between two quantitative variables.

  • Trend: Direction and strength of association.

  • Correlation Coefficient (r): Measures linear association, ranges from -1 to 1.

  • Linear Regression Model: $\hat{y} = a + bx$, where $b$ is the slope and $a$ is the y-intercept.

  • Prediction: Using the line of best fit to estimate values.
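As an illustration, the correlation coefficient and the least-squares line can be computed directly from sums of squared deviations. The data and the helper name `fit_line` are hypothetical; the symbols `a` (intercept) and `b` (slope) follow the form $\hat{y} = a + bx$.

```python
from math import sqrt
from statistics import mean

def fit_line(xs, ys):
    """Return (intercept a, slope b, correlation r) for paired data."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    b = sxy / sxx              # slope of the least-squares line
    a = y_bar - b * x_bar      # intercept: the line passes through (x_bar, y_bar)
    r = sxy / sqrt(sxx * syy)  # correlation coefficient
    return a, b, r

# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b, r = fit_line(xs, ys)
predicted = a + b * 6  # prediction at x = 6 (an extrapolation beyond the data)
print(round(a, 3), round(b, 3), round(r, 3), round(predicted, 3))
```

Predictions are most trustworthy for x-values inside the range of the observed data.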

Modeling Variation with Probability

Probability and Events

Probability quantifies the likelihood of events.

  • AND, OR, NOT:

    • AND: Intersection of events.

    • OR: Union of events.

    • NOT: Complement of an event.

  • Conditional Probability: Probability of one event given another.

  • Mutually Exclusive: Events that cannot occur together.

  • Independence: Occurrence of one event does not affect the other.
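The rules above can be checked numerically on a small table of counts. The survey counts below are made up for illustration; `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction

# Hypothetical survey of 100 people (counts are invented for illustration).
both = 10       # own a dog AND a cat
dog_only = 30
cat_only = 20
neither = 40
total = both + dog_only + cat_only + neither

p_dog = Fraction(both + dog_only, total)
p_cat = Fraction(both + cat_only, total)

p_dog_and_cat = Fraction(both, total)          # AND: intersection
p_dog_or_cat = p_dog + p_cat - p_dog_and_cat   # OR: union
p_not_dog = 1 - p_dog                          # NOT: complement
p_dog_given_cat = p_dog_and_cat / p_cat        # conditional probability

# Mutually exclusive events never occur together.
mutually_exclusive = p_dog_and_cat == 0
# Independent events satisfy P(A and B) = P(A) * P(B).
independent = p_dog_and_cat == p_dog * p_cat

print(p_dog_or_cat, p_not_dog, p_dog_given_cat, mutually_exclusive, independent)
```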

Modeling Random Events: The Normal and Binomial Models

Normal Distribution

The normal distribution is a symmetric, bell-shaped curve used to model many natural phenomena.

  • Standard Normal Curve: Mean = 0, SD = 1.

  • Z-score: $z = \dfrac{x - \mu}{\sigma}$

  • Empirical Rule: About 68%, 95%, and 99.7% of the data fall within 1, 2, and 3 SDs of the mean, respectively.

  • Percentiles: Indicate relative standing in the distribution.
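A short sketch of z-scores, percentiles, and the Empirical Rule using Python's standard library. The mean of 100 and SD of 15 are hypothetical values chosen for the example.

```python
from statistics import NormalDist

# Hypothetical Normal model: mean 100, standard deviation 15.
mu, sigma = 100, 15
x = 130

z = (x - mu) / sigma  # number of SDs that x lies above the mean

standard_normal = NormalDist()         # mean 0, SD 1
percentile = standard_normal.cdf(z)    # proportion of values below x

# Empirical Rule check: area within 1, 2, and 3 SDs of the mean.
within = [standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)]

print("z =", z)
print("percentile =", round(percentile, 4))
print("within 1, 2, 3 SDs:", [round(w, 4) for w in within])
```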

Binomial Distribution

  • Conditions: Fixed number of trials, two outcomes, constant probability, independent trials.

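  • Probability Model: $P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$, where $n$ is the number of trials and $p$ is the probability of success on each trial.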
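The binomial probability of exactly $k$ successes can be computed as below. The function name `binomial_pmf` and the coin-flip numbers are illustrative assumptions.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical example: 10 fair coin flips, exactly 3 heads.
prob_three_heads = binomial_pmf(3, 10, 0.5)

# The probabilities over all possible k should add up to 1.
total_probability = sum(binomial_pmf(k, 10, 0.5) for k in range(11))

print(prob_three_heads, total_probability)
```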

Survey Sampling and Inference

Sampling Distributions and Confidence Intervals

Sampling distributions describe the variability of sample statistics.

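  • Mean and Standard Deviation of Sample Proportion: $\mu_{\hat{p}} = p$, $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1 - p)}{n}}$

  • Confidence Interval for Proportion: $\hat{p} \pm z^* \sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}$

  • Sample Size for Desired Margin of Error: $n = \dfrac{(z^*)^2 \, \hat{p}(1 - \hat{p})}{m^2}$, where $m$ is the desired margin of error.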
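A minimal sketch of a confidence interval for a proportion and the matching sample-size calculation. The survey numbers are hypothetical, and using 0.5 as the planning value for the proportion is a common conservative assumption rather than something stated in these notes.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Hypothetical survey: 240 "yes" responses out of 400.
successes, n = 240, 400
p_hat = successes / n

confidence = 0.95
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96

se = sqrt(p_hat * (1 - p_hat) / n)  # standard error of the sample proportion
margin = z_star * se
ci = (p_hat - margin, p_hat + margin)

# Sample size for a desired margin of error of 0.03,
# using the conservative planning value 0.5 (an assumption).
desired_margin = 0.03
p_guess = 0.5
n_needed = ceil(z_star**2 * p_guess * (1 - p_guess) / desired_margin**2)

print("CI:", tuple(round(v, 3) for v in ci))
print("n needed:", n_needed)
```

Rounding the sample size up guarantees the margin of error is no larger than requested.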

Hypothesis Testing for Population Proportions

One-Sample Z-Test for Proportions

Hypothesis testing evaluates claims about population proportions.

  • Null Hypothesis ($H_0$): No effect or difference.

  • Alternative Hypothesis ($H_a$): Presence of effect or difference.

  • Type I Error: Rejecting $H_0$ when it is true.

  • Type II Error: Failing to reject $H_0$ when it is false.
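As a sketch of the one-sample z-test for a proportion, the test statistic compares the sample proportion with the null value, using the standard error computed under $H_0$. The null value `p_0`, the counts, and the 0.05 significance level are hypothetical choices for this example.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical test of H0: p = 0.5 against Ha: p != 0.5.
p_0 = 0.5
successes, n = 230, 400
p_hat = successes / n

# Standard error uses the null value p_0, not p_hat.
se_null = sqrt(p_0 * (1 - p_0) / n)
z = (p_hat - p_0) / se_null

# Two-sided p-value from the standard Normal curve.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05  # significance level, assumed for illustration
reject_null = p_value < alpha

print("z =", round(z, 3), "p-value =", round(p_value, 4), "reject H0:", reject_null)
```

Rejecting $H_0$ here would be a Type I error if the true proportion really were 0.5.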

Inferring Population Means

One-Sample and Two-Sample t-Tests

t-tests are used to infer population means from sample data.

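  • One-Sample t-Test: $t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$, with $n - 1$ degrees of freedom.

  • Two-Sample t-Test: Compares means of two independent groups.

  • Confidence Interval for Mean: $\bar{x} \pm t^* \dfrac{s}{\sqrt{n}}$

  • Assumptions: Random sampling, independent observations, and an approximately Normal population (or a large sample size).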
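The one-sample t statistic and a confidence interval for the mean can be sketched as follows. The sample and null value are hypothetical, and the critical value `t_star` is copied from a standard t table (95% confidence, 4 degrees of freedom) because the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample and null value for H0: mu = 12.
sample = [12, 15, 11, 14, 13]
mu_0 = 12

n = len(sample)
x_bar = mean(sample)
s = stdev(sample)   # sample standard deviation (divides by n - 1)
se = s / sqrt(n)    # standard error of the mean

t_stat = (x_bar - mu_0) / se
df = n - 1

# Critical value from a t table for 95% confidence with df = 4 (assumed input).
t_star = 2.776
ci = (x_bar - t_star * se, x_bar + t_star * se)

print("t =", round(t_stat, 3), "df =", df)
print("95% CI:", tuple(round(v, 3) for v in ci))
```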

Dependent vs. Independent Samples

  • Dependent Samples: Paired or matched observations.

  • Independent Samples: Separate groups.

Associations between Categorical Variables

Bar Graphs and Categorical Data

Bar graphs are used to compare frequencies of categorical variables.

  • Stacked Bar Graphs: Show proportions within groups.

  • Side-by-Side Bar Graphs: Compare groups directly.

  • Interpretation: If the proportions in each category differ noticeably from group to group, the two variables appear to be associated.
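The proportions that a stacked or side-by-side bar graph displays can be computed from a two-way table of counts. The table below and the helper name `conditional_proportions` are hypothetical.

```python
# Hypothetical two-way table of counts: rows are groups, columns are responses.
counts = {
    "Group A": {"Yes": 30, "No": 70},
    "Group B": {"Yes": 60, "No": 40},
}

def conditional_proportions(table):
    """Return the proportion of each response within each group."""
    result = {}
    for group, row in table.items():
        total = sum(row.values())
        result[group] = {response: n / total for response, n in row.items()}
    return result

props = conditional_proportions(counts)
for group, row in props.items():
    print(group, row)
```

Because the "Yes" proportion differs between the two groups, these made-up data would suggest an association between group and response.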

Multiple Comparisons and Analysis of Variance

Comparing Means Across Groups

Analysis of variance (ANOVA) is used to compare means across multiple groups.

  • Replication: Ensures reliability of results.

  • Random Assignment: Reduces bias.
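To make the idea of comparing several group means concrete, here is a sketch of the one-way ANOVA F statistic computed by hand. The three groups are hypothetical, and the notes above do not give this formula; it is the standard ratio of between-group to within-group mean squares.

```python
from statistics import mean

# Hypothetical measurements from three treatment groups.
groups = [
    [1, 2, 3],
    [2, 3, 4],
    [5, 6, 7],
]

all_values = [x for g in groups for x in g]
grand_mean = mean(all_values)

# Between-group sum of squares: how far each group mean is from the grand mean.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: variation of observations around their group mean.
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)
print("F =", round(f_stat, 3), "df =", (df_between, df_within))
```

A large F value indicates that the group means differ by more than the variation within groups would explain.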

Inference for Regression

Regression and Prediction

Regression inference allows for estimation and hypothesis testing about relationships between variables.

  • Estimated Slope and Intercept: Used for prediction.

  • Standard Error: Quantifies uncertainty in estimates.

Key Statistical Formulas

Summary Table of Formulas

The following table summarizes essential formulas for statistics:

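| Formula | Description |
|---|---|
| $z = \dfrac{x - \mu}{\sigma}$ or $t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$ | Z-score or t-score for a value |
| $s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}}$ | Sample standard deviation |
| $df = n - 1$ | Degrees of freedom (one sample) |
| $n = \dfrac{(z^*)^2 \, \hat{p}(1 - \hat{p})}{m^2}$ | Sample size for desired margin of error |
| $m = z^* \cdot SE$ | Margin of error (use $t^*$ in place of $z^*$ for means) |
| $SE_{\bar{x}} = \dfrac{s}{\sqrt{n}}$ | Standard error of the mean |
| $SE_{\hat{p}} = \sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}$ | Standard error of a proportion |
| $SE_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$ | Standard error of the difference of means |
| $SE_{\hat{p}_1 - \hat{p}_2} = \sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$ | Standard error of the difference of proportions |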
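The two standard errors for differences can be written as small helper functions. The function names and the example inputs are hypothetical, and the formulas assume independent samples with unpooled variances.

```python
from math import sqrt

def se_diff_means(s1, n1, s2, n2):
    """Standard error of the difference of two independent sample means."""
    return sqrt(s1**2 / n1 + s2**2 / n2)

def se_diff_props(p1, n1, p2, n2):
    """Standard error of the difference of two independent sample proportions."""
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Hypothetical summary statistics for two groups.
se_means = se_diff_means(s1=3, n1=9, s2=4, n2=16)
se_props = se_diff_props(p1=0.5, n1=100, p2=0.5, n2=100)

print(round(se_means, 4), round(se_props, 4))
```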
