Statistics Exam Study Guide: Chapters 1, 2, 3, and 9 (Formula Sheet & Key Concepts)

Graphical Presentation of Data

Describing Data Distributions

Statistical data is often visualized using graphs to reveal patterns, trends, and anomalies. Common graphical methods include histograms, boxplots, and scatterplots. These tools help identify the shape, center, and spread of a dataset.

  • Histogram: Shows the frequency of data within intervals (bins).

  • Boxplot: Displays the five-number summary (minimum, Q1, median, Q3, maximum) and highlights outliers.

  • Scatterplot: Used for visualizing relationships between two quantitative variables.

Example: A boxplot of exam scores can quickly show the median, spread, and any unusually low or high scores.

Numerical Measures of Data

Measures of Center

Measures of center summarize the typical value in a dataset. The most common are the mean and median.

  • Mean (Average): The sum of all values divided by the number of values.

  • Median: The middle value when data is ordered. If n is even, the median is the average of the two middle values.

Example: For the data set {2, 4, 6, 8, 10}, the mean is 6, and the median is also 6.
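The example above can be checked with a few lines of Python. This is a minimal sketch using the standard library's statistics module; the second list, even_data, is a hypothetical data set added only to show the even-n case.

```python
from statistics import mean, median

data = [2, 4, 6, 8, 10]

print(mean(data))    # sum of the values divided by the number of values
print(median(data))  # middle value of the ordered data

# With an even number of values, the median is the average
# of the two middle values (here 4 and 6).
even_data = [2, 4, 6, 8]  # hypothetical data for illustration
print(median(even_data))
```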

Measures of Spread

Spread describes how much the data varies. Common measures include the interquartile range (IQR) and standard deviation.

  • Interquartile Range (IQR): The difference between the third quartile (Q3) and the first quartile (Q1).

  • Standard Deviation (s): Measures the typical distance of data points from the mean; it is the square root of the sample variance.

Example: If the sample mean is 10 and the standard deviation is 2, a typical data point lies about 2 units from 10. For roughly Normal data, about 68% of values fall between 8 and 12.
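A short Python sketch of the five-number summary and IQR follows. It assumes the median-of-halves convention for quartiles (the middle value is left out of both halves when n is odd); other software may use a different quartile rule and give slightly different answers. The helper name quartiles is hypothetical.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]           # values below the median position
    upper = ordered[half + n % 2:]   # skip the middle value when n is odd
    return median(lower), median(ordered), median(upper)

data = [2, 4, 6, 8, 10]
q1, med, q3 = quartiles(data)
iqr = q3 - q1                        # spread of the middle 50% of the data

print(min(data), q1, med, q3, max(data))  # five-number summary
print(iqr)
```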

Choosing the Right Measure

  • Symmetric/Normal Distribution: Use mean and standard deviation.

  • Skewed/Outliers: Use median and IQR.

Example: For income data (often skewed), median and IQR are preferred.
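To see why the median is preferred for skewed data, the sketch below compares the two measures on a small made-up income list. The numbers are hypothetical and chosen only so that one value is an obvious outlier.

```python
from statistics import mean, median

# Hypothetical incomes in thousands of dollars; the last value is an outlier.
incomes = [30, 35, 40, 45, 50, 400]

print(mean(incomes))    # 100: pulled upward by the single large value
print(median(incomes))  # 42.5: stays close to the typical income
```

Five of the six incomes are at or below 50, so the median describes the typical value far better than the mean does here.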

Normal Distributions

Density Curves and the Normal Model

The Normal distribution is a symmetric, bell-shaped curve defined by its mean ($\mu$) and standard deviation ($\sigma$). The area under the curve represents probability, and the total area is 1.

  • Mean ($\mu$): Center of the distribution.

  • Standard Deviation ($\sigma$): Controls the spread.

68-95-99.7 Rule: In a normal distribution:

  • 68% of data falls within 1 standard deviation of the mean.

  • 95% within 2 standard deviations.

  • 99.7% within 3 standard deviations.

Example: If $\mu = 100$ and $\sigma = 15$, about 68% of values are between 85 and 115.
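The rule can be checked numerically. The sketch below uses Python's statistics.NormalDist (Python 3.8+) and assumes a mean of 100 and a standard deviation of 15, the values that produce the 85 to 115 interval in the example.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=15)  # assumed values matching 85 to 115

areas = {}
for k in (1, 2, 3):
    low, high = 100 - k * 15, 100 + k * 15
    # Area between the two bounds = area left of high minus area left of low.
    areas[k] = dist.cdf(high) - dist.cdf(low)
    print(f"within {k} SD ({low} to {high}): {areas[k]:.4f}")
```

The printed areas are close to 0.68, 0.95, and 0.997, which is where the rule's rounded percentages come from.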

Standardizing and Z-Scores

To compare values from different distributions, we standardize them using Z-scores. A Z-score tells how many standard deviations a value is from the mean.

  • Z-score formula: $z = \dfrac{x - \mu}{\sigma}$

  • Interpretation: Positive Z means above the mean; negative Z means below.

Example: If $x = 280$, $\mu = 266$, and $\sigma = 16$, then $z = \dfrac{280 - 266}{16} = 0.875 \approx 0.88$.
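Standardizing is a one-line calculation. The sketch below uses the pregnancy figures from this guide's practice problem (a mean of 266 days and a standard deviation of 16 days); the function name z_score is hypothetical.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

above = z_score(280, 266, 16)  # positive: 280 days is above the mean
below = z_score(250, 266, 16)  # negative: 250 days is below the mean

print(above)  # 0.875
print(below)  # -1.0
```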

Using the Standard Normal Table (Table A)

Table A gives the area (probability) to the left of a Z-score.

  • To find area to the left: Look up Z-score.

  • To find area to the right: $P(Z > z) = 1 - P(Z < z)$

  • To find area between two Z-scores: $P(a < Z < b) = P(Z < b) - P(Z < a)$

  • To find Z-score from area: Find the area in the table and read the corresponding Z.

Example: For $z = 0.88$, area to the left is 0.8106; area to the right is $1 - 0.8106 = 0.1894$.
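The four table lookups above have direct software equivalents. This is a sketch using Python's statistics.NormalDist as a stand-in for Table A; the Z-scores 0.88 and -1.28 are the ones used elsewhere in this guide.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal: mean 0, standard deviation 1

left = Z.cdf(0.88)                     # area to the left of z = 0.88
right = 1 - Z.cdf(0.88)                # area to the right of z = 0.88
between = Z.cdf(0.88) - Z.cdf(-1.28)   # area between z = -1.28 and z = 0.88
z_for_area = Z.inv_cdf(0.10)           # z with area 0.10 to its left

print(round(left, 4), round(right, 4), round(between, 4), round(z_for_area, 2))
```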

Calculator Methods (TI-84)

  • normalcdf(lower, upper, $\mu$, $\sigma$): Finds the probability between two values.

  • invNorm(area, $\mu$, $\sigma$): Finds the value corresponding to a given percentile (area to the left).

Example: To find $P(X > 280)$ when $\mu = 266$ and $\sigma = 16$, use normalcdf(280, E99, 266, 16). Here E99 (that is, $1 \times 10^{99}$) stands in for an infinite upper bound.
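For readers without a TI-84, the two calculator functions can be imitated in Python. The helpers normalcdf and inv_norm below are hypothetical names written for this sketch on top of statistics.NormalDist; they are not part of any library.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the Normal(mu, sigma) curve between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

def inv_norm(area, mu=0.0, sigma=1.0):
    """Value that has the given area (percentile) to its left."""
    return NormalDist(mu, sigma).inv_cdf(area)

# Same inputs as the calculator example: 1e99 plays the role of E99.
print(normalcdf(280, 1e99, 266, 16))

# Value at the 10th percentile of the same distribution.
print(inv_norm(0.10, 266, 16))
```

Leaving out the last two arguments gives the standard Normal distribution, which matches how the calculator behaves when the mean and standard deviation are omitted.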

Standard Deviation and Variability

Calculating Sample Standard Deviation

The standard deviation quantifies how much data points deviate from the mean. It is sensitive to outliers and best used for symmetric distributions.

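  1. Calculate the sample mean ($\bar{x}$).

  2. Subtract the mean from each data point ($x_i - \bar{x}$).

  3. Square each deviation ($(x_i - \bar{x})^2$).

  4. Sum the squared deviations.

  5. Divide by $n - 1$ to get the variance ($s^2$).

  6. Take the square root to get the standard deviation ($s$).

Formula: $s = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$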

Interpretation: "On average, the observations fall [value of s] units away from the mean."
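The six steps can be followed line by line in Python. This sketch reuses the data set {2, 4, 6, 8, 10} from earlier in the guide and compares the hand calculation with statistics.stdev, which also divides by n - 1.

```python
from math import sqrt
from statistics import stdev

data = [2, 4, 6, 8, 10]

n = len(data)
x_bar = sum(data) / n                     # step 1: sample mean
deviations = [x - x_bar for x in data]    # step 2: subtract the mean
squared = [d ** 2 for d in deviations]    # step 3: square each deviation
total = sum(squared)                      # step 4: sum the squared deviations
variance = total / (n - 1)                # step 5: divide by n - 1
s = sqrt(variance)                        # step 6: take the square root

print(x_bar, total, variance, round(s, 4))
print(stdev(data))                        # library result for comparison
```

For this data the squared deviations sum to 40, so the variance is 40 / 4 = 10 and the standard deviation is about 3.16.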

Calculator Steps (TI-84)

  1. Press [STAT] → [1: Edit...]

  2. Clear existing data in L1.

  3. Enter data into L1.

  4. Press [STAT] → [CALC] → [1: 1-Var Stats].

  5. Set List to L1.

  6. Highlight Calculate and press [ENTER].

  • Output: $\bar{x}$ = Sample Mean; Sx = Sample Standard Deviation; n = Sample Size

Example: If Sx = 3.5, interpret as: "On average, the data points are 3.5 units away from the mean."

Interpretation Cheat Sheet

  • Standard Deviation: "On average, the observations fall [Value] distance away from the mean."

  • Z-score: "The observation is [Z-value] standard deviations above/below the mean."

  • Block Design: "To reduce variation by grouping similar subjects together so the effect of the treatment can be seen more clearly."

Practice Problem Example: Normal Distribution and Z-Scores

Problem:

The length of human pregnancies is approximately normally distributed with a mean of 266 days and a standard deviation of 16 days.

  • A. What percent of pregnancies last longer than 280 days?

  • B. How short must a pregnancy be to fall in the shortest 10% of all pregnancies?

Solution:

  • A. Standardize 280: $z = \dfrac{280 - 266}{16} = 0.875 \approx 0.88$. Area to left: 0.8106 (from Table A). Area to right: $1 - 0.8106 = 0.1894$. Answer: About 18.94% of pregnancies last longer than 280 days.

  • B. Find the Z-score for the shortest 10%: Area to left: 0.10 → $z = -1.28$. Convert back to days: $x = \mu + z\sigma = 266 + (-1.28)(16) = 245.52$. Answer: The shortest 10% of pregnancies last 245.52 days or less.
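Table A works with Z-scores rounded to two decimals, so software that keeps the unrounded Z gives slightly different answers. The sketch below reproduces both versions of parts A and B with Python's statistics.NormalDist; the variable names are chosen for this illustration only.

```python
from statistics import NormalDist

pregnancy = NormalDist(mu=266, sigma=16)

# Part A: the table method rounds z = 0.875 to 0.88 before the lookup.
z_rounded = round((280 - 266) / 16, 2)
table_style = 1 - NormalDist().cdf(z_rounded)
unrounded = 1 - pregnancy.cdf(280)

# Part B: the table method uses z = -1.28; inv_cdf keeps more decimals.
table_cutoff = 266 + (-1.28) * 16
unrounded_cutoff = pregnancy.inv_cdf(0.10)

print(f"A: {table_style:.4f} (table) vs {unrounded:.4f} (unrounded z)")
print(f"B: {table_cutoff:.2f} (table) vs {unrounded_cutoff:.2f} (unrounded z)")
```

The differences are small (roughly 18.9% versus 19.1% in part A), so either answer is normally acceptable as long as the method is shown.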

Summary Table: Measures of Center and Spread

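| Measure | Formula | Best Used For | Interpretation |
|---|---|---|---|
| Mean | $\bar{x} = \dfrac{\sum x_i}{n}$ | Symmetric distributions | Average value |
| Median | Middle value | Skewed distributions | Typical value |
| Standard Deviation | $s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}}$ | Symmetric distributions | Average distance from mean |
| IQR | $Q_3 - Q_1$ | Skewed distributions | Spread of middle 50% |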

Additional info: Chapter 8 and 9 content was not provided, so only Chapters 1, 2, and 3 are covered. Block design interpretation is included for completeness, though not directly tied to the formula sheet.
