Chapter 3: Describing, Exploring, and Comparing Data – Measures of Center and Variation

Measures of Center

Introduction to Measures of Center

Measures of center are statistical values that describe the central point or typical value in a data set. The most common measures are the mean, median, mode, and midrange. Each captures a different notion of "center," and the best choice depends on the level of measurement of the data and on whether outliers are present.

  • Mean: The arithmetic average, calculated by summing all data values and dividing by the number of values. Sensitive to outliers and not resistant.

  • Median: The middle value when the data are sorted in order. If the number of values is even, the median is the mean of the two middle values. Resistant to outliers and does not use every data value directly.

  • Mode: The value(s) that occur most frequently. Can be used with nominal data and may be unimodal, bimodal, multimodal, or have no mode.

  • Midrange: The value halfway between the maximum and minimum values. Very sensitive to extremes and not resistant.

Resistant statistics are those that are not significantly affected by extreme values (outliers).
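The difference between resistant and non-resistant measures is easiest to see by adding one extreme value to a small data set. The sketch below uses Python's standard `statistics` module; the data values and the `midrange` helper are invented for illustration and are not from the original notes.

```python
import statistics

def midrange(values):
    """Value halfway between the minimum and the maximum."""
    return (min(values) + max(values)) / 2

def summarize(values):
    """Return the four measures of center for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.multimode(values),  # list, so ties are all reported
        "midrange": midrange(values),
    }

# Hypothetical sample, chosen only to keep the arithmetic simple
data = [2, 3, 3, 5, 7]
with_outlier = data + [100]  # same sample plus one extreme value

center = summarize(data)
center_outlier = summarize(with_outlier)

print(center)          # mean 4, median 3, mode [3], midrange 4.5
print(center_outlier)  # mean 20, median 4.0, mode [3], midrange 51.0
```

The single outlier moves the mean from 4 to 20 and the midrange from 4.5 to 51, while the median shifts only from 3 to 4 and the mode does not change.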

Notation for Measures of Center

  • \( \mu \): Population mean

  • \( \overline{x} \): Sample mean

  • \( n \): Number of data values in a sample

  • \( N \): Number of data values in a population

Calculating the Mean from a Frequency Distribution

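When data are grouped into a frequency distribution, the original values are no longer available, so the mean is estimated by treating every value in a class as if it were equal to the class midpoint. The result is an approximation of the mean, not the exact value.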

  • Formula:

\( \overline{x} = \frac{\sum (f \cdot x)}{\sum f} \)

Where f is the frequency of each class and x is the class midpoint.
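As a rough sketch of this calculation, the function below multiplies each class midpoint by its frequency, sums the products, and divides by the total frequency. The class limits and frequencies are hypothetical, and `grouped_mean` is a name introduced here for illustration only.

```python
def grouped_mean(classes):
    """Estimate the mean from (lower_limit, upper_limit, frequency) triples.

    Each class is represented by its midpoint, so the result is an
    approximation of the mean of the original (ungrouped) data.
    """
    total_f = sum(f for _, _, f in classes)
    total_fx = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total_fx / total_f

# Hypothetical frequency distribution: classes 0-9, 10-19, 20-29
classes = [(0, 9, 2), (10, 19, 5), (20, 29, 3)]

# Midpoints 4.5, 14.5, 24.5 give (9 + 72.5 + 73.5) / 10
estimate = grouped_mean(classes)
print(estimate)  # 15.5
```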

Weighted Mean

The weighted mean is used when data values have different levels of importance or frequency, represented by weights. It is commonly applied in calculating grade-point averages and other scenarios where values contribute unequally.

  • Formula:

\( \overline{x} = \frac{\sum (w \cdot x)}{\sum w} \)

Where w is the weight assigned to each value x.
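A grade-point average is a convenient worked case: the grade points are the values and the credit hours are the weights. The sketch below assumes a 4-point scale and a made-up set of three courses; `weighted_mean` is an illustrative helper, not a library function.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Hypothetical semester: grades A, B, C on a 4-point scale
grade_points = [4, 3, 2]
credit_hours = [3, 4, 3]  # weights

# (3*4 + 4*3 + 3*2) / (3 + 4 + 3) = 30 / 10
gpa = weighted_mean(grade_points, credit_hours)
print(gpa)  # 3.0
```

When all weights are equal, the weighted mean reduces to the ordinary mean.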

Measures of Variation

Introduction to Measures of Variation

Measures of variation describe the spread or dispersion of data values in a set. They help quantify how much the data values differ from each other and from the center.

  • Range: Difference between the maximum and minimum values. Sensitive to outliers and not resistant.

  • Standard Deviation: Measures how much data values deviate from the mean; loosely, a typical distance between a data value and the mean. Denoted by \( s \) for a sample and \( \sigma \) for a population.

  • Variance: The square of the standard deviation. Denoted by \( s^2 \) for sample and \( \sigma^2 \) for population.

Standard Deviation of a Sample

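The standard deviation quantifies the amount of variation or dispersion in a set of sample values. It is never negative, equals zero only when all values are identical, and increases as the values become more spread out. Its units are the same as the units of the original data.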

  • Shortcut Formula for Sample Standard Deviation:

\( s = \sqrt{\frac{n \sum x^2 - \left( \sum x \right)^2}{n(n-1)}} \)

Where n is the sample size and x represents the individual data values.
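The shortcut formula needs only three quantities: the sample size, the sum of the values, and the sum of the squared values. The sketch below computes the sample standard deviation that way and compares it with `statistics.stdev` from the Python standard library; the data set is hypothetical and `shortcut_stdev` is a name chosen for illustration.

```python
import math
import statistics

def shortcut_stdev(values):
    """Sample standard deviation from n, sum(x), and sum(x^2)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are required")
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))

# Hypothetical sample: n = 5, sum x = 20, sum x^2 = 96
data = [2, 3, 3, 5, 7]

# (5 * 96 - 20^2) / (5 * 4) = 80 / 20 = 4, and sqrt(4) = 2
s = shortcut_stdev(data)
print(s)                       # 2.0
print(statistics.stdev(data))  # library result for comparison
```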

Variance of a Sample

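Variance measures the average squared deviation from the mean. For a sample, the sum of the squared deviations is divided by \( n - 1 \) rather than \( n \); for a population, it is divided by \( N \). Variance is used widely in statistical inference and is never negative, but its units are the square of the units of the original data.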

  • Notation: \( s^2 \) for sample variance, \( \sigma^2 \) for population variance.

  • Relationship: Standard deviation is the square root of variance.
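To make the notation concrete, the sketch below computes both versions of the variance for one hypothetical data set using the standard `statistics` module: `variance` divides by n − 1 (sample), while `pvariance` divides by N (population). It also checks that squaring the sample standard deviation gives the sample variance.

```python
import math
import statistics

# Hypothetical data; squared deviations from the mean (4) sum to 16
data = [2, 3, 3, 5, 7]

sample_var = statistics.variance(data)       # 16 / (5 - 1) = 4
population_var = statistics.pvariance(data)  # 16 / 5 = 3.2
sample_sd = statistics.stdev(data)           # square root of sample variance

print(sample_var, population_var, sample_sd)

# Standard deviation squared matches the variance (up to rounding)
print(math.isclose(sample_sd ** 2, sample_var))  # True
```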

Interpreting Standard Deviation

Range Rule of Thumb

The range rule of thumb helps identify values that are significantly low or significantly high. It is based on the principle that, for many data sets, the great majority of values lie within two standard deviations of the mean.

  • Significantly low values: \( \mu - 2\sigma \) or lower

  • Significantly high values: \( \mu + 2\sigma \) or higher

  • Values not significant: Between \( \mu - 2\sigma \) and \( \mu + 2\sigma \)
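The three cases above translate directly into a small classification function. In this sketch the mean of 100 and standard deviation of 15 are assumed values chosen for illustration, and `classify` is a hypothetical helper name.

```python
def classify(value, mean, sd):
    """Apply the range rule of thumb to a single value."""
    low_limit = mean - 2 * sd
    high_limit = mean + 2 * sd
    if value <= low_limit:
        return "significantly low"
    if value >= high_limit:
        return "significantly high"
    return "not significant"

# Assumed parameters: mean 100, standard deviation 15,
# so the limits are 100 - 30 = 70 and 100 + 30 = 130
for value in (65, 110, 130):
    print(value, classify(value, mean=100, sd=15))
```

A value exactly at a limit counts as significant, matching the "or lower" and "or higher" wording in the list.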

Range rule of thumb diagram

Empirical Rule for Bell-Shaped Distributions

The empirical rule applies to data sets with approximately normal (bell-shaped) distributions. It gives the approximate proportion of values that fall within 1, 2, and 3 standard deviations of the mean:

  • About 68% of values fall within 1 standard deviation of the mean.

  • About 95% of values fall within 2 standard deviations of the mean.

  • About 99.7% of values fall within 3 standard deviations of the mean.
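These percentages can be checked against an exactly normal distribution. The sketch below uses `statistics.NormalDist` (Python 3.8 or later) with the standard normal distribution, so k standard deviations corresponds to the interval from −k to k; `proportion_within` is an illustrative helper name. Real data that are only approximately bell-shaped will match these figures only roughly.

```python
from statistics import NormalDist

def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    standard_normal = NormalDist(mu=0, sigma=1)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(proportion_within(k) * 100, 1))
# 1 68.3
# 2 95.4
# 3 99.7
```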

Empirical rule bell-shaped distribution diagram

Summary Table: Measures of Center and Variation

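Measure | Definition | Resistant? | Formula
Mean | Arithmetic average | No | \( \overline{x} = \frac{\sum x}{n} \)
Median | Middle value in ordered data | Yes | Middle value or average of two middle values
Mode | Most frequent value(s) | Yes | N/A
Midrange | Midpoint between max and min | No | \( \frac{\text{max} + \text{min}}{2} \)
Range | Difference between max and min | No | \( \text{max} - \text{min} \)
Standard Deviation | Typical deviation of values from the mean | No | \( s = \sqrt{\frac{n \sum x^2 - \left( \sum x \right)^2}{n(n-1)}} \)
Variance | Square of standard deviation | No | \( s^2 = \frac{n \sum x^2 - \left( \sum x \right)^2}{n(n-1)} \)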
