Chapter 2.4: Measures of Variation – Descriptive Statistics Study Guide

Measures of Variation

Introduction

Measures of variation are essential in statistics for describing how data values are spread or dispersed around the center (mean or median). Understanding variation helps us interpret the consistency, reliability, and risk associated with data sets.

Range

The range is the simplest measure of variation, calculated as the difference between the maximum and minimum values in a data set.

Definition: Range = Maximum value − Minimum value
Key Point: Because the range uses only the two most extreme values, it is sensitive to outliers and says nothing about how the remaining data points are distributed.
Example: Two corporations each hired 10 graduates with identical mean and median salaries, but Corporation A had a range of $10,000, while Corporation B had a range of $35,000, illustrating significant differences in spread despite similar centers.
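As a minimal sketch in Python, the range can be computed directly from its definition. The two salary lists are illustrative values (in thousands of dollars) constructed so that they share a mean and a median while their ranges are 10 and 35; they are assumptions for this sketch, not data given in the guide, and `value_range` is a hypothetical helper name.

```python
import statistics

def value_range(values):
    """Range = maximum value - minimum value."""
    return max(values) - min(values)

# Illustrative starting salaries in thousands of dollars (assumed, not source data).
corp_a = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]
corp_b = [40, 23, 41, 50, 49, 32, 41, 29, 52, 58]

for name, salaries in (("A", corp_a), ("B", corp_b)):
    print(
        name,
        statistics.mean(salaries),
        statistics.median(salaries),
        value_range(salaries),
    )
```

Both lists have a mean of 41.5 and a median of 41, yet their ranges are 10 and 35, so the centers alone hide the difference in spread.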

Standard Deviation

The standard deviation measures the typical distance of data values from the mean. Because it uses every value in the data set, it describes spread more completely than the range.

Definition: Standard deviation describes how much values typically differ from the mean.
Interpretation: A small standard deviation indicates values are close to the mean (consistent), while a large standard deviation indicates greater variability.
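Formula (Sample): $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$, where $\bar{x}$ is the sample mean and $n$ is the sample size.

Formula (Population): $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$, where $\mu$ is the population mean and $N$ is the population size.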

Example: Recovery times for concussed football players were analyzed, and the sample standard deviation was found to be about 2.2 days.
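The standard definitions divide the sum of squared deviations by n − 1 for a sample and by N for a population. The sketch below writes both out by hand; the data list is hypothetical (it is not the recovery-time data), and the function names are chosen only for illustration.

```python
import math

def sample_std(values):
    """Sample standard deviation: divide by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))

def population_std(values):
    """Population standard deviation: divide by N."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / n)

# Hypothetical data set with mean 5 and sum of squared deviations 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_std(data))        # 2.0
print(round(sample_std(data), 3))  # 2.138
```

Python's built-in `statistics.stdev` and `statistics.pstdev` implement the same two definitions, so they can serve as a cross-check.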

[Figure: StatCrunch standard deviation calculation]

Interpreting Standard Deviation

Standard deviation helps identify the usual range of values in a data set. Values outside this range may be considered unusual and could indicate risk or concern.

Usual Range: Typically, values within two standard deviations of the mean are considered usual.
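The two-standard-deviation guideline can be sketched as a pair of small helpers. The mean of 100 and standard deviation of 15 are hypothetical values picked only for illustration, and `usual_range` and `is_unusual` are assumed names.

```python
def usual_range(mean, std_dev):
    """Return (low, high) = mean minus and plus two standard deviations."""
    return mean - 2 * std_dev, mean + 2 * std_dev

def is_unusual(value, mean, std_dev):
    """True when the value falls outside the usual range."""
    low, high = usual_range(mean, std_dev)
    return value < low or value > high

# Hypothetical summary statistics, chosen only for illustration.
low, high = usual_range(100, 15)
print(low, high)                 # 70 130
print(is_unusual(135, 100, 15))  # True
print(is_unusual(120, 100, 15))  # False
```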

[Figure: Usual range around the mean]

Empirical Rule (68–95–99.7 Rule)

The Empirical Rule applies to data sets that are approximately normal (bell-shaped: symmetric and unimodal). It gives the approximate percentage of data within 1, 2, and 3 standard deviations of the mean.

About 68%: Within 1 standard deviation of the mean
About 95%: Within 2 standard deviations of the mean
About 99.7%: Within 3 standard deviations of the mean
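These percentages are rounded values for an exactly normal distribution. A short check with `statistics.NormalDist` from the Python standard library (Python 3.8 or later) shows where they come from; real data that are only approximately normal will deviate somewhat.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

for k in (1, 2, 3):
    # Share of the distribution between -k and +k standard deviations.
    share = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(f"within {k} SD: {share:.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```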

[Figure: Empirical Rule for normal distribution]

Example: For heights of women aged 20–29 in the U.S., about 47.72% are between 58.9 and 64.1 inches tall, illustrating the Empirical Rule.
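One plausible reading of this figure: if the heights are modeled as normal with a mean of 64.1 inches and a standard deviation of 2.6 inches, then 58.9 inches lies two standard deviations below the mean, and the interval covers half of the area within two standard deviations. The mean and standard deviation here are assumptions inferred from the interval endpoints; the guide does not state them.

```python
from statistics import NormalDist

# ASSUMPTION: these parameters are inferred, not given in the guide.
ASSUMED_MEAN = 64.1  # inches
ASSUMED_SD = 2.6     # inches

heights = NormalDist(mu=ASSUMED_MEAN, sigma=ASSUMED_SD)

# Area between two standard deviations below the mean and the mean itself.
share = heights.cdf(64.1) - heights.cdf(58.9)
print(f"{share:.2%}")
```

Under these assumptions the computed share is about 47.7%; using the rounded Empirical Rule figure of 95% instead gives 95% / 2 = 47.5%.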

[Figure: Heights of women in the U.S. and Empirical Rule]

Standard Deviation for Grouped Data

When data are grouped into classes (frequency distributions), the sample mean and standard deviation can be estimated using class midpoints.

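Procedure: Treat every value in a class as if it equaled the class midpoint $x$, weight each midpoint by its frequency $f$, and compute $\bar{x} = \frac{\sum x f}{n}$ and $s = \sqrt{\frac{\sum (x - \bar{x})^2 f}{n - 1}}$, where $n = \sum f$.
Example: For a survey of U.S. families on number of children, the standard deviation is estimated from the grouped (frequency) data.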

[Figure: StatCrunch grouped data standard deviation calculation]

Example: The table below shows price ranges and frequencies for homes listed for sale in a U.S. city.

Class (Price Range)	Frequency (f)
150 – 199	8
200 – 249	12
250 – 299	7
300 – 349	6
350 – 399	3

[Figure: Frequency table for home prices]
[Figure: StatCrunch grouped data calculation for home prices]
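The midpoint method can be applied to the home-price table as a rough sketch. Each class is represented by the average of its lower and upper limits, so the results are estimates rather than exact statistics; the function name is hypothetical, and the output is in whatever units the table's prices use.

```python
import math

# (lower limit, upper limit, frequency) for each class in the table
classes = [
    (150, 199, 8),
    (200, 249, 12),
    (250, 299, 7),
    (300, 349, 6),
    (350, 399, 3),
]

def grouped_mean_and_std(classes):
    """Estimate the sample mean and standard deviation from class midpoints."""
    midpoints = [(low + high) / 2 for low, high, _ in classes]
    freqs = [f for _, _, f in classes]
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(midpoints, freqs)) / n
    squared_deviations = sum((x - mean) ** 2 * f for x, f in zip(midpoints, freqs))
    return mean, math.sqrt(squared_deviations / (n - 1))

mean, std = grouped_mean_and_std(classes)
print(round(mean, 1), round(std, 1))  # 252.3 62.6
```

With n = 36 listings, the weighted sum of midpoints is 9082, which gives the estimated mean of about 252.3 and an estimated sample standard deviation of about 62.6.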

Summarizing Data: Center and Spread

Both center and spread are needed to fully describe a data set. The choice of summary statistics depends on the shape of the distribution:

Symmetric, unimodal data: Use mean and standard deviation
Skewed data: Use median and interquartile range (IQR)

Center: Mean or median
Spread: Standard deviation or IQR
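A small experiment illustrates why the median and IQR are preferred for skewed data. The values below are hypothetical, and `statistics.quantiles` uses its default quartile method, which may differ slightly from the convention used by a textbook or by StatCrunch.

```python
import statistics

def summarize(values):
    """Return both pairs of center and spread statistics."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # default quartile method
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }

# Hypothetical data: a roughly symmetric set, then the same set plus one outlier.
roughly_symmetric = [20, 22, 23, 25, 26, 28, 31]
with_outlier = roughly_symmetric + [120]

for label, data in (("roughly symmetric", roughly_symmetric), ("with outlier", with_outlier)):
    print(label, summarize(data))
```

Adding the single extreme value moves the mean and standard deviation far more than it moves the median and IQR, which is the reason for the guideline above.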

Coefficient of Variation (CV)

The coefficient of variation (CV) expresses the standard deviation as a percentage of the mean, allowing comparison of variability between data sets with different units or scales.

Formula (Population): $CV = \frac{\sigma}{\mu} \cdot 100\%$

Formula (Sample): $CV = \frac{s}{\bar{x}} \cdot 100\%$

Key Point: CV is useful for comparing relative variation, especially when units or scales differ.
Example: Heights and weights of a basketball team are compared using CV.

[Figure: Heights and weights data table]

Statistic	Heights	Weights
Mean (μ)	72.8 inches	187.8 pounds
Standard Deviation (σ)	3.3 inches	17.7 pounds
Coefficient of Variation (CV)	4.5%	9.4%

Interpretation: The two standard deviations (3.3 inches and 17.7 pounds) cannot be compared directly because they are in different units. The CVs are unit-free, and they show that weights (9.4%) vary more relative to their mean than heights (4.5%).
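The CV values in the table can be reproduced from the listed means and standard deviations. This is a minimal sketch; `coefficient_of_variation` is a hypothetical helper name.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the standard deviation as a percentage of the mean."""
    return std_dev / mean * 100

cv_heights = coefficient_of_variation(3.3, 72.8)    # inches / inches
cv_weights = coefficient_of_variation(17.7, 187.8)  # pounds / pounds
print(f"{cv_heights:.1f}% {cv_weights:.1f}%")  # 4.5% 9.4%
```

Because the units cancel in each ratio, the two percentages can be compared even though one variable is measured in inches and the other in pounds.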

Conclusion

Measures of variation, including range, standard deviation, and coefficient of variation, are fundamental for understanding the spread and consistency of data. Proper interpretation of these statistics is crucial for making informed decisions based on data analysis.