Measures of Variability and Dispersion in Statistics

Study Guide - Smart Notes

Measures of Variability

Definition and Importance

In statistics, measures of variability (also called measures of variation or dispersion) describe how spread out the values in a data set are, usually relative to the mean. Measures of central tendency (such as the mean) locate the center of a data set; measures of variability show how far the values lie from that center.

Variation quantifies how much the data values differ from each other and from the mean.
Common measures include range, variance, standard deviation, and coefficient of variation.

Example: Comparing Groups

Consider two groups of 5 students in a statistics class. Their final grades are:

Data Pt.	Group 1	Group 2
x1	60	85
x2	70	75
x3	80	85
x4	90	50
x5	95	100

Both groups have the same mean (x̄ = 79), but Group 2 is more "spread out" than Group 1.

Notation for Measures of Variability

Measure	Statistic (sample)	Parameter (population)
Range	R	R
Variance	$s^2$	$\sigma^2$
Standard Deviation	s	σ
Coefficient of Variation	CVar	CVar

Common Measures of Variability

Range

Definition: The difference between the largest and smallest values in a data set.
Formula: $R = x_{\max} - x_{\min}$
Example: For Group 1: $R = 95 - 60 = 35$; for Group 2: $R = 100 - 50 = 50$
Limitation: Only uses extreme values; does not consider all data points.
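
The range calculation and its limitation can be illustrated with a minimal Python sketch. The function name `data_range` and the second data set are hypothetical, chosen only for illustration; the first list is Group 1 from the example.

```python
def data_range(values):
    """Return the range: largest value minus smallest value."""
    return max(values) - min(values)


group_1 = [60, 70, 80, 90, 95]       # Group 1 grades from the example
clustered = [60, 77, 78, 79, 95]     # hypothetical data, same extremes

print(data_range(group_1))    # 35
print(data_range(clustered))  # 35
```

Both lists have the same range even though the middle values of the second list sit much closer together, which is why the range alone can hide differences in spread.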

Variance

Definition: The average squared deviation of each data value from the mean.
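Sample Variance Formula: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Shortcut Formula: $s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$
Population Variance Formula: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
Units: Variance is measured in squared units of the data (e.g., dollars²).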
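
As a quick check that the definition and shortcut forms of the sample variance agree, here is a small Python sketch using the Group 1 grades. The function names are illustrative, not part of the original notes.

```python
def sample_variance(values):
    """Definition form: squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_variance_shortcut(values):
    """Shortcut form: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)


def population_variance(values):
    """Population form: squared deviations from the mean, divided by N."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


group_1 = [60, 70, 80, 90, 95]

print(sample_variance(group_1))           # 205.0
print(sample_variance_shortcut(group_1))  # 205.0
print(population_variance(group_1))       # 164.0
```

The shortcut form needs only the running totals of x and x², which is why the worked examples tabulate those two columns. Treating the same five grades as a whole population (dividing by N) gives a smaller value than treating them as a sample.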

Worked Example: Calculating Variance

Group 1	x	$x^2$
	60	3600
	70	4900
	80	6400
	90	8100
	95	9025
Totals	395	32025

Sample Variance for Group 1: $s^2 = \frac{32025 - \frac{395^2}{5}}{4} = \frac{32025 - 31205}{4} = \frac{820}{4} = 205$

Group 2	x	$x^2$
	85	7225
	75	5625
	85	7225
	50	2500
	100	10000
Totals	395	32575

Sample Variance for Group 2: $s^2 = \frac{32575 - \frac{395^2}{5}}{4} = \frac{32575 - 31205}{4} = \frac{1370}{4} = 342.5$

Standard Deviation

Definition: The square root of the variance. It is measured in the same units as the data.
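Formulas: $s = \sqrt{s^2}$ (sample), $\sigma = \sqrt{\sigma^2}$ (population)
Interpretation: A standard deviation of 0 means all data values are identical. The larger the standard deviation, the more spread out the data.
Example: For Group 1: $s = \sqrt{205} \approx 14.32$; for Group 2: $s = \sqrt{342.5} \approx 18.51$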
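
The same quantities can be checked with Python's standard `statistics` module, where `stdev` uses the sample formula (dividing by n − 1) and `pstdev` uses the population formula (dividing by N). This is a sketch using the Group 1 grades only.

```python
import math
import statistics

group_1 = [60, 70, 80, 90, 95]

s = statistics.stdev(group_1)       # sample standard deviation
sigma = statistics.pstdev(group_1)  # population standard deviation

print(round(s, 2))      # 14.32
print(round(sigma, 2))  # 12.81

# The standard deviation is the square root of the variance.
print(math.isclose(s, math.sqrt(statistics.variance(group_1))))  # True

# Identical values have no spread at all.
print(statistics.stdev([80, 80, 80, 80, 80]) == 0)  # True
```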

Coefficient of Variation (CVar)

Definition: The ratio of the standard deviation to the mean, expressed as a percentage. Useful for comparing variability between data sets with different units or means.
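Formula: $CVar = \frac{s}{\bar{x}} \cdot 100\%$ (sample), $CVar = \frac{\sigma}{\mu} \cdot 100\%$ (population)
Example: If the average score on an English final exam is 85 with a standard deviation of 5, and the average score on a history final exam is 100 with a standard deviation of 8:
- English: $CVar = \frac{5}{85} \cdot 100\% \approx 5.9\%$
- History: $CVar = \frac{8}{100} \cdot 100\% = 8\%$
History scores are more variable relative to their mean.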
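
A short Python sketch of the exam comparison follows. The function name `coefficient_of_variation` is illustrative, and the guard against a zero mean is an added assumption, since the ratio is undefined in that case.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the standard deviation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("CVar is undefined when the mean is 0")
    return std_dev / mean * 100


english = coefficient_of_variation(5, 85)
history = coefficient_of_variation(8, 100)

print(f"English: {english:.1f}%")  # English: 5.9%
print(f"History: {history:.1f}%")  # History: 8.0%
```

Because the result is a unitless percentage, the two exams can be compared directly even though their means differ.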

Empirical Rule (68-95-99.7 Rule)

Definition and Application

The Empirical Rule applies to data sets with a symmetric, bell-shaped (normal) distribution. It states:

Approximately 68% of the data lies within 1 standard deviation of the mean.
Approximately 95% of the data lies within 2 standard deviations of the mean.
Approximately 99.7% of the data lies within 3 standard deviations of the mean.

Values more than 2 standard deviations from the mean are commonly considered unusual, and values more than 3 standard deviations from the mean very unusual.

Example: Heights of Men

Mean height: 69.4 inches; standard deviation: 2.9 inches.
About 95% of heights: $69.4 \pm 2(2.9)$, or 63.6 to 75.2 inches.
About 68% of heights: $69.4 \pm 2.9$, or 66.5 to 72.3 inches.
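
The interval arithmetic for the heights example can be sketched in Python as follows. The helper `empirical_intervals` is a hypothetical name, and the results are only meaningful if the data are roughly bell-shaped.

```python
def empirical_intervals(mean, std_dev):
    """Return {coverage label: (low, high)} for 1, 2, and 3 standard deviations."""
    coverage = {1: "68%", 2: "95%", 3: "99.7%"}
    return {
        label: (round(mean - k * std_dev, 2), round(mean + k * std_dev, 2))
        for k, label in coverage.items()
    }


intervals = empirical_intervals(69.4, 2.9)
for label, (low, high) in intervals.items():
    print(f"about {label} of heights: {low} to {high} inches")
# about 68% of heights: 66.5 to 72.3 inches
# about 95% of heights: 63.6 to 75.2 inches
# about 99.7% of heights: 60.7 to 78.1 inches
```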

Chebyshev's Theorem

Definition and Application

Chebyshev's Theorem applies to any data set, regardless of distribution shape. It states that the proportion of data within k standard deviations of the mean is at least $1 - \frac{1}{k^2}$ for $k > 1$.

k	$1 - \frac{1}{k^2}$	Meaning
2	0.75	At least 75% of data within 2 standard deviations of the mean
3	0.889	At least 88.9% of data within 3 standard deviations of the mean

Chebyshev's Theorem is conservative compared to the Empirical Rule.
It is especially useful when the distribution is not normal.

Example: Olympic Swim Times

Mean time: 502.84 seconds; standard deviation: 4.68 seconds.
Interval within 2 standard deviations: $502.84 \pm 2(4.68)$, or 493.48 to 512.20 seconds.
Empirical Rule: About 95% of data within this interval (if the distribution is approximately normal).
Chebyshev's Theorem: At least 75% of data within this interval (any distribution).
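
Chebyshev's lower bound, 1 − 1/k², and the swim-time interval can be sketched in Python as below. The function name `chebyshev_bound` is illustrative.

```python
def chebyshev_bound(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is only informative for k > 1")
    return 1 - 1 / k ** 2


mean, std_dev = 502.84, 4.68  # Olympic swim times example
k = 2
low, high = mean - k * std_dev, mean + k * std_dev

print(f"{low:.2f} to {high:.2f} seconds")    # 493.48 to 512.20 seconds
print(f"at least {chebyshev_bound(2):.1%}")  # at least 75.0%
print(f"at least {chebyshev_bound(3):.1%}")  # at least 88.9%
```

The bound is a guaranteed minimum for any distribution, so it is lower than the Empirical Rule's figure for the same interval.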

Examples Using Frequency Tables

Calculating Range, Mean, Variance, and Standard Deviation

Given a frequency table, you can compute the range, mean, variance, and standard deviation as follows:

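Grade (x)	Frequency (f)	fx	$fx^2$
0	6	0	0
1	5	5	5
2	9	18	36
3	12	36	108
4	3	12	48
Totals	35	71	197

Sample mean: $\bar{x} = \frac{\sum fx}{n} = \frac{71}{35} \approx 2.03$
Sample variance: $s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1} = \frac{197 - \frac{71^2}{35}}{34} \approx 1.56$
Sample standard deviation: $s = \sqrt{s^2} \approx 1.25$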
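
The frequency-table procedure can be sketched in Python by building the column totals directly from the (grade, frequency) rows. The variable names are illustrative; the formulas are the shortcut forms with each value weighted by its frequency.

```python
import math

# (grade x, frequency f) rows from the frequency table
table = [(0, 6), (1, 5), (2, 9), (3, 12), (4, 3)]

n = sum(f for _, f in table)                 # total number of observations
sum_fx = sum(f * x for x, f in table)        # total of the fx column
sum_fx2 = sum(f * x ** 2 for x, f in table)  # total of the fx^2 column

value_range = max(x for x, _ in table) - min(x for x, _ in table)
mean = sum_fx / n
variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
std_dev = math.sqrt(variance)

print("n, sum fx, sum fx^2:", n, sum_fx, sum_fx2)
print("range:", value_range)
print("mean, variance, SD:", round(mean, 2), round(variance, 2), round(std_dev, 2))
```

Recomputing the totals from the rows is a useful check on hand arithmetic before the totals are plugged into the formulas.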

Grouped Data Example

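Class	Frequency (f)	Midpoint ($x_m$)	$f x_m$	$f x_m^2$
0-99	380	49.5	18810	931095
100-199	230	149.5	34385	5140557.5
200-299	210	249.5	52395	13072552.5
300-399	90	349.5	31455	10993522.5
400-499	70	449.5	31465	14143517.5
500+	20	599.5	11990	7188005
Totals	1000		180500	51469250

Sample mean: $\bar{x} = \frac{\sum f x_m}{n} = \frac{180500}{1000} = 180.5$
Sample variance: $s^2 = \frac{\sum f x_m^2 - \frac{(\sum f x_m)^2}{n}}{n - 1} = \frac{51469250 - \frac{180500^2}{1000}}{999} = \frac{18889000}{999} \approx 18907.91$
Sample standard deviation: $s = \sqrt{s^2} \approx 137.51$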
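
For grouped data the same weighted formulas apply, with each class represented by its midpoint. The sketch below wraps the calculation in a hypothetical helper, `grouped_stats`, and takes the frequencies and midpoints as given in the table (including the 599.5 midpoint used for the open-ended 500+ class). The results are estimates, since the individual values inside each class are unknown.

```python
import math


def grouped_stats(classes):
    """Estimate mean, sample variance, and sample SD from (f, midpoint) pairs."""
    n = sum(f for f, _ in classes)
    sum_fx = sum(f * xm for f, xm in classes)
    sum_fx2 = sum(f * xm ** 2 for f, xm in classes)
    mean = sum_fx / n
    variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
    return mean, variance, math.sqrt(variance)


# (frequency f, class midpoint x_m) rows from the grouped table
classes = [
    (380, 49.5),
    (230, 149.5),
    (210, 249.5),
    (90, 349.5),
    (70, 449.5),
    (20, 599.5),
]

mean, variance, std_dev = grouped_stats(classes)
print("mean:", mean)
print("variance:", round(variance, 2))
print("SD:", round(std_dev, 2))
```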

Calculator Instructions (TI-83/84 Plus)

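To compute the mean and standard deviation from a list, and from them the variance and range:
1. Enter data: STAT > EDIT > input data in L1
2. Calculate: STAT > CALC > 1-Var Stats > ENTER
3. Read the output: x̄ is the mean, Sx is the sample standard deviation, and σx is the population standard deviation. Square Sx or σx to get the variance, and subtract minX from maxX to get the range.
For frequency tables, enter the data values in L1 and the frequencies in L2, then run 1-Var Stats L1, L2.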

Summary Table: Empirical Rule vs. Chebyshev's Theorem

Rule	Within 1 SD	Within 2 SD	Within 3 SD
Empirical Rule (Normal)	68%	95%	99.7%
Chebyshev's Theorem (Any)	--	≥75%	≥88.9%

Note: Use the formula that matches the data: divide by n − 1 for a sample variance and by N for a population variance. The Empirical Rule is valid only for approximately normal distributions, while Chebyshev's Theorem applies to all distributions.