Measures of Variability and Dispersion in Statistics
Measures of Variability
Definition and Importance
In statistics, measures of variability (also called measures of variation or dispersion) describe how spread out the values in a data set are, usually relative to the mean. While measures of central tendency (like the mean) indicate the center of a data set, measures of variability show how "spread out" the data are.
Variation quantifies how much the data values differ from each other and from the mean.
Common measures include range, variance, standard deviation, and coefficient of variation.
Example: Comparing Groups
Consider two groups of 5 students in a statistics class. Their final grades are:
| Data Pt. | Group 1 | Group 2 |
|---|---|---|
| x1 | 60 | 85 |
| x2 | 70 | 75 |
| x3 | 80 | 85 |
| x4 | 90 | 50 |
| x5 | 95 | 100 |
Both groups have the same mean (x̄ = 79), but Group 2 is more "spread out" than Group 1.
Notation for Measures of Variability
| Measure | Statistic (sample) | Parameter (population) |
|---|---|---|
| Range | R | R |
| Variance | s² | σ² |
| Standard Deviation | s | σ |
| Coefficient of Variation | CVar | CVar |
Measures of Variation
Range
Definition: The difference between the largest and smallest values in a data set.
Formula: $R = x_{\max} - x_{\min}$
Example: For Group 1: $R = 95 - 60 = 35$; for Group 2: $R = 100 - 50 = 50$
Limitation: Only uses extreme values; does not consider all data points.
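The definition and its limitation can be sketched in a few lines of Python. The helper name `data_range` is only an illustrative choice, and the second data set is made up to show that the range ignores everything except the two extremes.

```python
def data_range(values):
    """Return the range R = max - min of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("range is undefined for an empty data set")
    return max(values) - min(values)

group_1 = [60, 70, 80, 90, 95]
print(data_range(group_1))  # 35

# Hypothetical data: same extremes, very different middle values.
# The range is unchanged because only the extremes are used.
print(data_range([60, 61, 62, 63, 95]))  # 35
```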
Variance
Definition: The average squared deviation of each data value from the mean.
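Sample Variance Formula: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Shortcut Formula: $s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$
Population Variance Formula: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
Units: Variance is measured in squared units of the data (e.g., dollars²).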
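As a rough check that the definitional and shortcut formulas agree, the sketch below applies both to the Group 1 grades and cross-checks the result against Python's standard `statistics` module. The function names are illustrative, not part of any library.

```python
import statistics

def sample_variance(values):
    """Definitional formula: sum of squared deviations divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def sample_variance_shortcut(values):
    """Shortcut formula: (sum(x^2) - (sum(x))^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)

def population_variance(values):
    """Population formula: sum of squared deviations divided by N."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

group_1 = [60, 70, 80, 90, 95]
print(sample_variance(group_1))           # 205.0
print(sample_variance_shortcut(group_1))  # 205.0
print(population_variance(group_1))       # 164.0

# Cross-check against the standard library.
assert statistics.variance(group_1) == sample_variance(group_1)
assert statistics.pvariance(group_1) == population_variance(group_1)
```

Dividing by n − 1 instead of N is what makes the sample value (205) larger than the population value (164) for the same five numbers.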
Worked Example: Calculating Variance
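| Group 1 | x | x² |
|---|---|---|
| x1 | 60 | 3600 |
| x2 | 70 | 4900 |
| x3 | 80 | 6400 |
| x4 | 90 | 8100 |
| x5 | 95 | 9025 |
| Totals | 395 | 32025 |
Sample Variance for Group 1: $s^2 = \frac{32025 - \frac{395^2}{5}}{5 - 1} = \frac{32025 - 31205}{4} = \frac{820}{4} = 205$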
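| Group 2 | x | x² |
|---|---|---|
| x1 | 85 | 7225 |
| x2 | 75 | 5625 |
| x3 | 85 | 7225 |
| x4 | 50 | 2500 |
| x5 | 100 | 10000 |
| Totals | 395 | 32575 |
Sample Variance for Group 2: $s^2 = \frac{32575 - \frac{395^2}{5}}{5 - 1} = \frac{32575 - 31205}{4} = \frac{1370}{4} = 342.5$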
Standard Deviation
Definition: The square root of the variance. It is measured in the same units as the data.
Formulas: $s = \sqrt{s^2}$ (sample), $\sigma = \sqrt{\sigma^2}$ (population)
Interpretation: A standard deviation of 0 means all data values are identical. The larger the standard deviation, the more spread out the data.
Example: For Group 1: $s = \sqrt{205} \approx 14.32$; for Group 2: $s = \sqrt{342.5} \approx 18.51$
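A minimal Python sketch of the same idea, using the Group 1 grades and the standard `statistics` module: the standard deviation is the square root of the variance, and a data set of identical values has a standard deviation of 0. The constant data set is made up for illustration.

```python
import math
import statistics

group_1 = [60, 70, 80, 90, 95]

# Standard deviation = square root of the variance.
s = math.sqrt(statistics.variance(group_1))       # sample
sigma = math.sqrt(statistics.pvariance(group_1))  # population

print(round(s, 2))      # 14.32
print(round(sigma, 2))  # 12.81

# Hypothetical data set in which every value is identical: no spread at all.
identical = [7, 7, 7, 7]
print(statistics.stdev(identical) == 0)  # True
```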
Coefficient of Variation (CVar)
Definition: The ratio of the standard deviation to the mean, expressed as a percentage. Useful for comparing variability between data sets with different units or means.
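Formula: $CVar = \frac{s}{\bar{x}} \cdot 100\%$ (sample), $CVar = \frac{\sigma}{\mu} \cdot 100\%$ (population)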
Example: If the average score on an English final exam is 85 with a standard deviation of 5, and the average score on a history final exam is 100 with a standard deviation of 8:
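English: $CVar = \frac{5}{85} \cdot 100\% \approx 5.9\%$
History: $CVar = \frac{8}{100} \cdot 100\% = 8\%$
Relative to their means, history scores are more variable (8% versus about 5.9%).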
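The exam comparison can be reproduced with a short Python sketch. The function name `coefficient_of_variation` is an illustrative choice; the guard against a zero mean is an added assumption, since the ratio is undefined in that case.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the coefficient of variation as a percentage."""
    if mean == 0:
        raise ValueError("CVar is undefined when the mean is 0")
    return std_dev / mean * 100

english = coefficient_of_variation(5, 85)
history = coefficient_of_variation(8, 100)

print(f"English: {english:.2f}%")  # English: 5.88%
print(f"History: {history:.2f}%")  # History: 8.00%
print("History" if history > english else "English", "is more variable")
```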
Empirical Rule (68-95-99.7 Rule)
Definition and Application
The Empirical Rule applies to data sets with a symmetric, bell-shaped (normal) distribution. It states:
Approximately 68% of the data lies within 1 standard deviation of the mean.
Approximately 95% of the data lies within 2 standard deviations of the mean.
Approximately 99.7% of the data lies within 3 standard deviations of the mean.
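Values more than 2 standard deviations from the mean are generally considered unusual, and values more than 3 standard deviations from the mean are considered very unusual.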
Example: Heights of Men
Mean height: 69.4 inches; standard deviation: 2.9 inches.
95% of heights: $69.4 \pm 2(2.9)$, or 63.6 to 75.2 inches.
68% of heights: $69.4 \pm 2.9$, or 66.5 to 72.3 inches.
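The interval arithmetic for the heights example can be sketched as follows. The helper `empirical_rule_intervals` is a hypothetical name, and the results are meaningful only if the data are roughly bell-shaped.

```python
def empirical_rule_intervals(mean, std_dev):
    """Return {coverage label: (low, high)} for 1, 2, and 3 standard deviations."""
    coverage = {1: "68%", 2: "95%", 3: "99.7%"}
    return {
        label: (mean - k * std_dev, mean + k * std_dev)
        for k, label in coverage.items()
    }

for label, (low, high) in empirical_rule_intervals(69.4, 2.9).items():
    print(f"{label}: {low:.1f} to {high:.1f} inches")
# 68%: 66.5 to 72.3 inches
# 95%: 63.6 to 75.2 inches
# 99.7%: 60.7 to 78.1 inches
```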
Chebyshev's Theorem
Definition and Application
Chebyshev's Theorem applies to any data set, regardless of distribution shape. It states that the proportion of data within k standard deviations of the mean is at least $1 - \frac{1}{k^2}$ for $k > 1$.
| k | 1 − 1/k² | Meaning |
|---|---|---|
| 2 | 0.75 | At least 75% of data within 2 standard deviations of the mean |
| 3 | 0.889 | At least 88.9% of data within 3 standard deviations of the mean |
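Chebyshev's Theorem is conservative compared to the Empirical Rule: it guarantees only a minimum proportion (at least 75% within 2 standard deviations), whereas the Empirical Rule gives an approximate proportion (about 95%) for bell-shaped data.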
It is especially useful when the distribution is not normal.
Example: Olympic Swim Times
Mean time: 502.84 seconds; standard deviation: 4.68 seconds.
Interval within 2 standard deviations: $502.84 \pm 2(4.68)$, or 493.48 to 512.20 seconds.
Empirical Rule: 95% of data within this interval (if normal).
Chebyshev's Theorem: At least 75% of data within this interval (any distribution).
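A small Python sketch of the Chebyshev bound applied to the swim times. The function name `chebyshev_bound` is illustrative, and the check on k reflects the fact that the bound says nothing useful for k ≤ 1.

```python
def chebyshev_bound(k):
    """Minimum proportion of data within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's Theorem gives a useful bound only for k > 1")
    return 1 - 1 / k ** 2

mean, std_dev = 502.84, 4.68
k = 2
low, high = mean - k * std_dev, mean + k * std_dev

print(f"At least {chebyshev_bound(k):.1%} of times lie between "
      f"{low:.2f} and {high:.2f} seconds")
# At least 75.0% of times lie between 493.48 and 512.20 seconds

print(f"{chebyshev_bound(3):.1%}")  # 88.9%
```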
Examples Using Frequency Tables
Calculating Range, Mean, Variance, and Standard Deviation
Given a frequency table, you can compute the range, mean, variance, and standard deviation as follows:
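| Grade (x) | Frequency (f) | fx | fx² |
|---|---|---|---|
| 0 | 6 | 0 | 0 |
| 1 | 5 | 5 | 5 |
| 2 | 9 | 18 | 36 |
| 3 | 12 | 36 | 108 |
| 4 | 3 | 12 | 48 |
| Totals | 35 | 71 | 197 |
Sample mean: $\bar{x} = \frac{\sum fx}{n} = \frac{71}{35} \approx 2.03$
Sample variance: $s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1} = \frac{197 - \frac{71^2}{35}}{34} \approx 1.56$
Sample standard deviation: $s = \sqrt{s^2} \approx 1.25$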
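The frequency-table calculation can be sketched in Python using the shortcut formula, with n taken as the sum of the frequencies. The function name `frequency_table_stats` is hypothetical, and the values are computed directly from the grade and frequency rows listed in the table.

```python
import math

def frequency_table_stats(values, freqs):
    """Sample mean, variance, and standard deviation from a frequency table."""
    n = sum(freqs)                                            # n = sum of f
    sum_fx = sum(f * x for x, f in zip(values, freqs))        # sum of f*x
    sum_fx2 = sum(f * x * x for x, f in zip(values, freqs))   # sum of f*x^2
    mean = sum_fx / n
    variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
    return mean, variance, math.sqrt(variance)

grades = [0, 1, 2, 3, 4]
freqs = [6, 5, 9, 12, 3]

mean, variance, std_dev = frequency_table_stats(grades, freqs)
print(round(mean, 2), round(variance, 2), round(std_dev, 2))  # 2.03 1.56 1.25
print(max(grades) - min(grades))  # range: 4
```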
Grouped Data Example
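| Class | Frequency (f) | Midpoint (xm) | fxm | fxm² |
|---|---|---|---|---|
| 0-99 | 380 | 49.5 | 18810 | 931095 |
| 100-199 | 230 | 149.5 | 34385 | 5140557.5 |
| 200-299 | 210 | 249.5 | 52395 | 13072552.5 |
| 300-399 | 90 | 349.5 | 31455 | 10993522.5 |
| 400-499 | 70 | 449.5 | 31465 | 14143517.5 |
| 500+ | 20 | 599.5 | 11990 | 7188005 |
| Totals | 1000 | | 180500 | 51469250 |
Sample mean: $\bar{x} = \frac{\sum f x_m}{n} = \frac{180500}{1000} = 180.5$
Sample variance: $s^2 = \frac{\sum f x_m^2 - \frac{(\sum f x_m)^2}{n}}{n - 1} = \frac{51469250 - \frac{180500^2}{1000}}{999} = \frac{18889000}{999} \approx 18907.91$
Sample standard deviation: $s = \sqrt{18907.91} \approx 137.51$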
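For grouped data, each class is represented by its midpoint, and the same shortcut formula applies. The sketch below derives the midpoints from class limits. Closing the open-ended "500+" class at 699 is an assumption made here only because it reproduces the 599.5 midpoint used in the table.

```python
import math

# (lower limit, upper limit, frequency).
# Assumption: the "500+" class is treated as 500-699, giving midpoint 599.5.
classes = [
    (0, 99, 380), (100, 199, 230), (200, 299, 210),
    (300, 399, 90), (400, 499, 70), (500, 699, 20),
]

n = sum(f for _, _, f in classes)
midpoints = [(low + high) / 2 for low, high, _ in classes]
sum_fx = sum(f * m for m, (_, _, f) in zip(midpoints, classes))
sum_fx2 = sum(f * m * m for m, (_, _, f) in zip(midpoints, classes))

mean = sum_fx / n
variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
std_dev = math.sqrt(variance)

print(n, sum_fx, sum_fx2)  # 1000 180500.0 51469250.0
print(round(mean, 1), round(variance, 2), round(std_dev, 2))  # 180.5 18907.91 137.51
```

Because every value in a class is replaced by the midpoint, these results are approximations of the statistics for the underlying raw data.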
Calculator Instructions (TI-83/84 Plus)
To compute range, mean, variance, and standard deviation from a list:
Enter data: STAT > EDIT > input data in L1
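Calculate: STAT > CALC > 1-Var Stats > ENTER. The output includes x̄, Sx (sample standard deviation), σx (population standard deviation), minX, and maxX; square Sx to get the sample variance, and subtract minX from maxX to get the range.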
For frequency tables, enter data in L1 and frequencies in L2, then use 1-Var Stats with L1, L2.
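For readers without a calculator, a rough Python analogue of this workflow is sketched below. The function `one_var_stats` and its dictionary keys are hypothetical names chosen to echo the calculator's labels; this is not how the calculator itself is implemented.

```python
import statistics

def one_var_stats(values, freqs=None):
    """Rough analogue of a one-variable summary, optionally with frequencies."""
    if freqs is not None:
        # Expand the frequency table: repeat each value f times.
        values = [x for x, f in zip(values, freqs) for _ in range(f)]
    return {
        "mean": statistics.mean(values),
        "Sx": statistics.stdev(values),        # sample standard deviation
        "sigma_x": statistics.pstdev(values),  # population standard deviation
        "n": len(values),
        "minX": min(values),
        "maxX": max(values),
    }

summary = one_var_stats([60, 70, 80, 90, 95])
variance = summary["Sx"] ** 2                   # variance is Sx squared
data_range = summary["maxX"] - summary["minX"]  # range is maxX - minX
print(f"{summary['mean']:.1f} {summary['Sx']:.2f} {data_range}")  # 79.0 14.32 35
```

Passing a second list, as in `one_var_stats([0, 1, 2], [4, 3, 1])`, plays the role of entering frequencies in L2.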
Summary Table: Empirical Rule vs. Chebyshev's Theorem
| Rule | Within 1 SD | Within 2 SD | Within 3 SD |
|---|---|---|---|
| Empirical Rule (Normal) | 68% | 95% | 99.7% |
| Chebyshev's Theorem (Any) | -- | ≥75% | ≥88.9% |
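The two rows of the table can be regenerated numerically. The sketch below assumes a perfectly normal distribution for the first row and uses the standard identity that the proportion within k standard deviations equals erf(k/√2); the function names are illustrative.

```python
import math

def normal_coverage(k):
    """Proportion of a normal distribution within k standard deviations."""
    return math.erf(k / math.sqrt(2))

def chebyshev_bound(k):
    """Chebyshev lower bound 1 - 1/k^2, meaningful only for k > 1."""
    return 1 - 1 / k ** 2

for k in (1, 2, 3):
    cheb = f">= {chebyshev_bound(k):.1%}" if k > 1 else "--"
    print(f"k={k}: normal {normal_coverage(k):.1%}, Chebyshev {cheb}")
# k=1: normal 68.3%, Chebyshev --
# k=2: normal 95.4%, Chebyshev >= 75.0%
# k=3: normal 99.7%, Chebyshev >= 88.9%
```

The 68-95-99.7 figures are rounded versions of the normal values, while the Chebyshev figures are guaranteed minimums for any distribution.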
Use the correct formula for sample versus population variance: divide by n − 1 for a sample and by N for a population. The Empirical Rule is valid only for approximately normal (bell-shaped) distributions, while Chebyshev's Theorem applies to all distributions.