Measures of Variation in Statistics: Range, Standard Deviation, and Variance

Study Guide - Smart Notes

Describing, Exploring, and Comparing Data

Measures of Variation

Measures of variation are essential in statistics for understanding how data values differ from one another. The three primary measures of variation are range, standard deviation, and variance. These measures help interpret and understand the spread and consistency of data sets.

Key Concept

Variation is one of the most important topics in statistics because it quantifies the degree to which data values differ.
The focus is not only on calculating these measures but also on interpreting and understanding them.

Round-off Rule for Measures of Variation

When rounding the value of a measure of variation, carry one more decimal place than is present in the original set of data.

Range

The range is the simplest measure of variation, representing the difference between the largest and smallest data values.

Definition: The range of a set of data values is the difference between the maximum and minimum data values.
Formula: $\text{range} = (\text{maximum data value}) - (\text{minimum data value})$
Important Properties:
- Uses only the maximum and minimum values, making it sensitive to extreme values (not resistant).
- Does not account for all data values, so it may not reflect overall variation.
Example: For the wait times 50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20 minutes, the range is $75 - 20 = 55.0$ minutes.
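The range calculation can be sketched in a few lines of Python. This is only an illustration: the function name `data_range` is made up for this example, and the one-decimal formatting applies the round-off rule to whole-number data.

```python
# Wait times (minutes) copied from the example above.
wait_times = [50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20]


def data_range(values):
    """Return the range: maximum data value minus minimum data value."""
    return max(values) - min(values)


r = data_range(wait_times)

# Round-off rule: the data are whole numbers, so report one decimal place.
print(f"Range = {max(wait_times)} - {min(wait_times)} = {r:.1f} minutes")
# prints: Range = 75 - 20 = 55.0 minutes
```

Because only two values enter the calculation, changing the single largest or smallest wait time changes the range, while changing any other value does not.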

Standard Deviation

The standard deviation measures how much data values deviate from the mean. It is a more comprehensive measure of variation than the range.

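Notation:
- $s$ = sample standard deviation
- $\sigma$ = population standard deviation
Sample Standard Deviation Formula: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
Shortcut Formula: $s = \sqrt{\frac{n \sum x^2 - (\sum x)^2}{n(n - 1)}}$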
Properties:
- Measures deviation from the mean.
- Never negative; zero only if all values are identical.
- Larger values of $s$ indicate greater variation.
- Can increase dramatically with outliers.
- Units are the same as the original data.
- The sample standard deviation $s$ is a biased estimator of the population standard deviation $\sigma$.
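Example Calculation:
1. Compute the mean: $\bar{x} = \frac{\sum x}{n}$ (in minutes)
2. Subtract the mean from each value: $x - \bar{x}$
3. Square each deviation: $(x - \bar{x})^2$
4. Sum the squared deviations: $\sum (x - \bar{x})^2$
5. Divide by $n - 1$: $\frac{\sum (x - \bar{x})^2}{n - 1}$
6. Take the square root: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
7. Final answer: report $s$ in minutes, rounded to one more decimal place than the original data.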
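The steps above can be sketched in Python. The data set is an assumption: the notes do not state which values this example uses, so the sketch borrows the wait times listed in the Range section. The function names are illustrative, and the standard library's `statistics.stdev` serves only as a cross-check.

```python
import math
import statistics

# Assumed data: the wait times (minutes) from the Range section.
wait_times = [50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20]


def sample_sd(values):
    """Sample standard deviation, following the seven steps one by one."""
    n = len(values)
    mean = sum(values) / n                      # step 1: mean
    deviations = [x - mean for x in values]     # step 2: deviations from mean
    squared = [d ** 2 for d in deviations]      # step 3: squared deviations
    total = sum(squared)                        # step 4: sum of squares
    variance = total / (n - 1)                  # step 5: divide by n - 1
    return math.sqrt(variance)                  # step 6: square root


def sample_sd_shortcut(values):
    """Same quantity computed with the shortcut formula."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))


s = sample_sd(wait_times)

# Step 7: whole-number data, so report one decimal place.
print(f"s = {s:.1f} minutes")                   # prints: s = 16.6 minutes

# Both formulas and the standard library should agree.
print(math.isclose(s, sample_sd_shortcut(wait_times)))
print(math.isclose(s, statistics.stdev(wait_times)))
```

The shortcut formula avoids computing each deviation, which is convenient by hand, while the step-by-step version makes the meaning of each term easier to see.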

Range Rule of Thumb

The range rule of thumb is a simple method for interpreting standard deviation and identifying significant values.

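For many data sets, approximately 95% of sample values lie within 2 standard deviations of the mean.
Significantly low values: $\mu - 2\sigma$ or lower
Significantly high values: $\mu + 2\sigma$ or higher
Values not significant: Between $\mu - 2\sigma$ and $\mu + 2\sigma$
Estimating Standard Deviation: $s \approx \frac{\text{range}}{4}$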
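As a rough illustration, the range rule of thumb can be written as a few small Python helpers. The function names are hypothetical. The mean of 100 and standard deviation of 15 are the IQ figures used later in these notes, and the wait times are those from the Range section.

```python
def significance_limits(mean, sd):
    """Return (low, high): the limits 2 standard deviations from the mean."""
    return mean - 2 * sd, mean + 2 * sd


def classify(value, mean, sd):
    """Label a value using the range rule of thumb."""
    low, high = significance_limits(mean, sd)
    if value <= low:
        return "significantly low"
    if value >= high:
        return "significantly high"
    return "not significant"


def estimate_sd_from_range(values):
    """Crude estimate of the standard deviation: range divided by 4."""
    return (max(values) - min(values)) / 4


print(significance_limits(100, 15))    # prints: (70, 130)
print(classify(135, 100, 15))          # prints: significantly high
print(classify(100, 100, 15))          # prints: not significant

wait_times = [50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20]
print(estimate_sd_from_range(wait_times))   # prints: 13.75
```

The estimate of 13.75 minutes is only a rough approximation. It is useful as a quick sanity check rather than as a replacement for the full calculation.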

Standard Deviation of a Population

For a population, the formula for standard deviation differs from that of a sample.

Formula: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
Divide by the population size $N$ instead of $n - 1$.

Variance

Variance is the square of the standard deviation and provides another measure of data spread.

Sample variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Population variance: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
Properties:
- Units are the squares of the original units.
- Can increase with outliers (not resistant).
- Never negative; zero only if all values are identical.
- The sample variance $s^2$ is an unbiased estimator of the population variance $\sigma^2$.
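Python's standard `statistics` module has separate functions for the sample and population versions, which makes the difference in divisors easy to see. The sketch below reuses the wait times from the Range section. Treating them as an entire population in the second half is purely an assumption for illustration.

```python
import math
import statistics

wait_times = [50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20]  # minutes

# Sample versions: divide by n - 1.
s = statistics.stdev(wait_times)
s2 = statistics.variance(wait_times)

# Population versions: divide by N (assumes the list is the whole population).
sigma = statistics.pstdev(wait_times)
sigma2 = statistics.pvariance(wait_times)

print(f"s      = {s:.1f} min,  s^2     = {s2:.1f} min^2")
print(f"sigma  = {sigma:.1f} min,  sigma^2 = {sigma2:.1f} min^2")

# Variance is the square of the standard deviation in both cases.
print(math.isclose(s2, s ** 2), math.isclose(sigma2, sigma ** 2))
```

For the same data, dividing by $N$ always gives a smaller result than dividing by $n - 1$, so the population formulas produce the smaller pair of values here. The variance is in squared minutes, which is why the standard deviation is usually easier to interpret.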

Notation Summary

$s$ = sample standard deviation
$s^2$ = sample variance
$\sigma$ = population standard deviation
$\sigma^2$ = population variance

Why Divide by (n-1)?

Only $n - 1$ values can be assigned freely; the last value is determined by the mean.
Dividing by $n - 1$ makes sample variances center around the population variance $\sigma^2$.
Dividing by $n$ tends to underestimate the population variance.
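One way to see this is to list every possible sample from a very small population and average the results. The sketch below is a minimal illustration under stated assumptions: the population `[2, 3, 10]` is hypothetical, samples of size 2 are drawn with replacement so that all 9 are equally likely, and the helper name `sum_sq_dev` is invented for this example.

```python
import itertools
import math

# Hypothetical three-value population, chosen to keep the enumeration small.
population = [2, 3, 10]
N = len(population)
n = 2  # sample size

mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N   # population variance
sigma = math.sqrt(sigma2)

# All 9 equally likely samples of size 2, drawn with replacement.
samples = list(itertools.product(population, repeat=n))


def sum_sq_dev(sample):
    """Sum of squared deviations from the sample's own mean."""
    mean = sum(sample) / len(sample)
    return sum((x - mean) ** 2 for x in sample)


# Average each statistic over all possible samples.
mean_var_n_minus_1 = sum(sum_sq_dev(smp) / (n - 1) for smp in samples) / len(samples)
mean_var_n = sum(sum_sq_dev(smp) / n for smp in samples) / len(samples)
mean_sd = sum(math.sqrt(sum_sq_dev(smp) / (n - 1)) for smp in samples) / len(samples)

print(f"population variance sigma^2       = {sigma2:.3f}")
print(f"mean of variances, divisor n - 1  = {mean_var_n_minus_1:.3f}")
print(f"mean of variances, divisor n      = {mean_var_n:.3f}")
print(f"population sd sigma               = {sigma:.3f}")
print(f"mean of sample sds s              = {mean_sd:.3f}")
```

In this enumeration the variances computed with $n - 1$ average to exactly $\sigma^2$, the variances computed with $n$ average to a smaller number, and the sample standard deviations average to less than $\sigma$. That matches the statements that $s^2$ is unbiased while $s$ is biased.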

Empirical Rule for Bell-Shaped Distributions

The empirical rule applies to data sets with approximately normal (bell-shaped) distributions.

About 68% of values fall within 1 standard deviation of the mean.
About 95% of values fall within 2 standard deviations of the mean.
About 99.7% of values fall within 3 standard deviations of the mean.
Example: IQ scores with mean 100 and standard deviation 15:
- 2 standard deviations: $100 - 2(15) = 70$, $100 + 2(15) = 130$
- About 95% of IQ scores are between 70 and 130.
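The same arithmetic extends to 1 and 3 standard deviations. The sketch below computes all three intervals for the IQ example and, as an optional check, compares the rule's percentages with an exact normal distribution from the standard library's `statistics.NormalDist`. The function name `empirical_rule_intervals` is illustrative, and the check assumes the scores are exactly normal.

```python
from statistics import NormalDist


def empirical_rule_intervals(mean, sd):
    """Map k = 1, 2, 3 to (lower, upper, approximate percentage)."""
    approx = {1: 68.0, 2: 95.0, 3: 99.7}
    return {k: (mean - k * sd, mean + k * sd, pct) for k, pct in approx.items()}


intervals = empirical_rule_intervals(100, 15)
iq = NormalDist(mu=100, sigma=15)  # assumes an exactly normal distribution

for k, (low, high, pct) in intervals.items():
    exact = (iq.cdf(high) - iq.cdf(low)) * 100
    print(f"within {k} sd: {low} to {high}, about {pct}% (exact normal: {exact:.2f}%)")
```

The exact normal percentages differ slightly from 68, 95, and 99.7, which is why the rule is stated with the word "about".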

Chebyshev’s Theorem

Chebyshev’s theorem applies to any data set, regardless of distribution shape.

At least $1 - \frac{1}{k^2}$ of values lie within $k$ standard deviations of the mean, for $k > 1$.
For $k = 2$: At least 75% of values within 2 standard deviations.
For $k = 3$: At least 89% of values within 3 standard deviations.
Example: IQ scores with mean 100, standard deviation 15:
- At least 75% between 70 and 130.
- At least 89% between 55 and 145.
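The bound is simple enough to compute directly. The following sketch uses a hypothetical helper, `chebyshev_min_fraction`, and applies it to the IQ example above.

```python
def chebyshev_min_fraction(k):
    """Minimum fraction of values within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2


mean, sd = 100, 15  # IQ example from the notes

for k in (2, 3):
    fraction = chebyshev_min_fraction(k)
    low, high = mean - k * sd, mean + k * sd
    print(f"k = {k}: at least {fraction:.0%} of values between {low} and {high}")
# prints:
# k = 2: at least 75% of values between 70 and 130
# k = 3: at least 89% of values between 55 and 145
```

These are lower bounds that hold for any distribution, so they are weaker than the empirical rule's 95% and 99.7% figures, which apply only to bell-shaped data.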

Comparing Variation in Different Samples or Populations

The coefficient of variation (CV) expresses standard deviation relative to the mean, allowing comparison across different data sets.

Sample: $CV = \frac{s}{\bar{x}} \cdot 100\%$
Population: $CV = \frac{\sigma}{\mu} \cdot 100\%$
Round CV to one decimal place (e.g., 25.3%).
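A minimal sketch of the sample CV in Python, using the standard `statistics` module. The function name is illustrative, and the data are the wait times from the Range section, reused here only as a convenient example.

```python
import statistics


def coefficient_of_variation(values):
    """Sample CV as a percentage: (s / x-bar) * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100


wait_times = [50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20]  # minutes
cv = coefficient_of_variation(wait_times)

# Round the CV to one decimal place.
print(f"CV = {cv:.1f}%")   # prints: CV = 42.4%
```

Because the units cancel in the ratio, the resulting percentage can be compared with the CV of a data set measured in entirely different units.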

Biased and Unbiased Estimators

The sample standard deviation $s$ is a biased estimator of the population standard deviation $\sigma$.
The sample variance $s^2$ is an unbiased estimator of the population variance $\sigma^2$.

Summary Table: Measures of Variation

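Measure	Formula	Properties
Range	$\text{maximum} - \text{minimum}$	Sensitive to extremes, not resistant
Sample Standard Deviation ($s$)	$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$	Measures spread from mean, same units as data
Population Standard Deviation ($\sigma$)	$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$	Measures spread from mean, same units as data
Sample Variance ($s^2$)	$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$	Units squared, unbiased estimator
Population Variance ($\sigma^2$)	$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$	Units squared
Coefficient of Variation (CV)	$\frac{s}{\bar{x}} \cdot 100\%$ (sample), $\frac{\sigma}{\mu} \cdot 100\%$ (population)	Relative measure, percent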
