Measures of Variation in Descriptive Statistics

Describing, Exploring, and Comparing Data

Measures of Variation

Measures of variation are essential in statistics for understanding how data values spread or deviate from the center. This section focuses on three primary measures: range, standard deviation, and variance. Interpreting and understanding these measures is as important as computing them.

Key Concepts

Variation is the most important topic in statistics for describing how data values differ from each other.
Measures of variation help us interpret the reliability and consistency of data.

Round-off Rule for Measures of Variation

When rounding the value of a measure of variation, carry one more decimal place than is present in the original set of data.

Range

The range of a set of data values is the difference between the maximum and minimum data values.
Formula: $\text{range} = (\text{maximum data value}) - (\text{minimum data value})$
Important Properties:
- The range is very sensitive to extreme values (not resistant).
- It does not reflect the variation among all data values, as it only uses the two extreme values.
Example: For wait times (50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20): $\text{range} = 75 - 20 = 55.0$ minutes
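
As a quick check of the range example, here is a minimal Python sketch. The function name `data_range` is an illustrative choice, not part of the notes.

```python
# Wait times (minutes) from the example above.
wait_times = [50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20]

def data_range(values):
    """Return the range: maximum data value minus minimum data value."""
    return max(values) - min(values)

r = data_range(wait_times)
# Round-off rule: the data are whole numbers, so report one decimal place.
print(f"range = {r:.1f} minutes")  # range = 55.0 minutes
```

Because only the maximum and minimum enter the calculation, a single extreme value changes the result directly, which is why the range is not resistant.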

Standard Deviation

The standard deviation of a set of values measures how much data values deviate from the mean.
Notation:
- $s$ = sample standard deviation
- $\sigma$ = population standard deviation
Sample Standard Deviation Formula: $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
Important Properties:
- Measures deviation from the mean.
- Never negative; zero only if all values are identical.
- Larger $s$ indicates greater variation.
- Can increase dramatically with outliers.
- Units are the same as the original data.
- $s$ is a biased estimator of $\sigma$ (values of $s$ do not center around $\sigma$).
Example Calculation:
1. Compute mean: $\bar{x} = 430 / 11 \approx 39.1$ min
2. Subtract mean from each value: 10.9, -14.1, 35.9, -4.1, 10.9, -14.1, -9.1, 10.9, 5.9, -14.1, -19.1
3. Square each deviation: 118.81, 198.81, 1288.81, 16.81, 118.81, 198.81, 82.81, 118.81, 34.81, 198.81, 364.81
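4. Sum squared deviations: $\sum (x - \bar{x})^2 = 2740.91$
5. Divide by $n - 1 = 10$: $2740.91 / 10 = 274.091$
6. Take square root: $s = \sqrt{274.091} \approx 16.6$ minutes
Shortcut Formula: $s = \sqrt{\dfrac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)}}$
Example: $s = \sqrt{\dfrac{11(19550) - (430)^2}{11(10)}} = \sqrt{\dfrac{30150}{110}} \approx 16.6$ minutes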
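
The step-by-step calculation and the shortcut formula can be sketched in Python as follows. The function names are illustrative assumptions. The code keeps the mean unrounded, so intermediate values differ slightly from the hand calculation that uses 39.1, but the final rounded answer agrees.

```python
import math
import statistics

# Wait times (minutes) from the example above.
wait_times = [50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20]

def sample_sd_definition(values):
    """Sample standard deviation from deviations about the mean."""
    n = len(values)
    mean = sum(values) / n                           # step 1
    squared_devs = [(x - mean) ** 2 for x in values] # steps 2-3
    return math.sqrt(sum(squared_devs) / (n - 1))    # steps 4-6

def sample_sd_shortcut(values):
    """Sample standard deviation from the sums of x and x squared."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))

s_def = sample_sd_definition(wait_times)
s_short = sample_sd_shortcut(wait_times)
print(round(s_def, 1), round(s_short, 1))  # 16.6 16.6

# The standard library's statistics.stdev also divides by n - 1.
print(round(statistics.stdev(wait_times), 1))  # 16.6
```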

Range Rule of Thumb

A simple tool for interpreting standard deviation: for many data sets, the vast majority (about 95%) of sample values lie within 2 standard deviations of the mean.
Identifying Significant Values:
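- Significantly low: $\mu - 2\sigma$ or lower
- Significantly high: $\mu + 2\sigma$ or higher
- Not significant: between $\mu - 2\sigma$ and $\mu + 2\sigma$
Estimating Standard Deviation: $s \approx \dfrac{\text{range}}{4}$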
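
The range rule of thumb can be sketched in Python as below. The function names are hypothetical. The mean of 100 and standard deviation of 15 are the IQ values used elsewhere in these notes, and the rough estimate of the standard deviation uses range divided by 4.

```python
def significance_limits(mean, sd):
    """Return (low, high) cutoffs at 2 standard deviations from the mean."""
    return mean - 2 * sd, mean + 2 * sd

def classify(value, mean, sd):
    """Label a value using the range rule of thumb."""
    low, high = significance_limits(mean, sd)
    if value <= low:
        return "significantly low"
    if value >= high:
        return "significantly high"
    return "not significant"

def estimate_sd_from_range(values):
    """Rough estimate of the standard deviation: range / 4."""
    return (max(values) - min(values)) / 4

print(significance_limits(100, 15))  # (70, 130)
print(classify(135, 100, 15))        # significantly high
print(classify(100, 100, 15))        # not significant

wait_times = [50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20]
print(estimate_sd_from_range(wait_times))  # 13.75
```

For the wait times, the rough estimate of 13.75 minutes is in the same neighborhood as the computed sample standard deviation of about 16.6 minutes, which is all the rule is meant to provide.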

Standard Deviation of a Population

For a population, divide by $N$ (population size) instead of $n - 1$.
Formula: $\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$
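
To see the effect of the divisor, the sketch below treats the 11 wait times as if they were an entire population. That is an assumption made purely for illustration. The function name `population_sd` is also illustrative.

```python
import math
import statistics

# Assumption for illustration: treat these 11 values as a whole population.
wait_times = [50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20]

def population_sd(values):
    """Population standard deviation: divide by N, not n - 1."""
    N = len(values)
    mu = sum(values) / N
    return math.sqrt(sum((x - mu) ** 2 for x in values) / N)

sigma = population_sd(wait_times)
s = statistics.stdev(wait_times)  # sample version, divides by n - 1

print(round(sigma, 1))  # 15.8
print(round(s, 1))      # 16.6
```

Dividing by the larger number $N$ always gives a result no larger than the sample formula applied to the same values. The standard library's `statistics.pstdev` computes the same population quantity.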

Variance

The variance is the square of the standard deviation.
Sample variance: $s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$
Population variance: $\sigma^2 = \dfrac{\sum (x - \mu)^2}{N}$
Important Properties:
- Units are the squares of the original data units.
- Not resistant to outliers.
- Never negative; zero only if all values are the same.
- Sample variance $s^2$ is an unbiased estimator of population variance $\sigma^2$.

Why Divide by (n-1)?

With a given mean, only $n - 1$ values can be assigned freely; the last value is determined by the mean.
Dividing by $n - 1$ makes $s^2$ an unbiased estimator of $\sigma^2$; dividing by $n$ would tend to underestimate the population variance.
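
A small enumeration makes the point concrete. The sketch below uses a hypothetical three-value population (1, 2, 5), chosen only for illustration and not taken from the notes. It lists every possible sample of size 2 drawn with replacement, then averages the sample variances computed with each divisor. Exact fractions avoid rounding noise.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 5]  # hypothetical population, for illustration only
n = 2                   # sample size

def variance(values, divisor):
    """Sum of squared deviations from the mean, divided by `divisor`."""
    mean = Fraction(sum(values), len(values))
    return sum((x - mean) ** 2 for x in values) / divisor

# Population variance divides by N.
pop_var = variance(population, len(population))

# All 9 equally likely samples of size 2, drawn with replacement.
samples = list(product(population, repeat=n))

mean_var_n_minus_1 = sum(variance(s, n - 1) for s in samples) / len(samples)
mean_var_n = sum(variance(s, n) for s in samples) / len(samples)

print(pop_var, mean_var_n_minus_1, mean_var_n)  # 26/9 26/9 13/9
```

Averaged over all samples, the $n - 1$ version equals the population variance exactly, while the version that divides by $n$ comes out at half of it in this small case.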

Empirical Rule for Data with a Bell-Shaped Distribution

For approximately bell-shaped distributions:
- About 68% of values fall within 1 standard deviation of the mean.
- About 95% within 2 standard deviations.
- About 99.7% within 3 standard deviations.
Example: For IQ scores with mean 100 and standard deviation 15, going 2 standard deviations below and above the mean gives $100 - 2(15) = 70$ to $100 + 2(15) = 130$. About 95% of IQ scores are between 70 and 130.
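
The intervals for all three levels of the empirical rule can be tabulated with a short Python sketch. The names `EMPIRICAL_RULE` and `empirical_intervals` are illustrative assumptions. The percentages apply only when the distribution is approximately bell-shaped.

```python
# Approximate percentages from the empirical rule, keyed by the
# number of standard deviations from the mean.
EMPIRICAL_RULE = {1: 68.0, 2: 95.0, 3: 99.7}

def empirical_intervals(mean, sd):
    """Return {k: (low, high, percent)} for k = 1, 2, 3."""
    return {
        k: (mean - k * sd, mean + k * sd, pct)
        for k, pct in EMPIRICAL_RULE.items()
    }

for k, (low, high, pct) in empirical_intervals(100, 15).items():
    print(f"about {pct}% of values lie between {low} and {high}")
# about 68.0% of values lie between 85 and 115
# about 95.0% of values lie between 70 and 130
# about 99.7% of values lie between 55 and 145
```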

Chebyshev’s Theorem

For any data set (not necessarily bell-shaped), the proportion of values within $K$ standard deviations of the mean is at least $1 - 1/K^2$, for $K > 1$.
For $K = 2$: at least $1 - 1/2^2 = 3/4$ (75%) of values within 2 standard deviations.
For $K = 3$: at least $1 - 1/3^2 = 8/9$ (about 89%) of values within 3 standard deviations.
Example: IQ scores, mean 100, standard deviation 15: At least 75% between 70 and 130; at least 89% between 55 and 145.
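
Chebyshev's bound of at least $1 - 1/K^2$ is easy to evaluate for any $K > 1$, including non-integer values. The following is a minimal sketch with illustrative function names.

```python
def chebyshev_min_proportion(k):
    """Minimum proportion of values within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2

def chebyshev_interval(mean, sd, k):
    """Interval from mean - k*sd to mean + k*sd."""
    return mean - k * sd, mean + k * sd

for k in (2, 3):
    low, high = chebyshev_interval(100, 15, k)
    share = chebyshev_min_proportion(k)
    print(f"at least {share:.0%} of values between {low} and {high}")
# at least 75% of values between 70 and 130
# at least 89% of values between 55 and 145
```

Compared with the empirical rule, these bounds are weaker (75% versus about 95% within 2 standard deviations) because they hold for any distribution, not only bell-shaped ones.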

Comparing Variation in Different Samples or Populations

Coefficient of Variation (CV): Expresses standard deviation as a percent of the mean, allowing comparison between data sets with different units or means.
Formulas:
- Sample: $CV = \dfrac{s}{\bar{x}} \cdot 100\%$
- Population: $CV = \dfrac{\sigma}{\mu} \cdot 100\%$
Round CV to one decimal place (e.g., 25.3%).
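
The sketch below computes the sample CV for the wait times and then for the same times converted to seconds. The conversion is an illustrative assumption, used to show that the CV does not depend on the unit of measurement. The function name is likewise illustrative.

```python
import statistics

def coefficient_of_variation(values):
    """Sample CV: standard deviation as a percent of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100

wait_minutes = [50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20]
wait_seconds = [60 * x for x in wait_minutes]  # same data, different unit

cv_minutes = coefficient_of_variation(wait_minutes)
cv_seconds = coefficient_of_variation(wait_seconds)

print(f"{cv_minutes:.1f}%")  # 42.4%
print(f"{cv_seconds:.1f}%")  # 42.4%
```

Changing the unit multiplies both the standard deviation and the mean by the same factor, so the ratio is unchanged. This is what makes the CV suitable for comparing data sets measured on different scales.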

Biased and Unbiased Estimators

Sample standard deviation $s$ is a biased estimator of population standard deviation $\sigma$.
Sample variance $s^2$ is an unbiased estimator of population variance $\sigma^2$.