Chapter 3: Numerically Summarizing Data – Measures of Central Tendency, Dispersion, and Position

Study Guide - Smart Notes


Measures of Central Tendency

Arithmetic Mean

The arithmetic mean is a measure of central tendency that represents the average value of a variable. It is calculated by summing all values and dividing by the number of observations. The population mean, denoted by $\mu$, is computed using all individuals in a population, while the sample mean, denoted by $\bar{x}$, is computed using sample data.

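Population Mean Formula: $\mu = \dfrac{\sum x_i}{N}$, where $N$ is the population size.
Sample Mean Formula: $\bar{x} = \dfrac{\sum x_i}{n}$, where $n$ is the sample size.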

Example: Travel times (in minutes) for seven employees: 23, 36, 23, 18, 5, 26, 43.
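
The mean of these seven travel times can be checked with a short Python sketch. The variable names are illustrative only, and the same arithmetic applies whether the seven values are treated as a population or as a sample.

```python
from statistics import mean

# Travel times (minutes) for the seven employees in the example.
travel_times = [23, 36, 23, 18, 5, 26, 43]

# Sum of the values divided by the number of observations.
total = sum(travel_times)
count = len(travel_times)
mean_time = total / count

print(f"sum = {total}, count = {count}, mean = {mean_time:.2f}")
# prints: sum = 174, count = 7, mean = 24.86

# The standard-library helper should agree with the manual calculation.
library_mean = mean(travel_times)
```

The mean works out to 174 / 7, or about 24.86 minutes.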


Median

The median is the value that lies in the middle of the data when arranged in ascending order. It is denoted by $M$.

Arrange data in ascending order.
If the number of observations is odd, the median is the middle value.
If even, the median is the average of the two middle values.

Example: Median for 23, 36, 23, 18, 5, 26, 43 (odd number of observations).
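
The three steps above can be sketched in Python as follows. The function name `median_of` and the even-sized list are hypothetical, chosen only to show both branches.

```python
def median_of(values):
    """Return the median using the sort-then-pick-the-middle procedure."""
    ordered = sorted(values)          # step 1: ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: the single middle value
        return ordered[mid]
    # even count: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

travel_times = [23, 36, 23, 18, 5, 26, 43]
median_time = median_of(travel_times)

# Hypothetical even-sized data set, used only to exercise the second branch.
even_example = median_of([5, 18, 23, 26])
```

Sorted, the travel times are 5, 18, 23, 23, 26, 36, 43, so the fourth value, 23, is the median.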


Mode

The mode is the most frequent observation in a data set. A data set may have no mode, one mode, or multiple modes.

Tally the frequency of each data value.
The value with the highest frequency is the mode.
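
A minimal sketch of the tallying procedure, using the travel-time data from earlier in the chapter. The function name `modes_of` is hypothetical, and the sketch assumes the convention that a data set in which no value repeats has no mode.

```python
from collections import Counter

def modes_of(values):
    """Return a list of modes; an empty list means there is no mode."""
    counts = Counter(values)              # tally the frequency of each value
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:                      # assumed convention: no repeats, no mode
        return []
    return [value for value, freq in counts.items() if freq == highest]

travel_times = [23, 36, 23, 18, 5, 26, 43]
travel_modes = modes_of(travel_times)

# Hypothetical data sets illustrating multiple modes and no mode.
two_modes = modes_of([1, 1, 2, 2, 3])
no_mode = modes_of([1, 2, 3])
```

For the travel times, 23 is the only value that occurs more than once, so it is the single mode.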


Relation Between Mean, Median, and Distribution Shape

The relationship between the mean and median can indicate the shape of a distribution:

Distribution Shape	Mean vs Median
Skewed left	Mean substantially smaller than median
Symmetric	Mean roughly equal to median
Skewed right	Mean substantially larger than median


Measures of Dispersion

Range

The range ($R$) is the difference between the largest and smallest data values.

Formula: $R = \text{largest data value} - \text{smallest data value}$

Example: Range for 23, 36, 23, 18, 5, 26, 43.
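
The range for this example can be confirmed with a few lines of Python (variable names are illustrative):

```python
travel_times = [23, 36, 23, 18, 5, 26, 43]

# Range = largest value minus smallest value.
largest = max(travel_times)
smallest = min(travel_times)
data_range = largest - smallest

print(f"range = {largest} - {smallest} = {data_range}")
# prints: range = 43 - 5 = 38
```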


Standard Deviation

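The standard deviation measures the spread of data values around the mean. For a population, it is the square root of the mean of the squared deviations from the mean; for a sample, the sum of squared deviations is divided by $n - 1$ rather than $n$ before taking the square root.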

Population Standard Deviation: $\sigma = \sqrt{\dfrac{\sum (x_i - \mu)^2}{N}}$
Sample Standard Deviation: $s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}}$
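
To see how the two divisors differ in practice, the sketch below applies both versions to the travel-time data from earlier in the chapter. Treating the same seven values first as a population and then as a sample is an assumption made purely for comparison.

```python
from statistics import pstdev, stdev

travel_times = [23, 36, 23, 18, 5, 26, 43]

# Population standard deviation: sum of squared deviations divided by N.
sigma = pstdev(travel_times)

# Sample standard deviation: same sum divided by n - 1 instead.
s = stdev(travel_times)

print(f"population sd = {sigma:.2f}, sample sd = {s:.2f}")
# prints: population sd = 11.36, sample sd = 12.27
```

The sample version is always the larger of the two for the same data, because it divides the same sum of squares by a smaller number.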


Degrees of Freedom

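Degrees of freedom refer to the number of values that are free to vary in a calculation. For the sample standard deviation, $n - 1$ is used as the divisor because the deviations from the sample mean must sum to zero, so once $n - 1$ of them are known, the last one is determined by the others.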


Variance

The variance is the square of the standard deviation. The population variance is $\sigma^2$ and the sample variance is $s^2$.


Empirical Rule for Bell-Shaped Data

The Empirical Rule describes the spread of data in a normal (bell-shaped) distribution:

Approximately 68% of data within 1 standard deviation of the mean
Approximately 95% within 2 standard deviations
Approximately 99.7% within 3 standard deviations
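
The three intervals can be computed directly from a mean and standard deviation. The sketch below uses hypothetical values (mean 100, standard deviation 15) that do not come from the chapter, and the function name is illustrative.

```python
def empirical_rule_intervals(mean, sd):
    """Return the intervals expected to hold about 68%, 95%, and 99.7% of
    the data, assuming the distribution is roughly bell-shaped."""
    return {
        "68%": (mean - 1 * sd, mean + 1 * sd),
        "95%": (mean - 2 * sd, mean + 2 * sd),
        "99.7%": (mean - 3 * sd, mean + 3 * sd),
    }

# Hypothetical bell-shaped variable with mean 100 and standard deviation 15.
intervals = empirical_rule_intervals(100, 15)
for coverage, (low, high) in intervals.items():
    print(f"about {coverage} of values between {low} and {high}")
```

With these assumed values, about 95% of observations would be expected to fall between 70 and 130.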


Chebyshev's Inequality

Chebyshev's Inequality applies to any data set, stating that at least $\left(1 - \dfrac{1}{k^2}\right) \cdot 100\%$ of observations lie within $k$ standard deviations of the mean, for any $k > 1$.
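
A short sketch of the bound for a few values of k. The function name is hypothetical; the formula is the standard Chebyshev lower bound of 1 - 1/k².

```python
def chebyshev_min_fraction(k):
    """Minimum fraction of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's Inequality requires k > 1")
    return 1 - 1 / k ** 2

for k in (1.5, 2, 3):
    print(f"k = {k}: at least {chebyshev_min_fraction(k):.1%} of observations")
```

For k = 2 the bound is 75%, noticeably weaker than the 95% given by the Empirical Rule, because Chebyshev's Inequality makes no assumption about the shape of the distribution.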


Grouped Data

Approximating the Mean from Grouped Data

When data is grouped, the mean can be approximated using class midpoints and frequencies:

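Population Mean: $\mu = \dfrac{\sum x_i f_i}{\sum f_i}$
Sample Mean: $\bar{x} = \dfrac{\sum x_i f_i}{\sum f_i}$
where $x_i$ is the midpoint of the $i$th class and $f_i$ is its frequency.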
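
As an illustration, the sketch below approximates the mean of a small frequency table. The class midpoints and frequencies are hypothetical, invented only to demonstrate the calculation.

```python
# Hypothetical frequency table: class midpoints and their frequencies.
midpoints = [5, 15, 25]
frequencies = [2, 5, 3]

# Each midpoint stands in for every observation in its class.
weighted_sum = sum(x * f for x, f in zip(midpoints, frequencies))
total_frequency = sum(frequencies)
grouped_mean = weighted_sum / total_frequency

print(f"approximate mean = {weighted_sum} / {total_frequency} = {grouped_mean}")
# prints: approximate mean = 160 / 10 = 16.0
```

The result is only an approximation, since the individual values inside each class are unknown.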


Approximating the Standard Deviation from Grouped Data

Standard deviation for grouped data is approximated using class midpoints and frequencies:

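Population Standard Deviation: $\sigma = \sqrt{\dfrac{\sum (x_i - \mu)^2 f_i}{\sum f_i}}$
Sample Standard Deviation: $s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2 f_i}{\left(\sum f_i\right) - 1}}$
where $x_i$ is the midpoint of the $i$th class and $f_i$ is its frequency.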
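
A minimal sketch of both approximations, again on a hypothetical frequency table (midpoints 5, 15, 25 with frequencies 2, 5, 3) that is not taken from the chapter.

```python
from math import sqrt

# Hypothetical frequency table: class midpoints and their frequencies.
midpoints = [5, 15, 25]
frequencies = [2, 5, 3]

total_frequency = sum(frequencies)
grouped_mean = sum(x * f for x, f in zip(midpoints, frequencies)) / total_frequency

# Squared deviation of each midpoint from the mean, weighted by frequency.
weighted_squares = sum((x - grouped_mean) ** 2 * f
                       for x, f in zip(midpoints, frequencies))

sigma_grouped = sqrt(weighted_squares / total_frequency)        # population form
s_grouped = sqrt(weighted_squares / (total_frequency - 1))      # sample form

print(f"population sd = {sigma_grouped:.2f}, sample sd = {s_grouped:.2f}")
# prints: population sd = 7.00, sample sd = 7.38
```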


Weighted Mean

The weighted mean is calculated by multiplying each value by its weight, summing these products, and dividing by the sum of the weights:

Formula: $\bar{x}_w = \dfrac{\sum w_i x_i}{\sum w_i}$, where $w_i$ is the weight of the $i$th value $x_i$.
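
One common use is a grade point average, where each course grade is weighted by its credit hours. The grades and credits below are hypothetical, as is the function name.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Hypothetical course grades (on a 4-point scale) and credit hours.
grades = [4.0, 3.0, 2.0]
credits = [3, 4, 3]

gpa = weighted_mean(grades, credits)
print(f"weighted mean = {gpa}")
# prints: weighted mean = 3.0
```

Here the numerator is 12 + 12 + 6 = 30 and the weights sum to 10, giving 3.0.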


Measures of Position

z-Scores

A z-score measures the distance of a data value from the mean in terms of standard deviations. It standardizes values for comparison.

Population z-score: $z = \dfrac{x - \mu}{\sigma}$
Sample z-score: $z = \dfrac{x - \bar{x}}{s}$
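
The sketch below computes z-scores for the longest and shortest travel times in the chapter's example data. Treating the seven values as a population is an assumption made for illustration; the function name is hypothetical.

```python
from statistics import mean, pstdev

def z_score(x, center, spread):
    """Number of standard deviations that x lies from the mean."""
    return (x - center) / spread

travel_times = [23, 36, 23, 18, 5, 26, 43]
mu = mean(travel_times)          # population mean (assumed population)
sigma = pstdev(travel_times)     # population standard deviation

z_longest = z_score(43, mu, sigma)
z_shortest = z_score(5, mu, sigma)

print(f"z(43) = {z_longest:.2f}, z(5) = {z_shortest:.2f}")
```

A positive z-score means the value lies above the mean and a negative one means it lies below, which is what makes z-scores comparable across data sets measured on different scales.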


Quartiles

Quartiles divide data into four equal parts:

$Q_1$: 25th percentile
$Q_2$: 50th percentile (median)
$Q_3$: 75th percentile
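
Several conventions exist for computing quartiles, and software packages do not all agree. The sketch below assumes one common textbook convention: $Q_2$ is the median, and $Q_1$ and $Q_3$ are the medians of the lower and upper halves, with the overall median left out of both halves when the number of observations is odd. The function name is hypothetical.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention (assumed)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Skip the middle value when n is odd so it belongs to neither half.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(lower), median(ordered), median(upper)

travel_times = [23, 36, 23, 18, 5, 26, 43]
q1, q2, q3 = quartiles(travel_times)
print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}")
# prints: Q1 = 18, Q2 = 23, Q3 = 36
```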


Interquartile Range (IQR)

The interquartile range (IQR) is the range of the middle 50% of observations:

Formula: $\text{IQR} = Q_3 - Q_1$


Outliers

Outliers are extreme observations. They are identified using quartiles and the IQR:

Lower fence: $Q_1 - 1.5(\text{IQR})$
Upper fence: $Q_3 + 1.5(\text{IQR})$
Values outside these fences are considered outliers.
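
A minimal sketch of the fence check, applied to the travel-time data from earlier in the chapter. The quartile values $Q_1 = 18$ and $Q_3 = 36$ are assumed here, obtained under the median-of-halves convention; other quartile conventions may give slightly different fences. Function names are hypothetical.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) from the first and third quartiles."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values, q1, q3):
    """Return the values that fall outside the fences."""
    lower, upper = fences(q1, q3)
    return [v for v in values if v < lower or v > upper]

travel_times = [23, 36, 23, 18, 5, 26, 43]
lower_fence, upper_fence = fences(18, 36)       # assumed Q1 = 18, Q3 = 36
outliers = find_outliers(travel_times, 18, 36)

print(f"fences: {lower_fence} to {upper_fence}, outliers: {outliers}")
# prints: fences: -9.0 to 63.0, outliers: []
```

With an IQR of 18, the fences sit at -9 and 63, so none of the seven travel times is flagged as an outlier.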


Five-Number Summary and Boxplots

Five-Number Summary

The five-number summary consists of the minimum, $Q_1$, median ($M$), $Q_3$, and maximum values.


Boxplots

Boxplots visually display the five-number summary and identify outliers. Steps to construct a boxplot:

Determine lower and upper fences
Draw a box from $Q_1$ to $Q_3$
Draw lines (whiskers) to minimum and maximum values within fences
Mark outliers with asterisks


Summary Table: Distribution Shape and Measures

Boxplots and quartiles can be used to describe the shape of a distribution:

Skewed right: Median < Mean
Symmetric: Median ≈ Mean
Skewed left: Median > Mean
