Measures of Center: Describing, Exploring, and Comparing Data
Study Guide - Smart Notes
Describing, Exploring, and Comparing Data
Introduction
This section introduces the concept of measures of center, which are values that represent the middle or center of a data set. Understanding these measures is fundamental for summarizing and interpreting data in statistics. The main measures of center discussed are the mean, median, mode, and midrange.
Measures of Center
Definition
Measure of Center: A value that indicates the center or middle of a data set.
Mean (Arithmetic Mean)
The mean is the most commonly used measure of center and is calculated by summing all data values and dividing by the number of values.
Formula:
$$\text{mean} = \frac{\text{sum of all data values}}{\text{number of data values}}$$
If the data are a sample, the mean is denoted by $\bar{x}$ ("x-bar").
If the data are the entire population, the mean is denoted by $\mu$ (Greek letter mu).
Notation:
$\sum$ denotes the sum of a set of data values.
$x$ represents individual data values.
$n$ is the number of data values in a sample.
$N$ is the number of data values in a population.
Sample Mean: $\bar{x} = \frac{\sum x}{n}$
Population Mean: $\mu = \frac{\sum x}{N}$
Caution: Statisticians avoid the term "average" for the mean because it is ambiguous: it can also refer to other measures of center.
Properties of the Mean:
Uses every data value in the calculation.
Sample means from the same population tend to vary less than other measures of center.
Not resistant: A single extreme value (outlier) can significantly affect the mean.
Resistant Statistic: A statistic is resistant if extreme values (outliers) do not cause it to change much. The mean is not resistant.
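Example: For the data set 50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20, the mean is:
$$\bar{x} = \frac{\sum x}{n} = \frac{50 + 25 + 75 + 35 + 50 + 25 + 30 + 50 + 45 + 25 + 20}{11} = \frac{430}{11} \approx 39.1 \text{ minutes}$$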
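The same calculation can be sketched in a few lines of Python. The variable names (`data`, `total`, `n`, `x_bar`) are chosen here for illustration and are not part of the original notes.

```python
# Data values from the example above (minutes).
data = [50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20]

total = sum(data)   # sum of all data values: 430
n = len(data)       # number of data values: 11
x_bar = total / n   # sample mean

print(round(x_bar, 1))  # 39.1
```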
Median
The median is the middle value of a data set when the values are arranged in order. It is a resistant measure of center.
Calculation:
If the number of data values is odd, the median is the middle value.
If the number of data values is even, the median is the mean of the two middle values.
Notation: Sometimes denoted by $\tilde{x}$ ("x-tilde"), $M$, or Med.
Properties of the Median:
Resistant to outliers: Adding extreme values does not change the median much.
Does not use every data value directly.
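Resistance can be checked numerically. The sketch below takes the 11 values used in this section's examples and appends one extreme value (500, a hypothetical outlier invented purely for illustration), then compares how far the mean and the median move. It relies on Python's standard `statistics` module.

```python
from statistics import mean, median

# Data values used in this section's examples (minutes).
data = [50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20]

# Hypothetical outlier, added only to illustrate resistance.
with_outlier = data + [500]

mean_shift = mean(with_outlier) - mean(data)
median_shift = median(with_outlier) - median(data)

print(f"mean:   {mean(data):.1f} -> {mean(with_outlier):.1f}")      # mean:   39.1 -> 77.5
print(f"median: {median(data):.1f} -> {median(with_outlier):.1f}")  # median: 35.0 -> 40.0
```

Under these assumptions the mean nearly doubles, while the median moves by only 5 minutes.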
Example (Odd Number of Values): For the data set 50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20 (sorted: 20, 25, 25, 25, 30, 35, 45, 50, 50, 50, 75), the median is 35.0 minutes.
Example (Even Number of Values): For the data set 50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20, 50 (sorted: 20, 25, 25, 25, 30, 35, 45, 50, 50, 50, 50, 75), the median is:
$$\text{median} = \frac{35 + 45}{2} = 40.0 \text{ minutes}$$
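The odd/even rule can be written out directly. The function below is a minimal sketch (the name `median_of` is an illustrative choice); in practice Python's built-in `statistics.median` applies the same rule.

```python
def median_of(values):
    """Return the median: the middle value, or the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value.
        return float(ordered[mid])
    # Even count: mean of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_data = [50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20]
even_data = odd_data + [50]

print(median_of(odd_data))   # 35.0
print(median_of(even_data))  # 40.0
```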
Mode
The mode is the value(s) that occur(s) with the greatest frequency in a data set. It can be used with both quantitative and qualitative data.
A data set can have:
No mode (no value repeats)
One mode (unimodal)
Two modes (bimodal)
More than two modes (multimodal)
Example: For the data set 35, 35, 20, 50, 95, 75, 45, 50, 30, 35, 30 (sorted: 20, 30, 30, 35, 35, 35, 45, 50, 50, 75, 95), the mode is 35 (occurs three times).
Other Examples:
Two modes: 30, 30, 50, 50, 75 (modes: 30 and 50)
No mode: 20, 30, 35, 50, 75 (no value repeats)
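The three cases above (one mode, two modes, no mode) can be reproduced with a short sketch. The function name `modes` is an illustrative choice. It returns an empty list when no value repeats, which follows the "no mode" convention used in these notes. Note that Python's `statistics.multimode` would instead return every value in that case.

```python
from collections import Counter

def modes(values):
    """Return a sorted list of the modes; empty when no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []  # no value repeats, so there is no mode
    return sorted(v for v, c in counts.items() if c == top)

print(modes([35, 35, 20, 50, 95, 75, 45, 50, 30, 35, 30]))  # [35]
print(modes([30, 30, 50, 50, 75]))                          # [30, 50]
print(modes([20, 30, 35, 50, 75]))                          # []
```

Because the function only counts occurrences, it also works for qualitative data such as a list of color names.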
Midrange
The midrange is the value midway between the maximum and minimum values in a data set. It is calculated as:
$$\text{midrange} = \frac{\text{maximum data value} + \text{minimum data value}}{2}$$
Properties of the Midrange:
Very sensitive to extreme values (not resistant).
Rarely used in practice, but easy to compute.
Helps illustrate that there are multiple ways to define the center of a data set.
Sometimes confused with the median, so clear definitions are important.
Example: For the data set 50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20:
$$\text{midrange} = \frac{75 + 20}{2} = 47.5 \text{ minutes}$$
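As a quick check, the midrange needs only the largest and smallest values. A minimal sketch (the function name `midrange` is an illustrative choice):

```python
def midrange(values):
    """Return the value midway between the maximum and the minimum."""
    return (max(values) + min(values)) / 2

data = [50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20]
print(midrange(data))  # 47.5
```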
Round-Off Rules for Measures of Center
For the mean, median, and midrange, carry one more decimal place than is present in the original data values.
For the mode, leave the value as is (do not round).
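The round-off rule can be sketched as a small helper. The name `round_center` and its `data_decimals` parameter are hypothetical, introduced only for this illustration. The caller states how many decimal places the original data carry. Be aware that Python's built-in `round` works on binary floating-point values and rounds exact ties to the nearest even digit, so borderline cases may differ from rounding by hand.

```python
def round_center(value, data_decimals):
    """Round a mean, median, or midrange to one more decimal place than the data."""
    return round(value, data_decimals + 1)

# The example data are whole numbers, so data_decimals is 0.
data = [50, 25, 75, 35, 50, 25, 30, 50, 45, 25, 20]
mean_value = sum(data) / len(data)

print(round_center(mean_value, 0))  # 39.1
```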
Critical Thinking: When Measures of Center Are Not Meaningful
It is important to consider whether calculating a measure of center makes sense for a given data set. Situations where the mean or median is not meaningful include:
Zip codes (just labels, not measurements)
Ranks (reflect order, not quantity)
Jersey numbers (identifiers, not measurements)
Top 5 CEO compensations (not representative of the population)
Means of means (e.g., mean ages of states do not represent the mean age of the entire country without considering population sizes)
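The "means of means" pitfall is easy to see with numbers. The sketch below uses two hypothetical states with invented mean ages and population sizes. Averaging the two state means ignores the fact that one state is far larger than the other.

```python
# Hypothetical data, invented for illustration: state -> (mean age, population).
states = {
    "State A": (30.0, 1_000_000),
    "State B": (40.0, 9_000_000),
}

# Naive "mean of means": treats both states as equally large.
mean_of_means = sum(age for age, _ in states.values()) / len(states)

# Overall mean age: each state's mean is weighted by its population.
total_population = sum(pop for _, pop in states.values())
overall_mean = sum(age * pop for age, pop in states.values()) / total_population

print(mean_of_means)  # 35.0
print(overall_mean)   # 39.0
```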
Calculating the Mean from a Frequency Distribution
When data are summarized in a frequency distribution, the mean can be approximated by multiplying each class midpoint by its frequency, summing these products, and dividing by the total frequency:
$$\bar{x} = \frac{\sum (f \cdot x)}{\sum f}$$
where $f$ is the class frequency and $x$ is the class midpoint.
Note: This method provides an approximation because it uses class midpoints instead of actual data values.
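To see the approximation at work, the sketch below groups the 11 example values from this section into classes of width 10 (20-29, 30-39, and so on). These class limits are an assumption made here for illustration and do not come from the original notes.

```python
# Assumed classes of width 10 for the data
# 20, 25, 25, 25, 30, 35, 45, 50, 50, 50, 75 (minutes).
# Each entry is (class midpoint x, frequency f).
classes = [
    (24.5, 4),  # 20-29
    (34.5, 2),  # 30-39
    (44.5, 1),  # 40-49
    (54.5, 3),  # 50-59
    (64.5, 0),  # 60-69
    (74.5, 1),  # 70-79
]

sum_fx = sum(f * x for x, f in classes)  # sum of (frequency * midpoint)
sum_f = sum(f for _, f in classes)       # total frequency
approx_mean = sum_fx / sum_f

print(round(approx_mean, 1))  # 40.9
```

With these assumed classes the approximation is 40.9 minutes. The mean computed from the raw values is 430/11, about 39.1 minutes, so the two results are close but not identical.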
Weighted Mean
A weighted mean is used when different data values are assigned different weights. The formula is:
$$\bar{x} = \frac{\sum (w \cdot x)}{\sum w}$$
where $w$ is the weight assigned to each value $x$.
Example: Calculating a grade-point average (GPA) where course credits are used as weights for each grade received.
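A GPA calculation along these lines might look like the sketch below. The courses, grade points (A = 4, B = 3, C = 2), and credit values are hypothetical, invented only to show the arithmetic.

```python
# Hypothetical transcript: (grade points x, course credits w).
courses = [
    (4.0, 3),  # A in a 3-credit course
    (3.0, 3),  # B in a 3-credit course
    (2.0, 4),  # C in a 4-credit course
]

sum_wx = sum(w * x for x, w in courses)  # sum of (weight * value)
sum_w = sum(w for _, w in courses)       # total credits
gpa = sum_wx / sum_w

# Unweighted mean of the grade points, for comparison.
unweighted = sum(x for x, _ in courses) / len(courses)

print(round(gpa, 2))         # 2.9
print(round(unweighted, 2))  # 3.0
```

The weighted result is lower than the unweighted mean because the lowest grade was earned in the course with the most credits.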
Summary Table: Measures of Center
| Measure | Definition | Resistant? | Example Use |
|---|---|---|---|
| Mean | Sum of all values divided by number of values | No | Average test score |
| Median | Middle value when data are ordered | Yes | Median household income |
| Mode | Most frequently occurring value | Yes | Most common shoe size |
| Midrange | Midpoint between maximum and minimum | No | Quick estimate of center |