Measures of Central Tendency: Mean and Median
Measures of Central Tendency
Overview of Numerical Summary Measures
Numerical summary measures are used to describe and summarize the main features of a data set. In statistics, these measures help us understand the center and spread of a distribution, complementing graphical summaries such as histograms and boxplots. The three main characteristics to consider when analyzing a distribution are shape, center, and spread. This section focuses on methods to numerically summarize the center of a distribution.
Arithmetic Mean
The arithmetic mean is a measure of central tendency that represents the average value of a variable. It is calculated by summing all the values and dividing by the number of observations. The mean can be computed for both populations and samples, with distinct notation for each.
Population Mean (\(\mu\)): The mean calculated using all individuals in a population. It is considered a parameter.
Sample Mean (\(\overline{x}\)): The mean calculated using sample data. It is considered a statistic.
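Formula for Population Mean: \(\mu = \frac{\sum x_i}{N}\), where N is the number of individuals in the population.
Formula for Sample Mean: \(\overline{x} = \frac{\sum x_i}{n}\), where n is the number of observations in the sample.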
Example: If a sample of song lengths is given as 210, 180, 240, 200, and 220 seconds, the sample mean is \(\overline{x} = \frac{210 + 180 + 240 + 200 + 220}{5} = \frac{1050}{5} = 210\) seconds.
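The song-length example can be checked with a short Python sketch. The function name `sample_mean` and the variable names are illustrative choices, not part of the original notes.

```python
# Song lengths in seconds, taken from the example above.
song_lengths = [210, 180, 240, 200, 220]


def sample_mean(values):
    """Sum the observations and divide by the number of observations."""
    if len(values) == 0:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


mean_length = sample_mean(song_lengths)
print(mean_length)  # 210.0
```

The same function applies to a population mean; only the notation changes, not the arithmetic.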
Median
The median is another measure of central tendency. It is the value that lies in the middle of the data when arranged in ascending order. The median is represented by M and is especially useful when the data contains outliers or is skewed, as it is less affected by extreme values.
Step 1: Arrange the data in ascending order.
Step 2: Determine the number of observations, n.
Step 3: Identify the middle observation. If n is odd, the median is the middle value. If n is even, the median is the average of the two middle values.
Example (Odd Number of Observations): For the data set 180, 200, 210, 220, 240, the median is 210 (the third value).
Example (Even Number of Observations): For the data set 62, 68, 71, 74, 77, 82, 84, 88, 90, 94, the median is the average of the fifth and sixth values: \(M = \frac{77 + 82}{2} = 79.5\).
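The three steps above can be sketched in Python as follows. The function name `median_by_steps` is hypothetical and chosen only for illustration; the two data sets are the ones from the examples.

```python
def median_by_steps(values):
    """Find the median by sorting and picking the middle observation(s)."""
    ordered = sorted(values)   # Step 1: arrange in ascending order
    n = len(ordered)           # Step 2: count the observations
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2               # Step 3: locate the middle
    if n % 2 == 1:
        # Odd n: the single middle value is the median.
        return ordered[mid]
    # Even n: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_data = [180, 200, 210, 220, 240]
even_data = [62, 68, 71, 74, 77, 82, 84, 88, 90, 94]

print(median_by_steps(odd_data))   # 210
print(median_by_steps(even_data))  # 79.5
```

Because Python lists are indexed from zero, the third value sits at index 2 and the fifth and sixth values sit at indices 4 and 5.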
Resistant Statistics
A statistic is said to be resistant if it is not substantially affected by extreme values (outliers) in the data. The median is a resistant measure, while the mean is not. This distinction is important when choosing which measure to use for summarizing data.
Mean: Sensitive to outliers; can be pulled in the direction of extreme values.
Median: Resistant to outliers; remains stable even if extreme values are present.
Example: If a data set contains the values 10, 12, 14, 16, and 100, the mean is \(\frac{10 + 12 + 14 + 16 + 100}{5} = \frac{152}{5} = 30.4\), while the median is 14. The mean is much higher due to the outlier (100).
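A small sketch using Python's standard `statistics` module makes the contrast concrete. The second data set, in which the outlier 100 is swapped for 18, is a hypothetical comparison added here for illustration and does not come from the original example.

```python
from statistics import mean, median

# Data set from the example above, including the outlier 100.
with_outlier = [10, 12, 14, 16, 100]

# Hypothetical comparison set: the outlier replaced by a typical value.
without_outlier = [10, 12, 14, 16, 18]

mean_with = mean(with_outlier)
median_with = median(with_outlier)
mean_without = mean(without_outlier)
median_without = median(without_outlier)

# The mean moves a long way when the outlier is present; the median does not move.
print("with outlier:   ", mean_with, median_with)
print("without outlier:", mean_without, median_without)
```

Under these assumptions the median is 14 in both cases, while the mean changes from 14 to 30.4 once the value 100 is included.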
Comparing the Mean and Median
Comparing the mean and median can provide insight into the distribution of the data. If the mean and median are close, the distribution is likely symmetric. If they differ significantly, the distribution may be skewed.
Symmetric Distribution: Mean ≈ Median
Right-Skewed Distribution: Mean > Median
Left-Skewed Distribution: Mean < Median
Application: Use both mean and median to summarize data, especially when assessing the impact of outliers or skewness.
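The comparison rules above can be turned into a rough check, sketched below. The function name `skew_hint`, the scaling of the gap by the range of the data, and the 5% tolerance are all assumptions made for illustration; the notes do not specify how close "close" is, and a histogram remains the better way to judge shape.

```python
from statistics import mean, median


def skew_hint(values, tolerance=0.05):
    """Suggest a likely shape by comparing the mean with the median.

    The gap between mean and median is scaled by the range of the data,
    and `tolerance` is an arbitrary cutoff (an assumption, not a standard).
    """
    data_range = max(values) - min(values)
    if data_range == 0:
        return "roughly symmetric"  # all values identical
    gap = (mean(values) - median(values)) / data_range
    if gap > tolerance:
        return "possibly right-skewed"  # mean > median
    if gap < -tolerance:
        return "possibly left-skewed"   # mean < median
    return "roughly symmetric"          # mean close to median


print(skew_hint([180, 200, 210, 220, 240]))  # roughly symmetric
print(skew_hint([10, 12, 14, 16, 100]))      # possibly right-skewed
```

The output is only a hint: a small sample or a single outlier can push the mean away from the median without the underlying distribution being skewed.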
Using Software for Calculation
Statistical software such as Excel or JMP can be used to calculate the mean and median efficiently. The process typically involves entering the data, selecting the appropriate function, and interpreting the output.
Excel: Use functions =AVERAGE() for mean and =MEDIAN() for median.
JMP: Follow the same method used to create histograms to compute summary statistics.