Measures of Central Tendency and the Shape of Distributions
Study Guide - Smart Notes
Measures of Central Tendency
Definition and Importance
Measures of central tendency are values that represent a typical or central entry of a data set. They are essential for summarizing and describing the main features of a collection of data.
The three most commonly used measures of central tendency are the mean, the median, and the mode.
Mean (Average)
The mean is the sum of all the data entries divided by the number of entries.
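Sigma notation: $\sum x$ denotes the sum of all data entries in the data set.
Population mean: $\mu = \frac{\sum x}{N}$, where $N$ is the number of entries in the population.
Sample mean: $\bar{x} = \frac{\sum x}{n}$, where $n$ is the number of entries in the sample.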
Example: For the weights 274, 235, 223, 268, 290, 285, 235 (in pounds):
$\bar{x} = \frac{\sum x}{n} = \frac{1810}{7} \approx 258.6$ pounds
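The arithmetic in the weight example can be checked with a short Python sketch. The function name `mean` and the variable names are illustrative choices, not part of the original notes.

```python
# Weights in pounds, taken from the example above.
weights = [274, 235, 223, 268, 290, 285, 235]


def mean(entries):
    """Return the sum of the entries divided by the number of entries."""
    return sum(entries) / len(entries)


sample_mean = mean(weights)
print(sum(weights))            # 1810
print(round(sample_mean, 1))   # 258.6
```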
Median
The median is the value that lies in the middle of the data when the data set is ordered.
It divides the data set into two equal parts.
If the data set has an odd number of entries: the median is the middle entry.
If the data set has an even number of entries: the median is the mean of the two middle entries.
Example: For the ordered weights 223, 235, 235, 268, 274, 285, 290, the median is 268 pounds.
Example (even entries): For 223, 235, 235, 268, 274, 290, the two middle entries are 235 and 268, so the median is $\frac{235 + 268}{2} = 251.5$ pounds.
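Both median cases (odd and even number of entries) can be sketched in Python as follows. This is a minimal illustration with a hand-written `median` function; the name and structure are assumptions, not something the notes prescribe.

```python
def median(entries):
    """Return the middle value of the ordered data.

    For an even number of entries, return the mean of the two middle entries.
    """
    ordered = sorted(entries)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Odd number of entries (seven weights from the example).
print(median([274, 235, 223, 268, 290, 285, 235]))  # 268

# Even number of entries (six weights from the example).
print(median([223, 235, 235, 268, 274, 290]))       # 251.5
```

Sorting inside the function means the caller does not need to order the data first.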
Mode
The mode is the data entry that occurs with the greatest frequency.
Unimodal: One entry occurs most frequently.
Bimodal: Two entries occur with the same greatest frequency.
Multimodal: Three or more entries occur with the same greatest frequency.
If no entry is repeated, the data set has no mode.
Example: For the weights 223, 235, 235, 268, 274, 285, 290, the mode is 235 pounds.
| Political Party | Frequency, f |
|---|---|
| Democrat | 46 |
| Republican | 34 |
| Independent | 39 |
| Other/don't know | 5 |
The mode is Democrat (highest frequency).
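The two mode examples (numeric weights and the categorical party table) can be reproduced with a small Python sketch. The helper `modes` is a hypothetical name; it returns every entry tied for the greatest frequency and an empty list when no entry is repeated.

```python
from collections import Counter


def modes(entries):
    """Return all entries that occur with the greatest frequency.

    Returns an empty list when the data are empty or no entry is repeated.
    """
    counts = Counter(entries)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []  # no entry is repeated, so there is no mode
    return [value for value, count in counts.items() if count == top]


print(modes([223, 235, 235, 268, 274, 285, 290]))  # [235]

# For a frequency table, the mode is the category with the largest frequency.
party_frequencies = {
    "Democrat": 46,
    "Republican": 34,
    "Independent": 39,
    "Other/don't know": 5,
}
modal_party = max(party_frequencies, key=party_frequencies.get)
print(modal_party)  # Democrat
```

A result with two entries corresponds to a bimodal data set, and three or more to a multimodal one.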
Comparing the Mean, Median, and Mode
All three measures describe a typical entry of a data set.
Advantage of the mean: It uses every entry in the data set, making it reliable.
Disadvantage of the mean: It is greatly affected by outliers (values far removed from the rest).
Example: Ages in a class: Mean = 23.8, Median = 20.5, Mode = 20. A single outlier (age 65) pulls the mean above the typical age in the class.
Graphical comparison can help determine which measure best represents the data. In the example, the median best describes the data set.
When to Use Mean vs. Median
Use the mean: When data are fairly symmetric and have no outliers.
Use the median: When data are skewed or contain outliers.
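The sensitivity of the mean to outliers can be illustrated with a short Python sketch. It reuses the seven weights from the earlier example and appends a value of 900, which is a made-up outlier added purely for illustration and does not come from the notes.

```python
from statistics import mean, median

weights = [274, 235, 223, 268, 290, 285, 235]

# 900 is a hypothetical outlier, added only to show the effect.
with_outlier = weights + [900]

print(median(weights))        # 268
print(mean(with_outlier))     # 338.75
print(median(with_outlier))   # 271.0
```

Under these assumed values, one extreme entry moves the mean by about 80 pounds while the median moves by only 3, which is why the median is usually preferred for skewed data or data with outliers.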
Weighted Mean
The weighted mean is the mean of a data set whose entries have varying weights.
Formula: $\bar{x} = \frac{\sum xw}{\sum w}$, where $w$ is the weight of each entry $x$.
Example: Mark's GPA calculation:
| Points, x | Credit hours, w | xw |
|---|---|---|
| 2 | 3 | 6 |
| 2 | 4 | 8 |
| 1 | 1 | 1 |
| 4 | 3 | 12 |
| 2 | 2 | 4 |
| 3 | 3 | 9 |
| Total | 16 | 40 |
Weighted mean: $\bar{x} = \frac{\sum xw}{\sum w} = \frac{40}{16} = 2.5$
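The GPA table can be checked with a minimal Python sketch of the weighted mean. The function and variable names are illustrative assumptions.

```python
# Grade points (x) and credit hours (w) from the table above.
points = [2, 2, 1, 4, 2, 3]
credit_hours = [3, 4, 1, 3, 2, 3]


def weighted_mean(entries, weights):
    """Return sum(x * w) / sum(w) for paired entries and weights."""
    total_weight = sum(weights)
    weighted_sum = sum(x * w for x, w in zip(entries, weights))
    return weighted_sum / total_weight


gpa = weighted_mean(points, credit_hours)
print(sum(credit_hours))  # 16
print(gpa)                # 2.5
```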
Mean of Grouped Data
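For a frequency distribution, the mean is approximated by:
$\bar{x} = \frac{\sum xf}{n}$, where $x$ is the class midpoint, $f$ is the class frequency, and $n = \sum f$.
Steps:
Find the midpoint of each class: $x = \frac{\text{lower limit} + \text{upper limit}}{2}$
Find the sum of the products of the midpoints and the frequencies: $\sum xf$
Find the sum of the frequencies: $n = \sum f$
Find the mean: $\bar{x} = \frac{\sum xf}{n}$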
Example: Screen time for 30 adults:
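| Class | Midpoint, x | Frequency, f | xf |
|---|---|---|---|
| 155-190 | 172.5 | 3 | 517.5 |
| 191-226 | 208.5 | 2 | 417.0 |
| 227-262 | 244.5 | 5 | 1222.5 |
| 263-298 | 280.5 | 6 | 1683.0 |
| 299-334 | 316.5 | 7 | 2215.5 |
| 335-370 | 352.5 | 4 | 1410.0 |
| 371-406 | 388.5 | 3 | 1165.5 |
| Total | | 30 | 8631.0 |
Estimated mean: $\bar{x} = \frac{\sum xf}{n} = \frac{8631}{30} = 287.7$ minutes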
Note: The result is an estimate because it is computed from class midpoints rather than the original data entries.
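The four steps for grouped data can be sketched in Python using the screen-time classes from the table. Representing each class as a `(lower, upper, frequency)` tuple is an assumption made for this illustration.

```python
# (lower limit, upper limit, frequency) for each class, from the table above.
classes = [
    (155, 190, 3),
    (191, 226, 2),
    (227, 262, 5),
    (263, 298, 6),
    (299, 334, 7),
    (335, 370, 4),
    (371, 406, 3),
]


def grouped_mean(class_rows):
    """Approximate the mean of a frequency distribution from class midpoints."""
    sum_xf = 0.0
    n = 0
    for lower, upper, frequency in class_rows:
        midpoint = (lower + upper) / 2   # step 1: class midpoint
        sum_xf += midpoint * frequency   # step 2: accumulate x * f
        n += frequency                   # step 3: accumulate f
    return sum_xf / n                    # step 4: divide


estimated_mean = grouped_mean(classes)
print(estimated_mean)  # 287.7
```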
The Shape of Distributions
Symmetric Distribution
A vertical line can be drawn through the middle of the graph, and the resulting halves are approximately mirror images.
Uniform Distribution (Rectangular)
All entries or classes have equal or approximately equal frequencies.
A uniform distribution is also symmetric.
Normal Distribution
A bell-shaped distribution that is both symmetric and unimodal. (Not every symmetric, unimodal distribution is normal.)
The mean, median, and mode are equal.
Skewed Left Distribution (Negatively Skewed)
The "tail" of the graph elongates more to the left.
The mean is to the left of the median: $\text{mean} < \text{median}$
Skewed Right Distribution (Positively Skewed)
The "tail" of the graph elongates more to the right.
The mean is to the right of the median: $\text{mean} > \text{median}$
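The relationship between the mean and the median can be turned into a rough check in Python. This is only a rule-of-thumb sketch: the function name `skew_direction`, the `tolerance` parameter, and the sample data are all hypothetical, and comparing the two measures is a heuristic rather than a definition of skewness.

```python
from statistics import mean, median


def skew_direction(entries, tolerance=0.0):
    """Guess the direction of skew by comparing the mean with the median."""
    difference = mean(entries) - median(entries)
    if difference < -tolerance:
        return "skewed left"    # mean lies to the left of the median
    if difference > tolerance:
        return "skewed right"   # mean lies to the right of the median
    return "roughly symmetric"


# Made-up data sets, for illustration only.
print(skew_direction([1, 2, 3, 4, 5]))       # roughly symmetric
print(skew_direction([1, 2, 3, 4, 20]))      # skewed right
print(skew_direction([1, 17, 18, 19, 20]))   # skewed left
```

A histogram of the data remains the more reliable way to judge the shape of a distribution.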