Descriptive Statistics: Measures of Central Tendency and Distribution Shapes
Study Guide - Smart Notes
Measures of Central Tendency
Definition and Importance
Measures of central tendency are statistical values that represent a typical or central entry of a data set. They are essential for summarizing and understanding the general behavior of data.
Mean
Median
Mode
Mean (Average)
The mean is the sum of all data entries divided by the number of entries. It is a reliable measure because it incorporates every value in the data set.
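Sigma notation: $\sum x$ denotes the sum of all data entries $x$.
Population mean: $\mu = \frac{\sum x}{N}$, where $N$ is the number of entries in the population.
Sample mean: $\bar{x} = \frac{\sum x}{n}$, where $n$ is the number of entries in the sample.
Example: For the sample weights 274, 235, 223, 268, 290, 285, 235, the mean is:
$\bar{x} = \frac{274 + 235 + 223 + 268 + 290 + 285 + 235}{7} = \frac{1810}{7} \approx 258.6$ pounds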
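The arithmetic for the sample mean can be checked with a short Python sketch. The variable names are illustrative only, and the standard-library `statistics.mean` is used purely as a cross-check.

```python
from statistics import mean

# Sample weights in pounds, taken from the example above
weights = [274, 235, 223, 268, 290, 285, 235]

total = sum(weights)        # sum of all entries (sigma x)
n = len(weights)            # number of entries in the sample
sample_mean = total / n     # x-bar = (sum of x) / n

print(total, n, round(sample_mean, 1))

# The standard library computes the same value
assert abs(mean(weights) - sample_mean) < 1e-9
```

The entries sum to 1810, so the mean of the seven weights is 1810 / 7, about 258.6 pounds.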
Median
The median is the value that lies in the middle of an ordered data set, dividing it into two equal parts.
If the number of entries is odd: the median is the middle entry.
If the number of entries is even: the median is the mean of the two middle entries.
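Example: For the ordered weights 223, 235, 235, 268, 274, 285, 290 (seven entries), the median is the fourth entry, 268 pounds. If one entry is removed (six entries), the median is the mean of the two middle entries. For instance, removing 285 leaves 223, 235, 235, 268, 274, 290, and the median is $\frac{235 + 268}{2} = 251.5$ pounds.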
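The odd and even rules can be sketched as one small function. `median_of` is a hypothetical helper name, and the choice of 285 as the removed entry is an assumption made only to illustrate the even case.

```python
def median_of(values):
    """Return the median: the middle entry, or the mean of the two middle entries."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of entries: the single middle entry
        return ordered[mid]
    # Even number of entries: mean of the two middle entries
    return (ordered[mid - 1] + ordered[mid]) / 2


weights = [274, 235, 223, 268, 290, 285, 235]
print(median_of(weights))      # seven entries: the middle one

# Assumption for illustration: the 285-pound entry is the one removed
remaining = [w for w in weights if w != 285]
print(median_of(remaining))    # six entries: mean of the two middle ones
```

With all seven weights the function returns 268. With 285 removed it returns (235 + 268) / 2 = 251.5. Removing a different entry would give a different median.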
Mode
The mode is the data entry that occurs with the greatest frequency.
If no entry is repeated, there is no mode.
If two entries share the highest frequency, the data set is bimodal.
Example: For the weights 223, 235, 235, 268, 274, 285, 290, the mode is 235 pounds (occurs twice).
Example (Categorical Data):
Political Party | Frequency, f |
|---|---|
Democrat | 46 |
Republican | 34 |
Independent | 39 |
Other/don’t know | 5 |
The mode is Democrat (highest frequency).
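As a rough sketch, the mode rules for both numerical and categorical data can be expressed with `collections.Counter`. The helper names `modes` and `modes_from_counts` are hypothetical. They return a list so that the no-mode and bimodal cases are visible.

```python
from collections import Counter


def modes_from_counts(counts):
    """Return every entry that shares the highest frequency."""
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []  # no entry is repeated, so there is no mode
    return [entry for entry, freq in counts.items() if freq == top]


def modes(entries):
    """Count raw data entries, then apply the same rule."""
    return modes_from_counts(Counter(entries))


weights = [223, 235, 235, 268, 274, 285, 290]
print(modes(weights))

# Frequency table from the categorical example
party = {"Democrat": 46, "Republican": 34, "Independent": 39, "Other/don't know": 5}
print(modes_from_counts(party))
```

For the weights this gives `[235]`, and for the party table it gives `['Democrat']`. A list with two entries would indicate a bimodal data set.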
Comparing Mean, Median, and Mode
All three measures describe a typical entry, but each has advantages and disadvantages:
Mean: Uses all data entries, but is sensitive to outliers.
Median: Resistant to outliers, so it is usually the better choice for skewed data.
Mode: Useful for categorical data; may not represent a typical value in numerical data.
Example: For the ages in a class:
Ages in a class |
|---|
20 20 20 20 21 21 21 22 22 23 23 23 24 24 65 |
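Mean: $\bar{x} = \frac{369}{15} = 24.6$ years (computed from the 15 ages listed)
Median: 22 years (the eighth of the 15 ordered entries)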
Mode: 20 years
The mean is influenced by the outlier (65), while the median is not. The mode does not represent a typical entry.
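A quick way to see the effect of the outlier is to compute the three measures with and without the age 65. This is a minimal sketch using the standard `statistics` module on the 15 ages listed in the table. The name `without_outlier` is illustrative.

```python
from statistics import mean, median, mode

ages = [20, 20, 20, 20, 21, 21, 21, 22, 22, 23, 23, 23, 24, 24, 65]

# Same data with the single outlier (65) dropped
without_outlier = [a for a in ages if a != 65]

print("all ages:       ", round(mean(ages), 1), median(ages), mode(ages))
print("without outlier:", round(mean(without_outlier), 1), median(without_outlier))
```

For the listed ages the mean is 24.6 and the median is 22. Dropping the 65 moves the mean to about 21.7 but moves the median only to 21.5, which illustrates why the median is preferred when outliers are present.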
Weighted Mean
Definition and Calculation
The weighted mean is used when data entries have different weights. It is calculated as:
$\bar{x} = \frac{\sum (x \cdot w)}{\sum w}$, where $w$ is the weight of each entry $x$.
Example: Calculating grade point average:
Final Grade | Credit Hours |
|---|---|
C | 3 |
D | 4 |
A | 1 |
C | 2 |
B | 3 |
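Weighted mean, assuming the usual 4-point scale (A = 4, B = 3, C = 2, D = 1) with credit hours as the weights: $\bar{x} = \frac{2(3) + 1(4) + 4(1) + 2(2) + 3(3)}{3 + 4 + 1 + 2 + 3} = \frac{27}{13} \approx 2.08$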
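The grade point average calculation can be sketched as follows. The 4-point mapping in `GRADE_POINTS` is an assumption, since the table lists only letter grades, and `weighted_mean` is a hypothetical helper name.

```python
# Assumption: the conventional 4-point scale, which the table does not state
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}


def weighted_mean(values, weights):
    """Return sum(x * w) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(x * w for x, w in zip(values, weights)) / total_weight


# (final grade, credit hours) pairs from the table
courses = [("C", 3), ("D", 4), ("A", 1), ("C", 2), ("B", 3)]
points = [GRADE_POINTS[grade] for grade, _ in courses]
hours = [credit for _, credit in courses]

gpa = weighted_mean(points, hours)
print(round(gpa, 2))
```

Under that scale the weighted sum is 27 and the total weight is 13 credit hours, giving a grade point average of about 2.08.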
Mean of Grouped Data
Mean of a Frequency Distribution
For grouped data, the mean is approximated using class midpoints and frequencies:
$\bar{x} = \frac{\sum (x \cdot f)}{n}$, where $x$ is the class midpoint, $f$ is the class frequency, and $n = \sum f$.
Steps:
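Find the midpoint of each class: $x = \frac{\text{lower limit} + \text{upper limit}}{2}$
Sum the products of midpoints and frequencies: $\sum (x \cdot f)$
Sum the frequencies: $n = \sum f$
Calculate the mean: $\bar{x} = \frac{\sum (x \cdot f)}{n}$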
Example:
Class midpoint, x | Frequency, f | xf |
|---|---|---|
172.5 | 3 | 517.5 |
208.5 | 2 | 417.0 |
244.5 | 5 | 1222.5 |
280.5 | 6 | 1683.0 |
316.5 | 7 | 2215.5 |
352.5 | 4 | 1410.0 |
388.5 | 3 | 1165.5 |
Total | 30 | 8631.0 |
Mean: $\bar{x} = \frac{\sum (x \cdot f)}{n} = \frac{8631.0}{30} \approx 287.7$
Additional info: This is an estimate since it uses class midpoints.
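The steps for a frequency distribution can be sketched in a few lines of Python. One assumption is labeled in the code: the third midpoint is taken as 244.5, the value consistent with the equal class width of 36 seen in the other midpoints and with the listed total of 8631.0.

```python
# Class midpoints and frequencies from the table.
# Assumption: the third midpoint is 244.5, which matches the equal
# spacing of 36 between midpoints and the table total of 8631.0.
midpoints = [172.5, 208.5, 244.5, 280.5, 316.5, 352.5, 388.5]
frequencies = [3, 2, 5, 6, 7, 4, 3]

products = [x * f for x, f in zip(midpoints, frequencies)]  # the xf column
sum_xf = sum(products)        # sum of midpoint * frequency
n = sum(frequencies)          # total number of entries
grouped_mean = sum_xf / n     # estimate of the mean

print(sum_xf, n, round(grouped_mean, 1))
```

This gives a sum of 8631.0 over 30 entries, for an estimated mean of about 287.7. The mean computed from the raw entries would generally differ slightly.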
The Shape of Distributions
Symmetric Distribution
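A distribution is symmetric when a vertical line can be drawn through the middle of its graph so that the two halves are approximately mirror images. When the distribution is symmetric and has a single peak, the mean, median, and mode are equal.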
Uniform Distribution (Rectangular)
In a uniform distribution, all entries or classes have equal or nearly equal frequencies. The distribution is symmetric.
Skewed Left Distribution (Negatively Skewed)
A skewed left distribution has a tail that extends more to the left. The mean is typically less than the median, because the low values in the tail pull the mean down.
Skewed Right Distribution (Positively Skewed)
A skewed right distribution has a tail that extends more to the right. The mean is typically greater than the median, because the high values in the tail pull the mean up.
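The relationship between the mean and the median suggests a rough check for the direction of skew. The sketch below is only a heuristic, not a formal test of shape. The function name `skew_hint` and the three sample data sets are made up for illustration.

```python
from math import isclose
from statistics import mean, median


def skew_hint(data):
    """Guess the direction of skew by comparing the mean with the median."""
    m, med = mean(data), median(data)
    if isclose(m, med):
        return "roughly symmetric"
    # A long right tail pulls the mean above the median, and vice versa
    return "skewed right" if m > med else "skewed left"


# Hypothetical samples chosen to show each case
symmetric_sample = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_tail_sample = [1, 2, 2, 3, 3, 3, 4, 20]
left_tail_sample = [1, 17, 18, 18, 19, 19, 19, 20]

for sample in (symmetric_sample, right_tail_sample, left_tail_sample):
    print(skew_hint(sample), mean(sample), median(sample))
```

A histogram remains the more reliable way to judge shape. The comparison only indicates which way the tail is likely to extend.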