Measures of Central Tendency and Distribution Shapes in Statistics
Study Guide - Smart Notes
Measures of Central Tendency
Definitions
Measures of central tendency are statistical values that represent the center or typical value of a dataset. They are essential for summarizing and understanding data distributions.
Statistic: Any measurement taken from a sample (denoted by Roman letters).
Parameter: Any measurement taken from a population (denoted by Greek letters).
Frequency Distribution Table: Organizes raw data into table form; the frequency indicates the number of occurrences of any given data point.
Example Frequency Table:
x (Data Pt) | f (frequency) |
|---|---|
0 | 1 |
1 | 2 |
2 | 3 |
$\sum f = n = 6$ (sample size)
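As a minimal sketch of how such a table is built, the Python below counts occurrences with `collections.Counter`. The list `raw` is an assumption chosen to reproduce the frequencies in the table above, and `frequency_table` is an illustrative helper name, not part of the original notes.

```python
from collections import Counter

def frequency_table(data):
    """Return (value, frequency) pairs sorted by value."""
    counts = Counter(data)
    return sorted(counts.items())

# Hypothetical raw data that reproduces the example table
raw = [0, 1, 1, 2, 2, 2]
table = frequency_table(raw)

# The sample size n is the sum of the frequencies
n = sum(f for _, f in table)

for x, f in table:
    print(x, f)
print("n =", n)
```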
Notation
Measure | Statistic (Sample) | Parameter (Population) |
|---|---|---|
Mean | $\bar{x}$ | $\mu$ |
Median | MD | |
Mode | MO | |
Midrange | MR | |
$\sum x$: Sum of all data values x
n: sample size
f: frequency
Calculating Measures of Central Tendency
Mean
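The mean is the arithmetic average of a dataset and is calculated as:
$\bar{x} = \frac{\sum x}{n}$ for raw data, or $\bar{x} = \frac{\sum f x}{n}$ for data in a frequency table, where $n = \sum f$.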
Mean cannot be calculated for nominal data (qualitative data).
Median
The median is the middle value in a ranked dataset. If the dataset has an odd number of values, the median is the middle value. If even, it is the average of the two middle values.
Rank data in order (low to high).
Median position: $\frac{n + 1}{2}$
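The two steps above (rank, then locate the middle position) can be sketched in Python as follows. The sketch assumes the usual position formula (n + 1) / 2 and non-empty data; `median_by_position` is an illustrative name.

```python
def median_by_position(data):
    """Median of non-empty data via the (n + 1) / 2 position rule."""
    ranked = sorted(data)               # step 1: rank low to high
    n = len(ranked)
    position = (n + 1) / 2              # step 2: 1-based median position
    if n % 2 == 1:
        # Odd n: the position is a whole number, take that value
        return ranked[int(position) - 1]
    # Even n: the position falls halfway between two values, average them
    lower = ranked[n // 2 - 1]
    upper = ranked[n // 2]
    return (lower + upper) / 2

print(median_by_position([3, 1, 2]))
print(median_by_position([4, 1, 3, 2]))
```

Python's standard library offers `statistics.median`, which gives the same result; the explicit version is shown only to mirror the hand calculation.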
Mode
The mode is the value that occurs most frequently in the dataset. A dataset may have one mode (unimodal), more than one mode (multimodal), or no mode.
Unimodal: One value occurs most often.
Bimodal: Two values occur most often.
No mode: All values occur with equal frequency.
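A short sketch of these three cases, using `collections.Counter`. The helper name `find_modes` is hypothetical, and the sketch follows the definition above in treating a dataset whose values all occur equally often as having no mode.

```python
from collections import Counter

def find_modes(data):
    """Return the list of modes; an empty list means 'no mode'."""
    counts = Counter(data)
    highest = max(counts.values())
    # If several distinct values all share the same frequency, there is no mode
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(x for x, c in counts.items() if c == highest)

print(find_modes([1, 2, 2, 3]))      # one mode: unimodal
print(find_modes([1, 1, 2, 2, 3]))   # two modes: bimodal
print(find_modes([1, 2, 3]))         # no mode
```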
Midrange
The midrange is the average of the lowest and highest values in a dataset:
$MR = \frac{\text{lowest value} + \text{highest value}}{2}$
Rounding Rule
When calculating measures of central tendency, round the answer to one more decimal place than the raw data.
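One possible way to apply this rule in code is sketched below. It assumes the raw data are available as text, so that trailing zeros such as `4.0` are preserved; `decimals_in` and `round_like_raw` are hypothetical helper names.

```python
from decimal import Decimal

def decimals_in(text):
    """Count the decimal places in a number written as text ('4.8' -> 1)."""
    exponent = Decimal(text).as_tuple().exponent
    return max(0, -exponent)

def round_like_raw(result, raw_texts):
    """Round a computed measure to one more decimal place than the raw data."""
    places = max(decimals_in(t) for t in raw_texts) + 1
    return round(result, places)

raw_texts = ["3.7", "4.8", "4.8", "4.0", "4.7", "5.1", "4.3"]
values = [float(t) for t in raw_texts]
mean = sum(values) / len(values)

# Raw data have one decimal place, so the mean is reported with two
print(round_like_raw(mean, raw_texts))
```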
Electronic Tools
For tabulated data, use frequency tables and calculators/statistical software to compute mean, median, mode, and midrange.
Calculator steps: Enter data and frequencies, then use statistical functions to compute measures.
Raw Data and Outliers
Raw Data Example
Sample data: 3.7, 4.8, 4.8, 4.0, 4.7, 5.1, 4.3
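Mean: $\bar{x} = \frac{\sum x}{n} = \frac{31.4}{7} \approx 4.49$
Median: 4.7 (the 4th value of the ranked data 3.7, 4.0, 4.3, 4.7, 4.8, 4.8, 5.1)
Mode: 4.8 (occurs twice)
Midrange: $MR = \frac{3.7 + 5.1}{2} = 4.4$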
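The four measures for this sample can be checked with Python's standard `statistics` module. This is only a verification sketch; the midrange has no library function, so it is computed directly from the minimum and maximum.

```python
import statistics

data = [3.7, 4.8, 4.8, 4.0, 4.7, 5.1, 4.3]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
midrange = (min(data) + max(data)) / 2

print(round(mean, 2))      # 4.49
print(median)              # 4.7
print(mode)                # 4.8
print(round(midrange, 2))  # 4.4
```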
Raw Data Example with Outlier
Sample data: 3.7, 4.8, 4.8, 4.0, 4.7, 8.1, 4.3
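Mean: $\bar{x} = \frac{34.4}{7} \approx 4.91$
Median: 4.7 (unchanged; ranked data 3.7, 4.0, 4.3, 4.7, 4.8, 4.8, 8.1)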
Outliers can cause the mean and median to be significantly different. The mean is more sensitive to outliers than the median or mode.
If the outlier is high, the mean will be higher than the median.
If the outlier is low, the mean will be lower than the median.
Trimmed Mean: To mitigate the effects of outliers, a trimmed mean can be used, where a certain percentage of the lowest and highest data values are removed before calculating the mean.
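A minimal sketch of a trimmed mean, applied to the outlier sample above. It assumes the stated proportion is removed from each end of the ranked data (conventions differ on this point) and that the data are non-empty; `trimmed_mean` is an illustrative name.

```python
import statistics

def trimmed_mean(data, proportion):
    """Mean after dropping `proportion` of the values from EACH end (assumed convention)."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be at least 0 and below 0.5")
    ranked = sorted(data)
    k = int(len(ranked) * proportion)   # number of values cut from each end
    kept = ranked[k:len(ranked) - k]
    return sum(kept) / len(kept)

with_outlier = [3.7, 4.8, 4.8, 4.0, 4.7, 8.1, 4.3]

print(round(statistics.mean(with_outlier), 2))     # 4.91, pulled up by 8.1
print(statistics.median(with_outlier))             # 4.7
print(round(trimmed_mean(with_outlier, 0.15), 2))  # 4.52, after dropping 3.7 and 8.1
```

With 7 values and a proportion of 0.15, one value is dropped from each end, which moves the result back toward the median.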
Frequency Data Examples
Frequency Table Example
Number of books read by 19 students:
x (books) | f (students) | f*x | c.f. (cumulative freq.) |
|---|---|---|---|
0 | 2 | 0 | 2 |
1 | 3 | 3 | 5 |
2 | 5 | 10 | 10 |
3 | 4 | 12 | 14 |
4 | 5 | 20 | 19 |
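Mean: $\bar{x} = \frac{\sum f x}{n} = \frac{45}{19} \approx 2.4$
Median: Median position is the $\frac{19 + 1}{2} = 10$th value; the cumulative frequency first reaches 10 at $x = 2$, so the median is 2.
Mode: 2 and 4 (each occurs 5 times, so the data are bimodal)
Midrange: $MR = \frac{0 + 4}{2} = 2$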
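One simple way to check a small frequency table is to expand it back into raw data and reuse the standard `statistics` functions, as sketched below. The dictionary `books` copies the x and f columns of the table above; `statistics.multimode` (Python 3.8+) returns every value tied for the highest frequency, which matters here because 2 and 4 both have frequency 5.

```python
import statistics

# x (books) -> f (students), copied from the frequency table
books = {0: 2, 1: 3, 2: 5, 3: 4, 4: 5}

# Repeat each value x exactly f times to recover the raw data
expanded = [x for x, f in books.items() for _ in range(f)]

n = len(expanded)
mean = statistics.mean(expanded)
median = statistics.median(expanded)
modes = statistics.multimode(expanded)
midrange = (min(expanded) + max(expanded)) / 2

print(n, round(mean, 1), median, modes, midrange)
```

Expanding is fine for small tables; for large frequencies, working from cumulative totals avoids building a long list.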
Frequency Data Example Using Classes
Endurance times for 80 students:
Class Bounds (hrs) | f (frequency) | x (midpt of class) | f*x | c.f. |
|---|---|---|---|---|
52.5 - 63.5 | 6 | 58 | 348 | 6 |
63.5 - 74.5 | 12 | 69 | 828 | 18 |
74.5 - 85.5 | 25 | 80 | 2000 | 43 |
85.5 - 96.5 | 18 | 91 | 1638 | 61 |
96.5 - 107.5 | 14 | 102 | 1428 | 75 |
107.5 - 118.5 | 5 | 113 | 565 | 80 |
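Mean: $\bar{x} = \frac{\sum f x}{n} = \frac{6807}{80} \approx 85.1$
Median: Median position is the $\frac{80 + 1}{2} = 40.5$th value, which falls in the class 74.5 - 85.5 (cumulative frequency 43); using the class midpoint, the median is 80.
Mode: 80 (midpoint of the class with the highest frequency, 25)
Midrange: $MR = \frac{58 + 113}{2} = 85.5$ (the outer class bounds, 52.5 and 118.5, give the same value)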
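For grouped data, each class is represented by its midpoint, so the resulting mean is an approximation of the mean of the underlying raw values. A sketch of that calculation for the table above follows; the tuple layout of `classes` is an illustrative choice.

```python
# (lower bound, upper bound, frequency), copied from the class table
classes = [
    (52.5, 63.5, 6),
    (63.5, 74.5, 12),
    (74.5, 85.5, 25),
    (85.5, 96.5, 18),
    (96.5, 107.5, 14),
    (107.5, 118.5, 5),
]

n = sum(f for _, _, f in classes)

# Class midpoint x = (lower bound + upper bound) / 2
midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]

# Grouped mean = sum of f * x over all classes, divided by n
mean = sum(x * f for x, (_, _, f) in zip(midpoints, classes)) / n

# Modal class = class with the largest frequency
modal_index = max(range(len(classes)), key=lambda i: classes[i][2])
modal_midpoint = midpoints[modal_index]

print(n, midpoints, round(mean, 1), modal_midpoint)
```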
Frequency Table for Median and Mode
x (Data Value) | f (Frequency) | c.f. (cumulative) |
|---|---|---|
8 | 112 | 112 |
17 | 102 | 214 |
27 | 197 | 411 |
37 | 520 | 931 |
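43 | 186 | 1117 |
Median: $n = \sum f = 1117$; median position is the $\frac{1117 + 1}{2} = 559$th value, which falls at $x = 37$ (cumulative frequency 931); median is 37.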
Mode: 37 (largest frequency, 520 occurrences)
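When frequencies are large, the median can be located directly from cumulative frequencies without expanding the data. The sketch below recomputes the running totals from the f column with `itertools.accumulate` rather than relying on a typed c.f. column, then uses `bisect_left` to find the first class whose cumulative frequency reaches the median position. `median_from_table` is a hypothetical helper name.

```python
from bisect import bisect_left
from itertools import accumulate

def median_from_table(values, freqs):
    """Median of data given as sorted values with matching frequencies."""
    cumulative = list(accumulate(freqs))    # running totals (c.f. column)
    n = cumulative[-1]
    # For odd n both positions coincide; for even n they are the two middle ranks
    lower_pos = (n + 1) // 2
    upper_pos = (n + 2) // 2
    # First index whose cumulative frequency is >= the position
    lower = values[bisect_left(cumulative, lower_pos)]
    upper = values[bisect_left(cumulative, upper_pos)]
    return (lower + upper) / 2

values = [8, 17, 27, 37, 43]
freqs = [112, 102, 197, 520, 186]

print(list(accumulate(freqs)))
print(median_from_table(values, freqs))

# The mode is the value with the largest frequency
mode = values[freqs.index(max(freqs))]
print(mode)
```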
Weighted Mean
Definition and Calculation
Weighted means are used when not all values are equally represented. The formula is:
$\bar{x}_w = \frac{\sum w x}{\sum w}$
Weights (w) represent the relative importance or frequency of each value.
Weighted Mean Example: Stock Portfolio
Price, x | Stocks, w | Product, w x |
|---|---|---|
10 | 8 | 80 |
12 | 20 | 240 |
17 | 15 | 255 |
20 | 30 | 600 |
35 | 7 | 245 |
Weighted mean: $\bar{x}_w = \frac{\sum w x}{\sum w} = \frac{1420}{80} = 17.75$
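The portfolio calculation can be reproduced with a small general-purpose function. This is a sketch under the usual definition (sum of w·x divided by sum of w); `weighted_mean` is an illustrative name, and the lists copy the Price and Stocks columns of the table above.

```python
def weighted_mean(values, weights):
    """Sum of w * x divided by the sum of w."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

prices = [10, 12, 17, 20, 35]   # x: price per share
shares = [8, 20, 15, 30, 7]     # w: number of shares held

print(weighted_mean(prices, shares))  # 17.75
```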
Weighted Mean Example: Camera Ratings
Category | w (weight) | Cony X (score) | Cony w x | Sanon X (score) | Sanon w x |
|---|---|---|---|---|---|
Image Quality | 0.5 | 8 | 4 | 9 | 4.5 |
Battery Life | 0.3 | 6 | 1.8 | 6 | 1.8 |
Zoom Range | 0.2 | 7 | 1.4 | 6 | 1.2 |
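Cony weighted mean: $\frac{4 + 1.8 + 1.4}{0.5 + 0.3 + 0.2} = \frac{7.2}{1.0} = 7.2$
Sanon weighted mean: $\frac{4.5 + 1.8 + 1.2}{0.5 + 0.3 + 0.2} = \frac{7.5}{1.0} = 7.5$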
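When several items share one set of category weights, the comparison can be sketched as below. The dictionary layout and the name `weighted_score` are illustrative assumptions; results are rounded to absorb floating-point noise in products such as 0.3 × 6.

```python
# Category weights and per-camera scores, copied from the ratings table
weights = {"Image Quality": 0.5, "Battery Life": 0.3, "Zoom Range": 0.2}
scores = {
    "Cony": {"Image Quality": 8, "Battery Life": 6, "Zoom Range": 7},
    "Sanon": {"Image Quality": 9, "Battery Life": 6, "Zoom Range": 6},
}

def weighted_score(camera_scores, weights):
    """Weighted mean of one camera's scores across all categories."""
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[c] * camera_scores[c] for c in weights)
    return weighted_sum / total_weight

results = {name: round(weighted_score(s, weights), 2) for name, s in scores.items()}
best = max(results, key=results.get)

print(results)
print(best)
```

Because the weights here sum to 1, dividing by the total weight does not change the result, but keeping the division makes the function safe for weights that do not sum to 1.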
Weighted Mean Example: GPA Calculation
Letter Grade | Numeric Grade (x) | Number of Classes | Credit per class (w) | Total Credits | Total Points Earned (w x) |
|---|---|---|---|---|---|
A | 4 | 1 | 4 | 4 | 16 |
B | 3 | 3 | 3 | 9 | 27 |
C | 2 | 1 | 3 | 3 | 6 |
D | 1 | 1 | 4 | 4 | 4 |
Weighted mean (GPA): $\frac{\sum w x}{\sum w} = \frac{16 + 27 + 6 + 4}{4 + 9 + 3 + 4} = \frac{53}{20} = 2.65$
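In the GPA table, the effective weight of each grade is its total credits (number of classes × credits per class). A short sketch of that bookkeeping follows, with the tuple layout chosen for illustration.

```python
# (numeric grade x, number of classes, credits per class), copied from the table
grades = [
    (4, 1, 4),  # A
    (3, 3, 3),  # B
    (2, 1, 3),  # C
    (1, 1, 4),  # D
]

# Total credits per grade act as the weights
total_credits = sum(count * credits for _, count, credits in grades)

# Points earned = grade value times total credits at that grade
total_points = sum(x * count * credits for x, count, credits in grades)

gpa = total_points / total_credits
print(total_credits, total_points, gpa)  # 20 53 2.65
```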
Shapes of Distributions
Symmetric Distributions
When a vertical line can be drawn through the middle of the graph and the resulting halves are approximately mirror images, the distribution is symmetric.
For a symmetric, unimodal distribution: Mean = Median = Mode
Uniform Distribution
All data values have the same (or nearly the same) frequency. The graph is rectangular.
Skewed Distributions
Right-skewed (Positively Skewed): A disproportionately large number of small values; long tail on the right. Outliers at high values. Typically Mean > Median > Mode.
Left-skewed (Negatively Skewed): A disproportionately large number of large values; long tail on the left. Outliers at low values. Typically Mean < Median < Mode.
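The mean-versus-median relationships above suggest a quick rule-of-thumb check for skew direction. The sketch below is only that heuristic, not a formal skewness measure; `skew_direction` and its `tolerance` parameter are hypothetical, and the sample lists are made up for illustration.

```python
import statistics

def skew_direction(data, tolerance=0.0):
    """Guess skew direction by comparing the mean with the median."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    if mean - median > tolerance:
        return "right-skewed"       # mean pulled toward a high tail
    if median - mean > tolerance:
        return "left-skewed"        # mean pulled toward a low tail
    return "roughly symmetric"

# Hypothetical samples
print(skew_direction([1, 2, 2, 3, 20]))
print(skew_direction([1, 18, 19, 19, 20]))
print(skew_direction([1, 2, 3]))
```

A histogram remains the more reliable way to judge shape; this comparison can mislead for small or multimodal samples.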
Graphical Examples
Uniform: Bar graph with equal heights.
Symmetric (Normal): Bell-shaped curve.
Positively Skewed: Tail extends to the right.
Negatively Skewed: Tail extends to the left.
Summary Table: Measures of Central Tendency
Measure | Definition | Formula | Sensitivity to Outliers |
|---|---|---|---|
Mean | Arithmetic average | $\bar{x} = \frac{\sum x}{n}$ | High |
Median | Middle value | Middle position in ranked data | Low |
Mode | Most frequent value | Value with highest frequency | Low |
Midrange | Average of min and max | $MR = \frac{\text{min} + \text{max}}{2}$ | High |
Weighted Mean | Mean with weights | $\bar{x}_w = \frac{\sum w x}{\sum w}$ | Depends on weights |
Key Points
Central tendency measures summarize the center of a dataset.
Mean is sensitive to outliers; median and mode are more robust.
Weighted mean accounts for varying importance or frequency of data points.
Distribution shape affects the relationship between mean, median, and mode.