Measures of Central Tendency and Dispersion (Sections 3.1 & 3.2)

Study Guide - Smart Notes


Measures of Central Tendency

Arithmetic Mean

The arithmetic mean is commonly referred to as the average and is a fundamental measure of central tendency for quantitative data.

Definition: The mean is calculated by summing all values in a data set and dividing by the number of observations.
Population Mean ($\mu$): Uses all individuals in a population.
Sample Mean ($\bar{x}$): Uses data from a sample.

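Formula:

Population mean: $\mu = \dfrac{\sum x_i}{N}$, where $N$ is the number of individuals in the population.
Sample mean: $\bar{x} = \dfrac{\sum x_i}{n}$, where $n$ is the number of observations in the sample.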

Example: For the call-length data set [7, 4, 1, 4, 3, 48, 5, 3, 6], the mean is the sum of all values divided by the number of calls: $(7 + 4 + 1 + 4 + 3 + 48 + 5 + 3 + 6)/9 = 81/9 = 9$.
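
The calculation above can be sketched in Python. This is a minimal illustration, not part of the course materials; the helper name `arithmetic_mean` is made up for this example. The same arithmetic applies to a population mean and a sample mean; only the interpretation of the data differs.

```python
def arithmetic_mean(values):
    """Sum all values and divide by the number of observations."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Call lengths from the example above.
call_lengths = [7, 4, 1, 4, 3, 48, 5, 3, 6]
mean_call = arithmetic_mean(call_lengths)
print(mean_call)  # 81 / 9 = 9.0
```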

Median

The median is the middle value in an ordered data set and is another important measure of central tendency for quantitative variables.

Definition: The median splits the data into two equal halves.
Data must be ordered from smallest to largest before finding the median.
Odd number of data points: The median is the value at position $(n + 1)/2$.
Even number of data points: The median is the average of the two middle values.

Example (Odd): For [1, 3, 3, 4, 6, 6, 6, 7, 8], the median is the 5th value, which is 6.

Example (Even): For [3, 5, 6, 8, 10, 11, 11, 12], the median is at the $(8 + 1)/2 = 4.5$th position, so it is the average of the 4th and 5th values, 8 and 10: $(8 + 10)/2 = 9$.
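
The odd and even cases can be sketched as one small Python function. This is an illustrative sketch; the name `find_median` is an assumption of this example. In practice the standard library's `statistics.median` does the same job.

```python
def find_median(values):
    """Return the middle value of the ordered data (average of the two
    middle values when the number of data points is even)."""
    ordered = sorted(values)  # data must be ordered first
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: 0-based index n // 2 is the 1-based position (n + 1) / 2.
        return ordered[mid]
    # Even n: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_median = find_median([1, 3, 3, 4, 6, 6, 6, 7, 8])
even_median = find_median([3, 5, 6, 8, 10, 11, 11, 12])
print(odd_median)   # 6
print(even_median)  # 9.0
```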

Mode

The mode is the value that appears most frequently in a data set and can be used for both quantitative and qualitative variables.

Definition: The mode is the most frequent observation.
There may be no mode, one mode, or multiple modes in a data set.

Example: For [0, 0, 1, 2, 1, 1, 2, 3, 4, 4, 0, 0], the mode is 0.

Example (Qualitative): For [head, head, shoulder, neck, head], the mode is 'head'.
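
A frequency count is enough to find the mode for both quantitative and qualitative data. The sketch below is illustrative only: the name `find_modes` is made up, and it assumes the convention that a data set in which every value occurs exactly once has no mode.

```python
from collections import Counter


def find_modes(values):
    """Return a list of the most frequent values.

    An empty list means "no mode" (assumed convention: no value repeats).
    More than one entry means the data set has multiple modes.
    """
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []
    return [value for value, count in counts.items() if count == highest]


quantitative_modes = find_modes([0, 0, 1, 2, 1, 1, 2, 3, 4, 4, 0, 0])
qualitative_modes = find_modes(["head", "head", "shoulder", "neck", "head"])
print(quantitative_modes)  # [0]
print(qualitative_modes)   # ['head']
```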

Comparing Mean, Median, and Mode

Mean: Center of gravity; best for symmetric quantitative data.
Median: Splits data into halves; best for highly skewed quantitative data.
Mode: Most frequent value; useful for qualitative data.

Resistant Measures

Definition of Resistance

A measure is resistant if it is not substantially affected by extreme values (outliers).

The mean is not resistant; it is pulled in the direction of outliers.
The median is resistant; it is less affected by extreme values.

Visual Comparison: In a skewed distribution the mean is pulled toward the longer tail (below the median when left-skewed, above it when right-skewed), while the median stays closer to the bulk of the data.
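
Resistance can be checked numerically. The sketch below uses the call-length data from these notes and replaces the outlier 48 with 148; the variable names are chosen only for this illustration.

```python
import statistics

original = [7, 4, 1, 4, 3, 48, 5, 3, 6]
# Make the outlier more extreme: 48 becomes 148.
inflated = [148 if x == 48 else x for x in original]

mean_original = statistics.mean(original)      # 81 / 9 = 9
median_original = statistics.median(original)  # 4
mean_inflated = statistics.mean(inflated)      # 181 / 9, about 20.11
median_inflated = statistics.median(inflated)  # still 4

print(mean_original, median_original)
print(round(mean_inflated, 2), median_inflated)
```

The mean more than doubles while the median does not move, which is what "resistant" means in practice.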

Measures of Dispersion

Range

The range measures the spread of the data by subtracting the smallest value from the largest value.

Formula: $R = \text{largest value} - \text{smallest value}$
Only uses two values; not resistant to outliers.

Example: For [6, 1, 2, 6, 11, 7, 3, 3], the range is $11 - 1 = 10$.

Non-resistance Example: If 6 is mistakenly recorded as 6000, the range becomes $6000 - 1 = 5999$.
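
Both range examples can be reproduced with a short Python sketch. The helper name `data_range` is hypothetical, and the recording error is simulated here by replacing the first 6; replacing the other 6 gives the same range.

```python
def data_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


data = [6, 1, 2, 6, 11, 7, 3, 3]
range_correct = data_range(data)

# Simulate the recording error: the first 6 is entered as 6000.
data_with_error = [6000] + data[1:]
range_with_error = data_range(data_with_error)

print(range_correct)     # 11 - 1 = 10
print(range_with_error)  # 6000 - 1 = 5999
```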

Standard Deviation

The standard deviation measures roughly how far a typical data point lies from the mean; the larger the standard deviation, the more spread out the data.

Formula (Sample): $s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}}$
Based on the mean; not resistant to outliers.
Calculation is time-consuming by hand; calculators or software are recommended.

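Example: For Yolanda's phone call lengths [7, 4, 1, 4, 3, 48, 5, 3, 6], the sample standard deviation is $s = \sqrt{1736/8} = \sqrt{217} \approx 14.7$. If the outlier 48 is changed to 5, $s \approx 1.8$; if it is changed to 148, $s \approx 48.0$. A single value therefore changes the standard deviation substantially.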
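
Since the hand calculation is tedious, a short Python sketch may help. It applies the sample formula (dividing by $n - 1$) step by step; the function name `sample_std_dev` is an assumption of this example, and the standard library's `statistics.stdev` should return the same value.

```python
import math
import statistics


def sample_std_dev(values):
    """Sample standard deviation: the square root of the sum of squared
    deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = sum(values) / n
    sum_squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(sum_squared_deviations / (n - 1))


calls = [7, 4, 1, 4, 3, 48, 5, 3, 6]

# Recompute s with the outlier kept at 48, then changed to 5 and to 148.
results = {}
for replacement in (48, 5, 148):
    sample = [replacement if x == 48 else x for x in calls]
    results[replacement] = round(sample_std_dev(sample), 2)

print(results)  # {48: 14.73, 5: 1.79, 148: 47.99}

# Cross-check against the standard library.
print(math.isclose(sample_std_dev(calls), statistics.stdev(calls)))  # True
```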

Summary Table

Measure	Definition	Resistant?	Best For
Mean	Average value	No	Symmetric quantitative data
Median	Middle value	Yes	Skewed quantitative data
Mode	Most frequent value	Yes	Qualitative data
Range	Max minus Min	No	Quick spread estimate
Standard Deviation	Typical distance from the mean	No	Quantitative data spread

Applications and Examples

Yolanda's Cell Phone Call Lengths

Given a sample of call lengths, students are asked to:

Calculate the mean, median, and mode.
Assess which measure best describes the typical call length, especially when outliers are present.
Calculate the range and standard deviation, and observe how these measures change when an outlier is modified.

Row	Call Lengths
1	7, 4, 1
2	4, 3, 48
3	5, 3, 6

Additional info: Students should use calculators or statistical software to compute standard deviation for larger data sets.