Descriptive Statistics: Measures of Central Tendency

Descriptive Statistics

Chapter Overview

Descriptive statistics is a foundational area in statistics that involves summarizing and organizing data so it can be easily understood. This chapter focuses on frequency distributions, graphical displays, and key measures such as central tendency, variation, and position.

Frequency Distributions and Their Graphs
More Graphs and Displays
Measures of Central Tendency
Measures of Variation
Measures of Position

Measures of Central Tendency

Introduction to Measures of Central Tendency

Measures of central tendency are statistical values that represent a typical or central entry of a data set. They help summarize a large set of data with a single value that is representative of the entire set.

Mean
Median
Mode

Mean

The mean, often called the average, is the sum of all data entries divided by the number of entries. It is the most commonly used measure of central tendency and is sensitive to every value in the data set.

Population Mean ($\mu$): The mean of all members in a population.
Sample Mean ($\bar{x}$): The mean of a sample taken from the population.

Formulas:

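Population mean: $\mu = \frac{\sum x}{N}$, where $N$ is the number of entries in the population.
Sample mean: $\bar{x} = \frac{\sum x}{n}$, where $n$ is the number of entries in the sample.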

Example: Given weights (in pounds) for a sample of adults: 274, 235, 223, 268, 290, 285, 235. Sum of weights: $\sum x = 1810$. Sample size: $n = 7$. Sample mean: $\bar{x} = \frac{1810}{7} \approx 258.6$. The mean weight is about 258.6 pounds.
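
The arithmetic in this example can be checked with a few lines of Python. This is only a minimal sketch: the variable names (`weights`, `total`, `sample_mean`) are chosen here for illustration, and the data is the sample listed above.

```python
from statistics import mean

# Sample weights (in pounds) from the example above.
weights = [274, 235, 223, 268, 290, 285, 235]

total = sum(weights)       # sum of all data entries
n = len(weights)           # sample size
sample_mean = total / n    # sum divided by the number of entries

print(total, n, round(sample_mean, 1))  # 1810 7 258.6

# The standard library gives the same value.
assert abs(mean(weights) - sample_mean) < 1e-9
```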

Median

The median is the value that lies in the middle of the data when the data set is ordered. It divides the data set into two equal parts and is less affected by outliers than the mean.

If the data set has an odd number of entries: The median is the middle data entry.
If the data set has an even number of entries: The median is the mean of the two middle data entries.

Example (Odd Number): Ordered weights: 223, 235, 235, 268, 274, 285, 290. Median is the fourth entry: 268 pounds.

Example (Even Number): If the 285-pound adult is removed, ordered weights: 223, 235, 235, 268, 274, 290. Median is the mean of the third and fourth entries: $\frac{235 + 268}{2} = 251.5$ pounds.
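
The odd and even cases can be sketched in Python as follows. The helper name `middle_value` is hypothetical, introduced only for this illustration; the standard library's `statistics.median` applies the same rule.

```python
from statistics import median

def middle_value(data):
    """Median by the rule above: middle entry, or mean of the two middle entries."""
    ordered = sorted(data)          # the data must be ordered first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                  # odd number of entries
        return ordered[mid]
    # even number of entries: average the two middle entries
    return (ordered[mid - 1] + ordered[mid]) / 2

weights = [274, 235, 223, 268, 290, 285, 235]
odd_median = middle_value(weights)                 # 268

without_285 = [w for w in weights if w != 285]     # remove the 285-pound adult
even_median = middle_value(without_285)            # 251.5

# Cross-check against the standard library.
assert odd_median == median(weights)
assert even_median == median(without_285)
```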

Mode

The mode is the data entry that occurs with the greatest frequency. A data set may have no mode, one mode (unimodal), or more than one mode (bimodal or multimodal).

If no entry is repeated, the data set has no mode.
If two entries occur with the same greatest frequency, both are modes (bimodal).

Example: In the weights data set: 223, 235, 235, 268, 274, 285, 290. The entry 235 occurs twice, more than any other value. The mode is 235 pounds.

Example (Categorical Data): In a survey of political party affiliation, the mode is the party with the most responses (e.g., Democrat).
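
The rules for the mode can be sketched with `collections.Counter`. The function name `modes` and the small survey list are made up for illustration; only the weights come from the example above. The function returns an empty list when no entry is repeated, following the "no mode" convention used in these notes.

```python
from collections import Counter

def modes(data):
    """Return every entry with the greatest frequency, or [] if nothing repeats."""
    counts = Counter(data)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:                    # no entry is repeated: no mode
        return []
    return [value for value, count in counts.items() if count == top]

weights = [223, 235, 235, 268, 274, 285, 290]
weight_modes = modes(weights)               # [235]

no_mode = modes([1, 2, 3])                  # []
bimodal = modes([1, 1, 2, 2, 3])            # [1, 2]

# Hypothetical survey responses; the mode also works for categorical data.
responses = ["Democrat", "Republican", "Democrat", "Independent"]
party_mode = modes(responses)               # ["Democrat"]
```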

Comparing Mean, Median, and Mode

Each measure of central tendency describes a typical entry of a data set, but they have different properties and are affected differently by the data's characteristics.

Mean: Uses all data values; sensitive to outliers.
Median: Resistant to outliers; represents the center of ordered data.
Mode: May not exist or may not represent a typical value, especially in continuous data.

Example: Ages in a class: 20, 20, 20, 20, 20, 21, 21, 22, 22, 22, 65 (outlier). Mean: $\bar{x} = \frac{273}{11} \approx 24.8$ years. Median: $21$ years (the sixth of the 11 ordered entries). Mode: $20$ years. The mean is pulled upward by the outlier (65), while the median is not. The mode may not represent a typical value if the data is skewed.
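
To see how much the outlier matters, the three measures can be computed with and without the age 65. This is a minimal sketch using Python's `statistics` module; the variable names are illustrative.

```python
from statistics import mean, median, mode

# Ages from the example above; 65 is the outlier.
ages = [20, 20, 20, 20, 20, 21, 21, 22, 22, 22, 65]
no_outlier = [a for a in ages if a != 65]

mean_all = round(mean(ages), 1)           # 24.8
median_all = median(ages)                 # 21
mode_all = mode(ages)                     # 20

mean_trimmed = round(mean(no_outlier), 1) # 20.8
median_trimmed = median(no_outlier)       # 20.5
mode_trimmed = mode(no_outlier)           # 20

print(mean_all, median_all, mode_all)
print(mean_trimmed, median_trimmed, mode_trimmed)
```

Dropping the single outlier moves the mean by about 4 years but the median by only half a year, and the mode does not change.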

Weighted Mean

The weighted mean is used when data entries have varying weights, such as in grade point averages. It is calculated by multiplying each entry by its weight, summing these products, and dividing by the sum of the weights.

Formula:

$\bar{x} = \frac{\sum xw}{\sum w}$, where $w$ is the weight of each entry $x$.

Example: Grades and credit hours:

Grade (x)	Credit Hours (w)
A (4)	3
B (3)	4
C (2)	3
D (1)	2
F (0)	4

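Sum of products: $\sum xw = 4(3) + 3(4) + 2(3) + 1(2) + 0(4) = 32$. Sum of weights: $\sum w = 3 + 4 + 3 + 2 + 4 = 16$. Weighted mean: $\bar{x} = \frac{32}{16} = 2.0$. The grade point average is 2.0.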
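
The weighted mean for the table above can be recomputed in Python. This is a sketch under the assumption that each row is stored as a (grade points, credit hours) pair; the names `courses` and `weighted_mean` are illustrative.

```python
def weighted_mean(pairs):
    """Sum of x*w divided by the sum of the weights w."""
    sum_xw = sum(x * w for x, w in pairs)
    sum_w = sum(w for _, w in pairs)
    return sum_xw / sum_w

# (grade points x, credit hours w) taken row by row from the table above.
courses = [(4, 3), (3, 4), (2, 3), (1, 2), (0, 4)]

sum_of_products = sum(x * w for x, w in courses)
sum_of_weights = sum(w for _, w in courses)
gpa = weighted_mean(courses)

print(sum_of_products, sum_of_weights, gpa)
```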

Mean of Grouped Data

When data is presented in a frequency distribution, the mean can be estimated using class midpoints and frequencies.

Formula:

$\bar{x} = \frac{\sum xf}{n}$

where $x$ is the class midpoint, $f$ is the class frequency, and $n = \sum f$.

Steps:

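Find the midpoint of each class: $x = \frac{\text{lower limit} + \text{upper limit}}{2}$
Multiply each midpoint by its frequency ($xf$)
Sum all $xf$ values ($\sum xf$)
Sum all frequencies ($n = \sum f$)
Calculate the mean: $\bar{x} = \frac{\sum xf}{n}$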

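Example: For a frequency distribution of screen times, once $\sum xf$ and $n$ are known, the estimated mean screen time in minutes is $\bar{x} = \frac{\sum xf}{n}$.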
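
The steps above can be sketched in Python. The frequency distribution below is hypothetical, invented only to exercise the procedure; it is not the screen-time data from the example. The function name `grouped_mean` is likewise illustrative.

```python
def grouped_mean(classes):
    """Estimate the mean from (lower limit, upper limit, frequency) triples."""
    sum_xf = 0.0
    n = 0
    for lower, upper, f in classes:
        x = (lower + upper) / 2   # step 1: class midpoint
        sum_xf += x * f           # steps 2-3: accumulate midpoint * frequency
        n += f                    # step 4: accumulate the frequencies
    return sum_xf / n             # step 5: estimated mean

# Hypothetical classes (NOT from the notes): limits and frequencies are made up.
classes = [(0, 9, 2), (10, 19, 5), (20, 29, 3)]

estimate = grouped_mean(classes)
print(estimate)  # 15.5
```

The result is an estimate because every entry in a class is treated as if it sat at the class midpoint.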

Shape of Distributions

The shape of a data distribution affects which measure of central tendency best represents the data. Common shapes include:

Symmetric Distribution: The left and right halves of the graph are approximately mirror images. Mean and median are equal.
Uniform Distribution: All classes have approximately equal frequencies; the graph is rectangular.
Skewed Left (Negatively Skewed): The tail extends to the left; mean is less than the median.
Skewed Right (Positively Skewed): The tail extends to the right; mean is greater than the median.

Comparison Table:

Distribution Shape	Mean vs. Median	Tail Direction
Symmetric	Mean = Median	None
Uniform	Mean = Median	None
Skewed Left	Mean < Median	Left
Skewed Right	Mean > Median	Right

Choosing the best measure of central tendency depends on the shape and characteristics of the data set.
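
As a rough illustration of the comparison table, the direction of the tail can be guessed by comparing the mean with the median. This is only a rule-of-thumb sketch, not a formal test of skewness, and it can mislead on small or unusual data sets. The function name `tail_direction` is hypothetical; the ages come from the class example earlier, and the mirrored list is constructed from them purely for illustration.

```python
import math
from statistics import mean, median

def tail_direction(data):
    """Guess the tail direction from the table: compare the mean with the median."""
    m, med = mean(data), median(data)
    if math.isclose(m, med):
        return "none"                         # mean = median
    return "right" if m > med else "left"     # mean > median / mean < median

ages = [20, 20, 20, 20, 20, 21, 21, 22, 22, 22, 65]
mirrored = [85 - a for a in ages]    # constructed mirror image of the ages
balanced = [1, 2, 3, 4, 5]           # symmetric around 3

print(tail_direction(ages))      # right
print(tail_direction(mirrored))  # left
print(tail_direction(balanced))  # none
```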