Descriptive Statistics: Measures of Central Tendency

Study Guide - Smart Notes

Descriptive Statistics

Introduction

Descriptive statistics involves methods for organizing, displaying, and summarizing data. One of the key aspects is understanding measures that describe the center or typical value of a data set, known as measures of central tendency.

Measures of Central Tendency

Definition and Overview

A measure of central tendency is a value that represents a typical or central entry of a data set. The most common measures are the mean, median, and mode.

Mean: The arithmetic average of the data set.
Median: The middle value when the data set is ordered.
Mode: The value that occurs most frequently in the data set.

Mean

The mean is calculated by summing all data entries and dividing by the number of entries. It is commonly referred to as the average.

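Population Mean: Denoted by $\mu$, calculated as $\mu = \frac{\sum x}{N}$, where $N$ is the number of entries in the population.

Sample Mean: Denoted by $\bar{x}$, calculated as $\bar{x} = \frac{\sum x}{n}$, where $n$ is the number of entries in the sample.

Sigma Notation: $\sum x$ means to add all the data entries $x$ in the data set.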

Example: Finding a Sample Mean

Given weights (in pounds) for a sample of adults: 274, 235, 223, 268, 290, 285, 235

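Sum of weights: $\sum x = 274 + 235 + 223 + 268 + 290 + 285 + 235 = 1810$
Number of adults: $n = 7$
Sample mean: $\bar{x} = \frac{\sum x}{n} = \frac{1810}{7} \approx 258.6$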

The mean weight is about 258.6 pounds.
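
The arithmetic in this example can be checked with a few lines of Python. This is only a minimal sketch; the variable names (weights, total, n, sample_mean) are illustrative and do not come from the notes.

```python
# Weights (in pounds) from the sample of adults in the example.
weights = [274, 235, 223, 268, 290, 285, 235]

total = sum(weights)      # sum of all data entries
n = len(weights)          # number of entries in the sample
sample_mean = total / n   # sample mean = (sum of entries) / n

print(total, n, round(sample_mean, 1))  # 1810 7 258.6
```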

Median

The median is the value that lies in the middle of the data when the data set is ordered. It divides the data set into two equal parts.

If the data set has an odd number of entries, the median is the middle entry.
If the data set has an even number of entries, the median is the mean of the two middle entries.

Example: Finding the Median

Ordered weights: 223, 235, 235, 268, 274, 285, 290 (7 entries)

Median is the 4th entry: 268 pounds.

If one entry (285) is removed, the ordered weights are: 223, 235, 235, 268, 274, 290 (6 entries)

Median is the mean of the 3rd and 4th entries: $\frac{235 + 268}{2} = 251.5$ pounds.
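
The odd and even cases can be sketched as a small Python function. The function name median and the helper list weights_without_285 are assumptions made for illustration. Python's standard library also provides statistics.median, which follows the same rule.

```python
def median(values):
    """Return the middle entry of the ordered data.

    For an even number of entries, return the mean of the two middle entries.
    """
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: single middle entry
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the middle pair


weights = [274, 235, 223, 268, 290, 285, 235]
print(median(weights))  # 268

# Same data with the entry 285 removed (6 entries).
weights_without_285 = [w for w in weights if w != 285]
print(median(weights_without_285))  # 251.5
```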

Mode

The mode is the data entry that occurs with the greatest frequency. If no entry is repeated, the data set has no mode. If two entries occur with the same greatest frequency, each of them is a mode and the data set is bimodal.

Example: Finding the Mode

Ordered weights: 223, 235, 235, 268, 274, 285, 290

Entry 235 occurs twice; other entries occur once.
Mode is 235 pounds.

Example: Mode in Categorical Data

Political Party	Frequency
Democrat	Highest
Other parties	Lower

The mode is 'Democrat', as it is the most frequent response.
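
The mode rules above (no mode when nothing repeats, more than one mode when there is a tie) can be sketched in Python with collections.Counter. The function name modes is an assumption, and the list of survey responses is hypothetical, since the notes do not give the actual frequencies.

```python
from collections import Counter


def modes(values):
    """Return a list of the most frequent entries; empty if no entry repeats."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # no entry is repeated, so there is no mode
    return [value for value, count in counts.items() if count == highest]


weights = [223, 235, 235, 268, 274, 285, 290]
print(modes(weights))  # [235]

# Hypothetical categorical responses (not the actual survey data).
responses = ["Democrat", "Republican", "Democrat", "Independent", "Democrat"]
print(modes(responses))  # ['Democrat']
```

The mode works for categorical data because it only counts occurrences; the mean and median would not be defined for these responses.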

Comparing Mean, Median, and Mode

Each measure describes a typical entry of a data set, but they have different properties:

Mean: Takes every entry into account; sensitive to outliers.
Median: Less affected by outliers; represents the center of ordered data.
Mode: May not always represent a typical value, especially in data sets with no repeated values or multiple modes.

Example: Ages in a Class

Measure	Value
Mean	23.8 years
Median	21.5 years
Mode	20 years

The mean is influenced by an outlier (age 65), while the median is less affected. The mode exists but may not represent a typical entry.
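
The effect of the outlier can be illustrated with Python's statistics module. The notes do not list the individual ages, so the list below is an assumed data set chosen only because it reproduces the summary values in the table; treat it as hypothetical.

```python
import statistics

# Assumed ages (hypothetical): consistent with the table, but not given in the notes.
ages = [20, 20, 20, 20, 20, 20, 21, 21, 21, 21,
        22, 22, 22, 23, 23, 23, 23, 24, 24, 65]

print(statistics.mean(ages))    # 23.75
print(statistics.median(ages))  # 21.5
print(statistics.mode(ages))    # 20

# Remove the outlier (age 65) and compare.
ages_no_outlier = [age for age in ages if age != 65]
mean_no_outlier = statistics.mean(ages_no_outlier)
median_no_outlier = statistics.median(ages_no_outlier)
print(round(mean_no_outlier, 1), median_no_outlier)  # 21.6 21
```

In this sketch, dropping one entry moves the mean by about 2.2 years but the median by only 0.5 years.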

Weighted Mean

Definition and Calculation

The weighted mean is used when data entries have varying weights. It is calculated as:

$\bar{x} = \frac{\sum (x \cdot w)}{\sum w}$

where $x$ is a data entry and $w$ is its weight.

Example: Grade Point Average

Grade	Credit Hours
A (4 points)	3
B (3 points)	3
C (2 points)	3
D (1 point)	3
F (0 points)	4

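Weighted mean: $\bar{x} = \frac{\sum (x \cdot w)}{\sum w} = \frac{(4)(3) + (3)(3) + (2)(3) + (1)(3) + (0)(4)}{3 + 3 + 3 + 3 + 4} = \frac{30}{16} = 1.875$

Using the credit hours listed in the table, the grade point average is about 1.88.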
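
A weighted mean can be sketched in Python as follows. The pairs below use the grade points and credit hours exactly as they appear in the table; the names grades, weighted_sum, total_weight, and weighted_mean are illustrative.

```python
# (grade points, credit hours) pairs as listed in the table:
# each grade's points are the data entry x, the credit hours are the weight w.
grades = [(4, 3), (3, 3), (2, 3), (1, 3), (0, 4)]

weighted_sum = sum(x * w for x, w in grades)  # sum of x * w
total_weight = sum(w for _, w in grades)      # sum of w
weighted_mean = weighted_sum / total_weight

print(weighted_sum, total_weight, weighted_mean)  # 30 16 1.875
```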

Mean of Grouped Data

Estimating the Mean from a Frequency Distribution

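When data is grouped into classes, the mean can be approximated using class midpoints and frequencies:

$\bar{x} = \frac{\sum (x \cdot f)}{n}$

where $x$ is the midpoint of each class, $f$ is the frequency of the class, and $n = \sum f$.

Find the midpoint of each class: $x = \frac{\text{lower limit} + \text{upper limit}}{2}$
Multiply each midpoint by its frequency and sum: $\sum (x \cdot f)$
Sum the frequencies: $n = \sum f$
Calculate the mean: $\bar{x} = \frac{\sum (x \cdot f)}{n}$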

Example: Screen Time

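Total of the frequencies: $n = \sum f$
Total of the products: $\sum (x \cdot f)$
Mean screen time: $\bar{x} = \frac{\sum (x \cdot f)}{n}$ minutes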

This is an estimate based on class midpoints.
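
The four steps for grouped data can be sketched in Python. The frequency distribution below is hypothetical and is NOT the screen-time data from the example, whose class limits and frequencies are not reproduced in these notes.

```python
# Hypothetical frequency distribution (not the screen-time data):
# each tuple is (lower class limit, upper class limit, frequency).
classes = [(0, 9, 3), (10, 19, 5), (20, 29, 8), (30, 39, 4)]

# Step 1: midpoint of each class.
midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
frequencies = [f for _, _, f in classes]

# Step 2: multiply each midpoint by its frequency and sum.
sum_xf = sum(x * f for x, f in zip(midpoints, frequencies))

# Step 3: sum the frequencies.
n = sum(frequencies)

# Step 4: estimated mean of the grouped data.
grouped_mean = sum_xf / n

print(midpoints)             # [4.5, 14.5, 24.5, 34.5]
print(sum_xf, n, grouped_mean)  # 420.0 20 21.0
```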

Shape of Distributions

Types of Distributions

The shape of a data distribution affects the relationship between mean and median.

Symmetric Distribution: The left and right halves of the graph are approximately mirror images. Mean and median are approximately equal.
Uniform Distribution: All classes have equal or nearly equal frequencies. The distribution is symmetric and rectangular.
Skewed Left (Negatively Skewed): The tail of the graph extends more to the left. The mean is usually less than the median, because the low values in the tail pull the mean down.
Skewed Right (Positively Skewed): The tail of the graph extends more to the right. The mean is usually greater than the median, because the high values in the tail pull the mean up.

Distribution Type	Mean vs. Median	Description
Symmetric	Mean = Median	Halves are mirror images
Uniform	Mean = Median	All frequencies equal
Skewed Left	Mean < Median	Tail to the left
Skewed Right	Mean > Median	Tail to the right
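
The relationships in the table can be illustrated with three small data sets. All three lists are hypothetical and were chosen only to show each shape; they do not come from the notes.

```python
import statistics

# Hypothetical data sets, one per shape.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
skewed_right = [1, 1, 2, 2, 3, 4, 10]   # long tail toward large values
skewed_left = [1, 7, 8, 9, 9, 10, 10]   # long tail toward small values

for name, data in [("symmetric", symmetric),
                   ("skewed right", skewed_right),
                   ("skewed left", skewed_left)]:
    mean = statistics.mean(data)
    median = statistics.median(data)
    print(f"{name}: mean = {mean:.2f}, median = {median}")
```

For these lists the symmetric set has mean and median both equal to 3, the right-skewed set has a mean above its median of 2, and the left-skewed set has a mean below its median of 9.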

Additional info: These concepts are foundational for further study in statistics, including inferential statistics and hypothesis testing.