Descriptive Statistics: Measures of Central Tendency
Study Guide - Smart Notes
Descriptive Statistics
Introduction
Descriptive statistics involves methods for organizing, displaying, and summarizing data. One of the key aspects is understanding measures that describe the center or typical value of a data set, known as measures of central tendency.
Measures of Central Tendency
Definition and Overview
A measure of central tendency is a value that represents a typical or central entry of a data set. The most common measures are the mean, median, and mode.
Mean: The arithmetic average of the data set.
Median: The middle value when the data set is ordered.
Mode: The value that occurs most frequently in the data set.
Mean
The mean is calculated by summing all data entries and dividing by the number of entries. It is commonly referred to as the average.
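Population Mean: Denoted by $\mu$, calculated as: $\mu = \frac{\sum x}{N}$, where $N$ is the number of entries in the population.
Sample Mean: Denoted by $\bar{x}$, calculated as: $\bar{x} = \frac{\sum x}{n}$, where $n$ is the number of entries in the sample.
Sigma Notation: $\sum x$ means to add all the data entries $x$ in the data set.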
Example: Finding a Sample Mean
Given weights (in pounds) for a sample of adults: 274, 235, 223, 268, 290, 285, 235
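Sum of weights: $\sum x = 274 + 235 + 223 + 268 + 290 + 285 + 235 = 1810$
Number of adults: $n = 7$
Sample mean: $\bar{x} = \frac{\sum x}{n} = \frac{1810}{7} \approx 258.6$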
The mean weight is about 258.6 pounds.
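The same arithmetic can be checked with a short Python sketch. The variable names are illustrative; only the weights come from the example.

```python
# Adult weights (in pounds) from the example above.
weights = [274, 235, 223, 268, 290, 285, 235]

total = sum(weights)        # sum of all data entries
n = len(weights)            # number of entries in the sample
sample_mean = total / n     # x-bar = (sum of x) / n

print(total, n, round(sample_mean, 1))  # 1810 7 258.6
```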
Median
The median is the value that lies in the middle of the data when the data set is ordered. It divides the data set into two equal parts.
If the data set has an odd number of entries, the median is the middle entry.
If the data set has an even number of entries, the median is the mean of the two middle entries.
Example: Finding the Median
Ordered weights: 223, 235, 235, 268, 274, 285, 290 (7 entries)
Median is the 4th entry: 268 pounds.
If one entry (285) is removed, the ordered weights are: 223, 235, 235, 268, 274, 290 (6 entries)
Median is the mean of the 3rd and 4th entries: $\frac{235 + 268}{2} = 251.5$ pounds.
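As a quick check of both cases, here is a minimal sketch using `median` from Python's standard `statistics` module, which sorts the data and averages the two middle entries when the count is even. The list names are illustrative.

```python
from statistics import median

# Adult weights (in pounds) from the example; median() sorts internally.
weights = [274, 235, 223, 268, 290, 285, 235]
print(median(weights))  # 268 (odd count: the middle entry)

# Remove the entry 285 to get an even number of entries.
weights_without_285 = [w for w in weights if w != 285]
print(median(weights_without_285))  # 251.5 (even count: mean of the two middle entries)
```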
Mode
The mode is the data entry that occurs with the greatest frequency. If no entry is repeated, the data set has no mode. If two entries occur with the same greatest frequency, the data set is bimodal.
Example: Finding the Mode
Ordered weights: 223, 235, 235, 268, 274, 285, 290
Entry 235 occurs twice; other entries occur once.
Mode is 235 pounds.
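The definition above (no mode when nothing repeats, more than one mode on a tie) can be sketched as follows. The helper name `modes` is hypothetical, and returning an empty list for "no mode" is a design choice made here to match these notes, not a library convention.

```python
from collections import Counter

def modes(entries):
    """Return every entry tied for the greatest frequency, or [] if no entry repeats."""
    counts = Counter(entries)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # no entry is repeated, so there is no mode
    return sorted(value for value, count in counts.items() if count == highest)

weights = [274, 235, 223, 268, 290, 285, 235]
print(modes(weights))          # [235]
print(modes([1, 2, 3]))        # []
print(modes([1, 1, 2, 2, 3]))  # [1, 2] (bimodal)
```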
Example: Mode in Categorical Data
| Political Party | Frequency |
|---|---|
| Democrat | Highest |
| Other parties | Lower |
The mode is 'Democrat', as it is the most frequent response.
Comparing Mean, Median, and Mode
Each measure describes a typical entry of a data set, but they have different properties:
Mean: Takes every entry into account; sensitive to outliers.
Median: Less affected by outliers; represents the center of ordered data.
Mode: May not always represent a typical value, especially in data sets with no repeated values or multiple modes.
Example: Ages in a Class
| Measure | Value |
|---|---|
| Mean | 23.8 years |
| Median | 21.5 years |
| Mode | 20 years |
The mean is influenced by an outlier (age 65), while the median is less affected. The mode exists but may not represent a typical entry.
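To see how a single outlier pulls the mean more than the median, consider the sketch below. The ages are made-up values for illustration only; they are not the class data behind the table above.

```python
from statistics import mean, median

# Made-up ages for illustration only (not the class data summarized above).
ages = [20, 20, 21, 22, 23]
ages_with_outlier = ages + [65]

print(mean(ages), median(ages))                            # 21.2 21
print(mean(ages_with_outlier), median(ages_with_outlier))  # 28.5 21.5
```

Adding the one large value raises the mean by more than 7 years but moves the median by only half a year.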
Weighted Mean
Definition and Calculation
The weighted mean is used when data entries have varying weights. It is calculated as: $\bar{x} = \frac{\sum (x \cdot w)}{\sum w}$
where $x$ is the data entry and $w$ is its weight.
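A minimal sketch of this formula in Python follows. The function name `weighted_mean` and the sample grade points and credit hours are hypothetical, chosen only to illustrate the calculation.

```python
def weighted_mean(entries, weights):
    """Return sum(x * w) / sum(w) for paired entries and weights."""
    if len(entries) != len(weights):
        raise ValueError("entries and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(x * w for x, w in zip(entries, weights)) / total_weight

# Hypothetical values: grade points earned and the credit hours of each course.
grade_points = [4, 3, 2]
credit_hours = [3, 4, 1]

print(weighted_mean(grade_points, credit_hours))  # 3.25
```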
Example: Grade Point Average
| Grade | Credit Hours |
|---|---|
| A (4 points) | 3 |
| B (3 points) | 3 |
| C (2 points) | 3 |
| D (1 point) | 3 |
| F (0 points) | 4 |
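Weighted mean: $\bar{x} = \frac{\sum (x \cdot w)}{\sum w} = \frac{(4)(3) + (3)(3) + (2)(3) + (1)(3) + (0)(4)}{3 + 3 + 3 + 3 + 4} = \frac{30}{16} \approx 1.88$
For the credit hours listed in the table, the grade point average is about 1.88.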
Mean of Grouped Data
Estimating the Mean from a Frequency Distribution
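When data is grouped into classes, the mean can be approximated using class midpoints and frequencies: $\bar{x} = \frac{\sum (x \cdot f)}{n}$
where $x$ is the midpoint of each class, $f$ is the frequency of that class, and $n = \sum f$.
Find the midpoint of each class: $x = \frac{\text{lower limit} + \text{upper limit}}{2}$
Multiply each midpoint by its frequency and sum: $\sum (x \cdot f)$
Sum the frequencies: $n = \sum f$
Calculate the mean: $\bar{x} = \frac{\sum (x \cdot f)}{n}$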
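The four steps above can be sketched in Python as shown below. The function name `grouped_mean` and the class limits and frequencies are hypothetical; they are not the screen time data from these notes.

```python
def grouped_mean(classes):
    """Estimate the mean from (lower_limit, upper_limit, frequency) tuples."""
    # Step 1: midpoint of each class.
    midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
    frequencies = [f for _, _, f in classes]
    # Step 2: multiply each midpoint by its frequency and sum.
    sum_xf = sum(x * f for x, f in zip(midpoints, frequencies))
    # Step 3: sum the frequencies.
    n = sum(frequencies)
    if n == 0:
        raise ValueError("total frequency must be positive")
    # Step 4: divide to get the estimated mean.
    return sum_xf / n

# Hypothetical frequency distribution: (lower limit, upper limit, frequency).
example_classes = [(0, 9, 2), (10, 19, 5), (20, 29, 3)]
print(grouped_mean(example_classes))  # 15.5
```

The result is an estimate because every entry in a class is treated as if it sat at the class midpoint.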
Example: Screen Time
Total
Total
Mean screen time: minutes
This is an estimate based on class midpoints.
Shape of Distributions
Types of Distributions
The shape of a data distribution affects the relationship between mean and median.
Symmetric Distribution: The left and right halves of the graph are approximately mirror images. The mean and median are approximately equal (exactly equal when the symmetry is perfect).
Uniform Distribution: All classes have equal or nearly equal frequencies. The distribution is symmetric and rectangular.
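Skewed Left (Negatively Skewed): The tail of the graph extends more to the left. The mean is typically less than the median, because the low values in the tail pull the mean down.
Skewed Right (Positively Skewed): The tail of the graph extends more to the right. The mean is typically greater than the median, because the high values in the tail pull the mean up.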
| Distribution Type | Mean vs. Median | Description |
|---|---|---|
| Symmetric | Mean = Median | Halves are mirror images |
| Uniform | Mean = Median | All frequencies equal |
| Skewed Left | Mean < Median | Tail to the left |
| Skewed Right | Mean > Median | Tail to the right |
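The mean-versus-median comparison in the table can be turned into a rough check, sketched below. The function name `skew_hint`, the `tolerance` threshold, and the sample lists are all assumptions made for illustration. Comparing mean and median is only a heuristic and does not replace looking at the graph of the distribution.

```python
from statistics import mean, median

def skew_hint(data, tolerance=0.05):
    """Rough guess at the shape from mean vs. median (heuristic only).

    The gap is treated as negligible when it is within `tolerance` times the
    range of the data; that threshold is an arbitrary choice for this sketch.
    """
    m, med = mean(data), median(data)
    spread = max(data) - min(data)
    if spread == 0 or abs(m - med) <= tolerance * spread:
        return "roughly symmetric"
    return "skewed right" if m > med else "skewed left"

# Made-up data sets for illustration.
print(skew_hint([1, 2, 3, 4, 5]))                    # roughly symmetric
print(skew_hint([1, 2, 2, 3, 3, 3, 4, 20]))          # skewed right
print(skew_hint([1, 17, 18, 18, 19, 19, 19, 20]))    # skewed left
```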
Additional info: These concepts are foundational for further study in statistics, including inferential statistics and hypothesis testing.