Descriptive Statistics: Measures of Central Tendency and Variability

Chapter 3: Calculating Descriptive Statistics

Overview

This chapter introduces the foundational concepts of descriptive statistics, focusing on how to summarize and describe data using measures of central tendency and variability. These concepts are essential for analyzing business data and making informed decisions.

Measures of Central Tendency

Introduction to Central Tendency

Central tendency refers to a single value that represents the center or typical value of a dataset. The three most common measures are the mean, median, and mode. In some cases, a weighted mean is also used.

  • Mean: The arithmetic average of all data values.

  • Median: The middle value when data are ordered.

  • Mode: The value that appears most frequently.

  • Weighted Mean: The mean where some values contribute more than others.

The Mean

The mean is the most widely used measure of central tendency. It is calculated by summing all values and dividing by the number of observations.

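  • Sample Mean Formula: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$, where $n$ is the number of observations in the sample.

  • Population Mean Formula: $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$, where $N$ is the number of observations in the population.

  • Example: For a sample with values 6.2, 7.1, 4.8, 9.0, 3.3: $\bar{x} = \frac{6.2 + 7.1 + 4.8 + 9.0 + 3.3}{5} = \frac{30.4}{5} = 6.08$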
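The sample mean for the five values in the example can be checked with a short Python sketch. The variable names are illustrative only; `statistics.mean` is part of the Python standard library.

```python
from statistics import mean

# Sample values from the example
sample = [6.2, 7.1, 4.8, 9.0, 3.3]

# Sum of the observations divided by the number of observations
manual_mean = sum(sample) / len(sample)

# The standard library computes the same quantity
library_mean = mean(sample)

print(round(manual_mean, 2))  # 6.08
```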

The Weighted Mean

The weighted mean is used when different data values contribute unequally to the mean. Each value is multiplied by a weight reflecting its importance or frequency.

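  • Weighted Mean Formula: $\bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$, where $w_i$ is the weight assigned to value $x_i$.

  • Example Table:

# of days worked | Frequency
3                | 4
4                | 7
5                | 6
6                | 3

Weighted mean calculation: $\bar{x}_w = \frac{(4)(3) + (7)(4) + (6)(5) + (3)(6)}{4 + 7 + 6 + 3} = \frac{88}{20} = 4.4$ days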
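As a rough check, the weighted mean for the days-worked table can be computed in Python, treating each frequency as the weight for its value. The function name `weighted_mean` is a hypothetical helper written for this sketch, not a library call.

```python
# Values and their frequencies from the example table
days_worked = [3, 4, 5, 6]
frequency = [4, 7, 6, 3]


def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


result = weighted_mean(days_worked, frequency)
print(result)  # 4.4
```

Because the weights here are frequencies, the result equals the ordinary mean of the 20 individual observations.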

Advantages and Disadvantages of the Mean

  • Advantages: Simple to calculate and widely understood.

  • Disadvantages: Sensitive to outliers; does not reveal the distribution of data.

Example: Both samples [999, 1000, 1001] and [0, 1000, 2000] have a mean of 1000, but their distributions are very different.
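A small Python sketch makes the point concrete: the two samples share a mean, but a simple measure of spread (here the range, maximum minus minimum) separates them. The variable names are illustrative.

```python
from statistics import mean

tight = [999, 1000, 1001]
wide = [0, 1000, 2000]

# Both samples have the same mean
tight_mean = mean(tight)
wide_mean = mean(wide)

# The range (max - min) shows how differently the values are spread
tight_range = max(tight) - min(tight)
wide_range = max(wide) - min(wide)

print(tight_mean == wide_mean)  # True
print(tight_range, wide_range)  # 2 2000
```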

The Median

The median is the value that divides the ordered dataset into two halves, each containing the same number of observations. It is less affected by outliers than the mean.

  • Arrange data in ascending order.

  • For odd n: Median is the middle value.

  • For even n: Median is the average of the two middle values.

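Index Point Formula: $i = \frac{P}{100}(n)$, where $P = 50$ for the median and $n$ is the number of observations. If $i$ is not a whole number, round up to the next whole number and take the value at that position; if $i$ is a whole number, average the values at positions $i$ and $i + 1$.

Example (odd n): For [21, 27, 27, 28, 34, 45, 50], $i = \frac{50}{100}(7) = 3.5$ (round up to 4), so the median is the 4th value: 28.

Example (even n): For [145, 157, 170, 182, 204, 209], $i = \frac{50}{100}(6) = 3$, so the median is the average of the 3rd and 4th values: $\frac{170 + 182}{2} = 176$.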
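The two median examples can be reproduced with a short Python sketch. It assumes the index point is i = (P/100)·n with P = 50, rounded up when it is not a whole number; the function name `median_by_index_point` is hypothetical and chosen for this illustration. The result is compared against `statistics.median` from the standard library.

```python
import math
from statistics import median


def median_by_index_point(data):
    """Median via the index-point rule with P = 50 (assumed form: i = P/100 * n)."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("data must not be empty")
    i = (50 / 100) * n
    if i != int(i):
        # Not a whole number: round up and take the value at that position
        position = math.ceil(i)
        return ordered[position - 1]  # positions are 1-based, lists are 0-based
    # Whole number: average the values at positions i and i + 1
    position = int(i)
    return (ordered[position - 1] + ordered[position]) / 2


odd_sample = [21, 27, 27, 28, 34, 45, 50]
even_sample = [145, 157, 170, 182, 204, 209]

print(median_by_index_point(odd_sample))   # 28
print(median_by_index_point(even_sample))  # 176.0
```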

The Mode

The mode is the value that appears most frequently in a dataset. It is especially useful for categorical data.

  • There may be no mode, one mode, or multiple modes.

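  • Example (numerical): For [0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,4,4,5], the values 0, 1, and 2 each appear four times, more often than any other value, so the data have three modes: 0, 1, and 2.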

  • Example (categorical): If Toyota appears most often among car models, Toyota is the mode.
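A minimal Python sketch that counts frequencies for the numerical example list. `collections.Counter` tallies each value, and `statistics.multimode` (Python 3.8+) returns every value tied for the highest count, which covers the multiple-mode case. The car list is hypothetical data made up for this illustration.

```python
from collections import Counter
from statistics import multimode

data = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5]

# Frequency of each value
counts = Counter(data)

# All values that share the highest frequency
modes = multimode(data)

print(counts[0], counts[1], counts[2], counts[3])  # 4 4 4 3
print(modes)  # [0, 1, 2]

# The same call works for categorical data (hypothetical car makes)
cars = ["Toyota", "Ford", "Toyota", "Honda", "Toyota", "Ford"]
car_modes = multimode(cars)
print(car_modes)  # ['Toyota']
```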

Comparing Mean, Median, and Mode

Each measure has its own strengths and weaknesses. The mean is most common, but the median is preferred when outliers are present, and the mode is used for categorical data.

Measure | Advantages                           | Disadvantages
Mean    | Easy to calculate; widely understood | Heavily affected by outliers
Median  | Not affected by extreme values       | Requires data to be sorted
Mode    | Can be used with categorical data    | May not exist or may not be unique
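To illustrate the outlier sensitivity of the mean compared with the median, here is a brief Python sketch. The salary figures are hypothetical values invented for this example, not data from the chapter.

```python
from statistics import mean, median

# Hypothetical salaries in thousands of dollars
salaries = [40, 42, 45, 47, 50]
with_outlier = salaries + [500]  # add one extreme value

mean_before = mean(salaries)
median_before = median(salaries)
mean_after = mean(with_outlier)
median_after = median(with_outlier)

# The mean jumps sharply, while the median barely moves
print(mean_before, median_before)
print(round(mean_after, 2), median_after)
```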

Additional info:

Further topics such as measures of variability, working with grouped data, and measures of association are introduced in the chapter map but not detailed in the provided slides. These would include concepts like range, variance, standard deviation, percentiles, and correlation, which are essential for a comprehensive understanding of descriptive statistics in business contexts.
