Measures of Position and Outliers: Z-scores, Percentiles, Quartiles, IQR, Outliers, Five-Number Summary, and Boxplots

Measures of Position and Outliers

Overview

This section covers essential statistical tools for describing the position of data values within a dataset and identifying unusual observations. Topics include z-scores, percentiles, quartiles, interquartile range (IQR), outliers, the five-number summary, and boxplots.

Z-scores

Definition and Calculation

The z-score represents the distance that a data value is from the mean, measured in standard deviations. It is a standardized value that allows comparison across different datasets.

  • Population z-score: $z = \frac{x - \mu}{\sigma}$

  • Sample z-score: $z = \frac{x - \bar{x}}{s}$

Interpretation

  • Standardizing every value in a data set produces a new variable (the z-scores) with mean 0 and standard deviation 1.

  • The value of the z-score reflects the relative standing of the measurement:

    • If $z = 0$, then $x = \mu$ (the mean).

    • If $z < 0$, then $x < \mu$ (below the mean).

    • If $z > 0$, then $x > \mu$ (above the mean).
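
The standardization described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `z_scores` is hypothetical, the sample formula (dividing by $s$) is assumed, and the data are the vehicle speeds used in the quartiles example of these notes.

```python
from statistics import mean, stdev

def z_scores(values):
    """Return the sample z-score (x - mean) / s for every value."""
    x_bar = mean(values)
    s = stdev(values)  # sample standard deviation (divides by n - 1)
    return [(x - x_bar) / s for x in values]

# Vehicle speeds (mph) from the quartiles example in these notes.
speeds = [20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40]
z = z_scores(speeds)

# The standardized values have mean 0 and standard deviation 1
# (up to floating-point rounding).
print(round(mean(z), 6), round(stdev(z), 6))

# Values with a negative z-score are exactly the values below the mean.
below_mean = [x for x, zi in zip(speeds, z) if zi < 0]
print(below_mean)
```

Because the z-scores are unit-free, the same function could be applied to two different exams before comparing individual scores.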

Example: Z-score Comparison

Imene scored 88 on an exam (, ), Akito scored 91 (, ).

  • Imene:

  • Akito:

Imene performed relatively better, being further above the mean in standard deviation units.

Empirical Rule for Z-scores

If the frequency distribution is bell-shaped (normal):

  • Approximately 68% of observations have z-scores within (-1, 1).

  • Approximately 95% within (-2, 2).

  • Approximately 99.7% within (-3, 3).
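
As a rough check of these percentages, the sketch below evaluates the standard normal distribution with Python's built-in `statistics.NormalDist`. The helper name `share_within` is hypothetical, and the result applies only to an exactly normal distribution; real bell-shaped data will match only approximately.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution with z-scores between -k and k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    # Prints values close to 68%, 95%, and 99.7% respectively.
    print(k, round(100 * share_within(k), 1))
```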

Percentiles

Definition

The kth percentile, denoted $P_k$, of a data set arranged in ascending order is the value such that $k\%$ of the observations fall at or below $P_k$ and $(100 - k)\%$ fall above $P_k$.

Example: Interpreting Percentiles

If a score of 600 is in the 74th percentile on the SAT Mathematics exam, it means 74% of scores are less than or equal to 600, and 26% are greater.

Quartiles

Definition

Quartiles divide data into four equal parts:

  • Q1: 25th percentile

  • Q2: 50th percentile (median)

  • Q3: 75th percentile

Example: Quartiles Calculation

Given vehicle speeds: 20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40

  • Median ($Q_2$): Mean of the 7th and 8th values: $\frac{32 + 33}{2} = 32.5$

  • First quartile ($Q_1$): Median of the first 7 values: 28

  • Third quartile ($Q_3$): Median of the last 7 values: 38
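
The calculation above splits the ordered data into a lower and an upper half and takes the median of each. A minimal Python sketch of that convention follows. The function name `quartiles` is hypothetical, and for an odd number of values the sketch assumes the middle value is left out of both halves. Other quartile conventions exist (statistical software often interpolates), so results from other tools may differ slightly.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]               # values below the median position
    upper = data[len(data) - half:]   # values above the median position
    return median(lower), median(data), median(upper)

speeds = [20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40]
q1, q2, q3 = quartiles(speeds)
print(q1, q2, q3)  # 28 32.5 38
```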

Interpretation

  • 25% of speeds ≤ 28 mph; 75% > 28 mph

  • 50% ≤ 32.5 mph; 50% > 32.5 mph

  • 75% ≤ 38 mph; 25% > 38 mph

Interquartile Range (IQR)

Definition

The interquartile range (IQR) is the range of the middle 50% of the observations in a data set: $\text{IQR} = Q_3 - Q_1$

Example

For vehicle speeds, $Q_1 = 28$, $Q_3 = 38$:

  • $\text{IQR} = 38 - 28 = 10$ mph

The middle 50% of car speeds range over 10 mph.

Effect of Outliers on Summary Statistics

Example Table

Suppose a 15th car travels at 100 mph. How does this affect summary statistics?

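| Statistic          | Without 15th car | With 15th car |
|--------------------|------------------|---------------|
| Mean               | 32.1 mph         | 36.7 mph      |
| Median             | 32.5 mph         | 33 mph        |
| Standard deviation | 6.2 mph          | 18.5 mph      |
| IQR                | 10 mph           | 11 mph        |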

Additional info: Outliers have a large effect on the mean and standard deviation, but less effect on the median and IQR.
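
The comparison can be reproduced with a short Python sketch. It assumes the sample standard deviation (dividing by $n - 1$) and the median-of-halves convention for quartiles. The helper names `iqr` and `summary` are hypothetical.

```python
from statistics import mean, median, stdev

def iqr(values):
    """Interquartile range using the median-of-halves convention."""
    data = sorted(values)
    half = len(data) // 2
    return median(data[len(data) - half:]) - median(data[:half])

def summary(values):
    """Mean, median, sample standard deviation, and IQR (rounded for display)."""
    return {
        "mean": round(mean(values), 1),
        "median": median(values),
        "sd": round(stdev(values), 1),
        "iqr": iqr(values),
    }

speeds = [20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40]
without_car = summary(speeds)
with_car = summary(speeds + [100])  # add the 15th car at 100 mph

print(without_car)
print(with_car)
```

The mean and standard deviation move substantially when the 100 mph value is added, while the median and IQR barely change.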

Outliers

Definition

An outlier is an observation that is unusually large or small relative to the other values in a data set.

  • Outliers may occur by chance or result from measurement error, data entry error, or sampling error.

Detecting Outliers: Quartiles Method

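  • Step 1: Determine $Q_1$ and $Q_3$.

  • Step 2: Compute IQR: $\text{IQR} = Q_3 - Q_1$

  • Step 3: Calculate fences:

    • Lower Fence (LF): $\text{LF} = Q_1 - 1.5 \times \text{IQR}$

    • Upper Fence (UF): $\text{UF} = Q_3 + 1.5 \times \text{IQR}$

  • Any value less than LF or greater than UF is an outlier.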
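
The steps above can be sketched in Python as follows, using the usual 1.5 × IQR rule for the fences. The function names `fences` and `find_outliers` are hypothetical, and the quartiles are passed in rather than computed so that the sketch does not depend on a particular quartile convention. The quartile values 21 and 38 used for the second data set are an assumption: they are the values that reproduce the fences of -4.5 and 63.5 shown for Example 2.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) using the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values, q1, q3):
    """Return the values that lie below the lower fence or above the upper fence."""
    lower, upper = fences(q1, q3)
    return [x for x in values if x < lower or x > upper]

# Example 1: vehicle speeds with Q1 = 28 and Q3 = 38.
speeds = [20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40]
print(fences(28, 38), find_outliers(speeds, 28, 38))  # (13.0, 53.0) []

# Example 2: quartiles 21 and 38 are assumed (see the note above).
data = [5, 15, 16, 20, 21, 25, 26, 27, 30, 30,
        31, 32, 32, 34, 35, 38, 38, 41, 43, 77]
print(fences(21, 38), find_outliers(data, 21, 38))  # (-4.5, 63.5) [77]
```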

Example 1: No Outliers

  • $Q_1 = 28$, $Q_3 = 38$, $\text{IQR} = 10$

  • LF: $28 - 1.5(10) = 13$ mph

  • UF: $38 + 1.5(10) = 53$ mph

  • No values below 13 or above 53 mph; no outliers.

Example 2: Outlier Detected

  • Data: 5, 15, 16, 20, 21, 25, 26, 27, 30, 30, 31, 32, 32, 34, 35, 38, 38, 41, 43, 77

  • $Q_1 = 21$, $Q_3 = 38$, $\text{IQR} = 38 - 21 = 17$

  • LF: $21 - 1.5(17) = -4.5$

  • UF: $38 + 1.5(17) = 63.5$

  • 77 is above UF; it is an outlier.

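[Figure: number line marking the lower fence (LF = -4.5) and the upper fence (UF = 63.5). Values between the fences are usual; values outside the fences are outliers, such as 77.]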

Five-Number Summary

Definition

The five-number summary consists of:

  • Minimum

  • First quartile ($Q_1$)

  • Median ($Q_2$)

  • Third quartile ($Q_3$)

  • Maximum

Comments

  • The median is a resistant measure of central tendency.

  • The IQR is a resistant measure of variation.

  • Minimum and maximum describe the tails of the distribution.

Example: Credit Card Interest Rates

| Institution                           | Rate  |
|---------------------------------------|-------|
| Pulaski Bank and Trust Company        | 6.5%  |
| Rainier Pacific Savings Bank          | 12.0% |
| Wells Fargo Bank NA                   | 14.4% |
| Firstbank of Colorado                 | 14.4% |
| Lafayette Ambassador Bank             | 14.3% |
| Infibank                              | 13.0% |
| United Bank, Inc.                     | 13.3% |
| First National Bank Of The Mid-Cities | 13.9% |
| Bank of Louisiana                     | 9.9%  |
| Bar Harbor Bank and Trust Company     | 14.5% |

  • Ordered rates: 6.5%, 9.9%, 12.0%, 13.0%, 13.3%, 13.9%, 14.3%, 14.4%, 14.4%, 14.5%

  • Five-number summary: 6.5%, 12.0%, 13.6%, 14.4%, 14.5%
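
The summary above can be checked with a minimal Python sketch. The function name `five_number_summary` is hypothetical, and the quartiles are computed as medians of the lower and upper halves of the ordered data, which is an assumed convention that happens to reproduce the values listed here.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[len(data) - half:])
    return data[0], q1, median(data), q3, data[-1]

# Credit card interest rates (%) in the order listed in the table.
rates = [6.5, 12.0, 14.4, 14.4, 14.3, 13.0, 13.3, 13.9, 9.9, 14.5]
result = five_number_summary(rates)
print([round(v, 1) for v in result])  # [6.5, 12.0, 13.6, 14.4, 14.5]
```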

Boxplots

Definition and Construction

A boxplot is a graphical representation of the five-number summary, useful for visualizing the distribution and identifying outliers.

  • Step 1: Determine lower and upper inner fences:

    • Lower Fence: $\text{LF} = Q_1 - 1.5 \times \text{IQR}$

    • Upper Fence: $\text{UF} = Q_3 + 1.5 \times \text{IQR}$

  • Step 2: Draw a number line including min and max values. Insert vertical lines at $Q_1$, $Q_2$, and $Q_3$; enclose these lines in a box.

  • Step 3: Label the fences.

  • Step 4: Draw whiskers from $Q_1$ to the smallest value above LF, and from $Q_3$ to the largest value below UF.

  • Step 5: Mark outliers (values outside fences) with an asterisk (*).
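
Before drawing, it can help to compute the pieces of the plot numerically. The sketch below is one possible way to do that. The function name `boxplot_parts` is hypothetical, the quartiles are supplied by the caller, and values exactly on a fence are treated as not being outliers. The data are the 20 values from Example 2 of the outliers section, with assumed quartiles 21 and 38 (the values consistent with fences of -4.5 and 63.5).

```python
def boxplot_parts(values, q1, q3):
    """Return fences, whisker endpoints, and outliers for a boxplot."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Values inside the fences determine where the whiskers end.
    inside = [x for x in values if lower_fence <= x <= upper_fence]
    outliers = sorted(x for x in values if x < lower_fence or x > upper_fence)
    return {
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "whisker_low": min(inside),    # left whisker runs from Q1 to here
        "whisker_high": max(inside),   # right whisker runs from Q3 to here
        "outliers": outliers,          # marked individually, e.g. with *
    }

data = [5, 15, 16, 20, 21, 25, 26, 27, 30, 30,
        31, 32, 32, 34, 35, 38, 38, 41, 43, 77]
parts = boxplot_parts(data, q1=21, q3=38)
print(parts)
```

With these inputs the whiskers end at 5 and 43, and 77 is the only value flagged as an outlier.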

Example: TV-viewing Data

  • Five-number summary: Min $= 5$, $Q_1 = 21$, $Q_2 = 30.5$, $Q_3 = 38$, Max $= 77$

  • Outlier: 77 (above UF)

  • Adjacent values: 5 (smallest value above LF), 43 (largest value below UF)

Simple Boxplot

If no outliers are present, the boxplot consists only of the box and the whiskers, which extend to the minimum and maximum values.

Describing Distribution Shape with Boxplots and Quartiles

Boxplots and quartiles can be used to describe the shape of a distribution:

  • Skewed right: Median closer to $Q_1$, longer whisker to the right.

  • Symmetric: Median centered, whiskers of similar length.

  • Skewed left: Median closer to $Q_3$, longer whisker to the left.

For example, the interest rate boxplot indicates a distribution skewed left.
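
The position of the median within the box can be checked numerically. The sketch below is a rough heuristic only: the function name `shape_from_quartiles` is hypothetical, and it compares just the two halves of the box, ignoring whisker lengths, so it should be read alongside the plot rather than in place of it.

```python
def shape_from_quartiles(q1, q2, q3):
    """Classify shape by where the median sits between Q1 and Q3."""
    lower_gap = q2 - q1   # distance from Q1 to the median
    upper_gap = q3 - q2   # distance from the median to Q3
    if lower_gap > upper_gap:
        return "skewed left"       # median closer to Q3
    if lower_gap < upper_gap:
        return "skewed right"      # median closer to Q1
    return "roughly symmetric"

# Interest rate example: Q1 = 12.0, median = 13.6, Q3 = 14.4.
rate_shape = shape_from_quartiles(12.0, 13.6, 14.4)
print(rate_shape)  # skewed left
```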
