Measures of Relative Standing and Boxplots: Chapter 3 Study Notes
Describing, Exploring, and Comparing Data
Measures of Relative Standing and Boxplots
This section covers statistical methods for describing the position of data values within a data set. Key concepts include z scores, percentiles, quartiles, the 5-number summary, and boxplots. These tools help compare data values, identify outliers, and visualize data distribution.
z Scores
Definition and Calculation
A z score (also called standard score or standardized value) indicates how many standard deviations a data value x is above or below the mean. It allows for comparison across different data sets.
Sample z score: $z = \frac{x - \bar{x}}{s}$
Population z score: $z = \frac{x - \mu}{\sigma}$
Round-off Rule: Round z scores to two decimal places (e.g., 2.31).
Properties of z Scores
A z score is the number of standard deviations a value is above or below the mean.
z scores have no units of measurement.
A value is significantly low if $z \le -2$; significantly high if $z \ge 2$.
If a value is less than the mean, its z score is negative.
Example: Comparing Data Values
Given two data values from different sets:
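99°F body temperature (mean = 98.20°F, s = 0.62°F): $z = \frac{99 - 98.20}{0.62} \approx 1.29$
5.7790 g quarter (mean = 5.63930 g, s = 0.06194 g): $z = \frac{5.7790 - 5.63930}{0.06194} \approx 2.26$
Because 2.26 > 1.29, the quarter's weight is more extreme (farther above its mean, in standard deviations) than the body temperature.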
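The comparison can be checked numerically. This is a minimal Python sketch; the helper name `z_score` is hypothetical, and the inputs are the means and standard deviations quoted in the example.

```python
def z_score(x, mean, sd):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Values quoted in the example: body temperature and quarter weight.
z_temp = z_score(99.0, 98.20, 0.62)
z_quarter = z_score(5.7790, 5.63930, 0.06194)

# Round-off rule: two decimal places.
print(round(z_temp, 2), round(z_quarter, 2))

# The value with the larger |z| is the more extreme one.
more_extreme = "quarter" if abs(z_quarter) > abs(z_temp) else "temperature"
print(more_extreme)
```

Because z scores have no units, degrees Fahrenheit and grams can be compared directly in this way.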
Using z Scores to Identify Significant Values
Significantly low: $z \le -2$
Significantly high: $z \ge 2$
Not significant: $-2 < z < 2$
Example: Earthquake Magnitude
Given mean = 2.572, s = 0.651, magnitude = 4.01: $z = \frac{4.01 - 2.572}{0.651} \approx 2.21$
Since $z = 2.21 \ge 2$, the magnitude is significantly high.
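The earthquake example can be reproduced with a small classifier. This sketch assumes the customary cutoffs of -2 and 2 for significantly low and significantly high values; the function name `classify` is hypothetical.

```python
def classify(z):
    """Label a z score, assuming cutoffs of -2 and 2."""
    if z <= -2:
        return "significantly low"
    if z >= 2:
        return "significantly high"
    return "not significant"

# Earthquake magnitude example: mean 2.572, s 0.651, value 4.01.
z = round((4.01 - 2.572) / 0.651, 2)
print(z, classify(z))
```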
Percentiles
Definition
Percentiles are measures of location, denoted $P_1, P_2, \ldots, P_{99}$, dividing data into 100 groups with about 1% of values in each group.
Finding the Percentile of a Data Value
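To find the percentile for a value x: $\text{percentile of } x = \frac{\text{number of values less than } x}{\text{total number of values}} \cdot 100$ (rounded to the nearest whole number)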
Example: Percentile Calculation
For a wait time of 45 minutes among 50 sorted values, with 36 values less than 45: $\frac{36}{50} \cdot 100 = 72$
Interpretation: 45 minutes is the 72nd percentile ($P_{72}$).
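The same calculation as a short Python sketch. The notes give only the counts (36 of 50 values below 45 minutes), not the full data set, so the hypothetical helper below works from counts rather than from a list of wait times.

```python
def percentile_of_value(count_below, n):
    """Percentile of a value, given how many of the n values are less than it.

    The result is rounded to a whole number with Python's round().
    """
    return round(100 * count_below / n)

# 36 of the 50 wait times are less than 45 minutes.
print(percentile_of_value(36, 50))  # 72
```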
Notation
n: total number of values
k: percentile (e.g., $k = 25$ for 25th percentile)
L: locator for position in sorted list ($L = 12$ means 12th value)
$P_k$: kth percentile
Converting a Percentile to a Data Value
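Compute $L = \frac{k}{100} \cdot n$
If $L$ is not a whole number, round it up to the next whole number; $P_k$ is then the $L$th value in the sorted list
If $L$ is a whole number, $P_k$ is midway between the $L$th value and the next value in the sorted list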
Example: 25th Percentile
$k = 25$, $n = 50$
$L = \frac{25}{100} \cdot 50 = 12.5$ (round up to 13)
13th value is 25 minutes ($P_{25} = 25$ minutes)
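The locator procedure can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes the common textbook convention that a whole-number L places the percentile midway between the Lth value and the next one. The full wait-time data are not reproduced in these notes, so the list `sample` is made up purely for illustration.

```python
def locate(k, n):
    """Return (position, exact) for the kth percentile among n sorted values.

    L = k/100 * n is evaluated with integer arithmetic to avoid float error.
    If L is not whole it is rounded up; `exact` records whether L was whole.
    Intended for integer k with 1 <= k <= 99.
    """
    whole, remainder = divmod(k * n, 100)
    if remainder:
        return whole + 1, False
    return whole, True

def percentile_value(k, sorted_values):
    """Value of the kth percentile in an already sorted list."""
    position, exact = locate(k, len(sorted_values))
    if exact:
        # Assumed convention: midway between the Lth and (L+1)th values.
        return (sorted_values[position - 1] + sorted_values[position]) / 2
    return sorted_values[position - 1]  # Python lists are 0-indexed

# Example from the notes: k = 25, n = 50 gives L = 12.5, rounded up to 13.
print(locate(25, 50))  # (13, False)

# Made-up sorted values, not the wait-time data.
sample = [2, 4, 4, 5, 7, 9, 10, 12]
print(percentile_value(60, sample))
```

Software packages often use other interpolation rules, so results from a spreadsheet or statistics library may differ slightly from this procedure.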
Quartiles
Definition
Quartiles are measures of location, denoted $Q_1$, $Q_2$, $Q_3$, dividing data into four groups with about 25% of values in each group.
Descriptions of Quartiles
$Q_1$ (First quartile): Same as $P_{25}$; separates bottom 25% from top 75%.
$Q_2$ (Second quartile): Same as $P_{50}$ and the median; separates bottom 50% from top 50%.
$Q_3$ (Third quartile): Same as $P_{75}$; separates bottom 75% from top 25%.
Caution: Procedures for finding percentiles and quartiles may vary between technologies.
Statistics Defined Using Quartiles and Percentiles
Interquartile range (IQR): $Q_3 - Q_1$
Semi-interquartile range: $\frac{Q_3 - Q_1}{2}$
Midquartile: $\frac{Q_3 + Q_1}{2}$
10–90 percentile range: $P_{90} - P_{10}$
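A brief Python sketch of these statistics, assuming the standard definitions (IQR = Q3 - Q1, semi-interquartile range = IQR / 2, midquartile = (Q3 + Q1) / 2, 10–90 range = P90 - P10). The function names are hypothetical; the demonstration uses Q1 = 25 and Q3 = 50 minutes from the wait-time summary in these notes. P10 and P90 are not given for that data set, so the last helper is shown without a worked value.

```python
def quartile_stats(q1, q3):
    """Spread and center measures built from the first and third quartiles."""
    iqr = q3 - q1
    return {
        "iqr": iqr,                     # interquartile range
        "semi_iqr": iqr / 2,            # semi-interquartile range
        "midquartile": (q3 + q1) / 2,   # midpoint of Q1 and Q3
    }

def percentile_range_10_90(p10, p90):
    """10-90 percentile range."""
    return p90 - p10

stats = quartile_stats(25, 50)
print(stats)
```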
5-Number Summary
Definition
The 5-number summary for a data set consists of:
Minimum
First quartile ($Q_1$)
Second quartile ($Q_2$, median)
Third quartile ($Q_3$)
Maximum
Example: Finding a 5-Number Summary
| Minimum | Q1 | Median (Q2) | Q3 | Maximum |
|---|---|---|---|---|
| 10 | 25 | 35 | 50 | 110 |
All values are in minutes (from the "Space Mountain" wait times example).
Boxplot (Box-and-Whisker Diagram)
Definition
A boxplot (or box-and-whisker diagram) is a graphical representation of a data set, showing the minimum, $Q_1$, median ($Q_2$), $Q_3$, and maximum. It helps visualize the spread and skewness of data.
Procedure for Constructing a Boxplot
Find the 5-number summary.
Draw a line from the minimum to the maximum value.
Draw a box from $Q_1$ to $Q_3$, with a line at the median ($Q_2$).
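As a rough illustration of the procedure, the following Python sketch renders a one-line text boxplot from a 5-number summary. It is not a plotting-library API: the function `ascii_boxplot` and its character choices are hypothetical, and positions are scaled with integer floor division, so they are approximate.

```python
def ascii_boxplot(minimum, q1, median, q3, maximum, width=51):
    """Render a 5-number summary as a single line of text.

    '-' is the line from minimum to maximum, '[' and ']' mark Q1 and Q3,
    '=' fills the box, and 'M' marks the median. Requires maximum > minimum.
    """
    span = maximum - minimum

    def col(value):
        # Map a data value to a column index between 0 and width - 1.
        return int((value - minimum) * (width - 1) // span)

    row = ["-"] * width                      # step 2: line from min to max
    for i in range(col(q1), col(q3) + 1):    # step 3: box from Q1 to Q3
        row[i] = "="
    row[0] = row[-1] = "|"
    row[col(q1)], row[col(q3)] = "[", "]"
    row[col(median)] = "M"                   # line at the median
    return "".join(row)

# 5-number summary from the wait-time example (minutes).
line = ascii_boxplot(10, 25, 35, 50, 110)
print(line)
```

For this summary the box sits well to the left of center and the right whisker is much longer than the left one, which is the visual signature of right skew.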
Additional info:
Boxplots are useful for identifying skewness and outliers.
Skewness is present if the boxplot is not symmetric and extends more to one side.
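Outliers can be identified using modified boxplots, where values more than $1.5 \times \text{IQR}$ below $Q_1$ or above $Q_3$ (that is, below $Q_1 - 1.5 \times \text{IQR}$ or above $Q_3 + 1.5 \times \text{IQR}$) are marked as outliers.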
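The outlier rule can be sketched in Python as follows, assuming the usual 1.5 × IQR criterion; the function names are hypothetical. The demonstration uses the wait-time summary from these notes (Q1 = 25, Q3 = 50, minimum 10, maximum 110).

```python
def outlier_fences(q1, q3):
    """Lower and upper cutoffs: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def is_outlier(x, q1, q3):
    """True if x falls outside the fences."""
    low, high = outlier_fences(q1, q3)
    return x < low or x > high

low, high = outlier_fences(25, 50)
print(low, high)                 # -12.5 87.5
print(is_outlier(10, 25, 50))    # the minimum lies inside the fences
print(is_outlier(110, 25, 50))   # the maximum lies above the upper fence
```

Under this criterion the maximum of 110 minutes would be plotted as a separate point in a modified boxplot, while the minimum of 10 minutes would not.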