Chapter 3: Displaying and Summarizing Quantitative Data – Study Notes

Displaying and Summarizing Quantitative Data

Displays for Quantitative Variables

Quantitative data are best understood through graphical displays that reveal their distribution, shape, and key features. Unlike categorical data, which use bar charts or pie charts, quantitative data require specialized displays such as histograms, stem-and-leaf plots, and dotplots.

  • Histogram: Divides the range of data into equal-width bins and plots the frequency of cases in each bin. Useful for visualizing the distribution and identifying patterns such as central tendency and spread.

  • Stem-and-Leaf Display: Shows individual data values while also revealing the distribution. Each value is split into a "stem" (leading digits) and a "leaf" (trailing digit).

  • Dotplot: Places a dot for each data point along an axis, providing a simple visual summary.

  • Quantitative Data Condition: Always ensure the data are quantitative and units are known before choosing a display.

Histogram of earthquake magnitudes
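As a rough illustration of how the first two displays are built, the sketch below counts values into equal-width bins and splits values into stems and leaves. The pulse-rate data, the bin width, and the function names are assumptions made for this example; they do not come from the chapter.

```python
from collections import Counter, defaultdict


def histogram_counts(values, bin_width, start):
    """Count the values in each equal-width bin [lo, lo + bin_width)."""
    counts = Counter()
    for v in values:
        counts[int((v - start) // bin_width)] += 1
    # Include empty bins between the lowest and highest so that gaps stay visible.
    return {start + i * bin_width: counts[i]
            for i in range(min(counts), max(counts) + 1)}


def stem_and_leaf(values):
    """Split each two-digit integer into a stem (tens digit) and a leaf (ones digit)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(stems)


# Hypothetical data: 12 pulse rates in beats per minute.
pulse = [56, 60, 64, 64, 68, 68, 72, 72, 72, 76, 80, 88]

bins = histogram_counts(pulse, bin_width=10, start=50)
for lo, count in bins.items():
    print(f"{lo}-{lo + 10}: {'#' * count}")

for stem, leaves in stem_and_leaf(pulse).items():
    print(f"{stem} | {''.join(str(leaf) for leaf in leaves)}")
```

Changing `bin_width` changes how the histogram looks, so it is worth trying more than one width before describing the shape.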

Shape of Distributions

The shape of a distribution provides insight into the underlying data structure. Key aspects include the number of modes, symmetry, skewness, and the presence of unusual features.

  • Modes: Unimodal (one peak), bimodal (two peaks), or multimodal (three or more peaks).

  • Uniform: All bars are approximately the same height, indicating no clear mode.

  • Symmetry: A symmetric histogram can be folded along its center with matching edges.

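  • Skewness: If one tail stretches out farther than the other, the distribution is skewed toward the longer tail: skewed right if the long tail extends toward higher values, skewed left if it extends toward lower values.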

  • Outliers and Gaps: Outliers are values far from the main body; gaps may indicate multiple groups.

Centre of a Distribution

The centre of a distribution is typically measured by the median or the mean. The median divides the ordered data into two equal halves, while the mean is the arithmetic average.

  • Median: The middle value when data are ordered. For odd $n$, it is the value at position $(n + 1)/2$. For even $n$, it is the average of the two middle values, at positions $n/2$ and $n/2 + 1$.

  • Mean: Calculated as $\bar{y} = \frac{\sum y_i}{n}$, where $y_i$ are the data values and $n$ is the number of values.

  • Choosing Centre: Use the mean for symmetric distributions; use the median for skewed distributions or when outliers are present.

Histogram showing median
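The two measures of centre can be computed directly from their definitions. The sketch below is a minimal illustration; the data values are made up for the example, and the last list adds a single outlier to show why the median is preferred when outliers are present.

```python
def mean(values):
    """Arithmetic average: the sum of the values divided by how many there are."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # position (n + 1) / 2 when counting from 1
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical data sets, not taken from the chapter.
odd_n = [2, 3, 5, 7, 11]
even_n = [2, 3, 5, 7, 11, 13]
with_outlier = [2, 3, 5, 7, 110]

print(median(odd_n), mean(odd_n))                # 5 5.6
print(median(even_n))                            # 6.0
print(median(with_outlier), mean(with_outlier))  # 5 25.4
```

Replacing 11 with 110 leaves the median at 5 but moves the mean from 5.6 to 25.4, which is the behaviour the "Choosing Centre" guideline is guarding against.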

Spread of a Distribution

Spread describes how much the data values vary. Common measures include the range, interquartile range (IQR), and standard deviation.

  • Range: Difference between maximum and minimum values.

  • Interquartile Range (IQR): The range of the middle 50% of the data, computed as $IQR = Q_3 - Q_1$.

  • Quartiles: $Q_1$ (25th percentile), $Q_2$ (median, 50th percentile), $Q_3$ (75th percentile)

  • Percentiles: The value below which a given percentage of data falls.

Histogram showing IQR
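A short sketch of these measures follows. Quartile conventions differ between textbooks and software; this example assumes one common by-hand convention (split the ordered data at the median, leave the median out of both halves when n is odd, and take the median of each half). The data are hypothetical.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention.

    Assumption: when n is odd, the median is excluded from both halves.
    Statistical software may use a different rule and give slightly different values.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(lower), median(ordered), median(upper)


# Hypothetical data, not taken from the chapter.
data = [3, 5, 7, 8, 12, 13, 14, 18, 21]

q1, q2, q3 = quartiles(data)
data_range = max(data) - min(data)
iqr = q3 - q1

print(q1, q2, q3)       # 6.0 12 16.0
print(data_range, iqr)  # 18 10.0
```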

Boxplots and 5-Number Summaries

The five-number summary consists of the minimum, $Q_1$, median, $Q_3$, and maximum. A boxplot graphically displays these values and highlights outliers.

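  • Boxplot Construction: Draw a box from $Q_1$ to $Q_3$, mark the median inside the box, and extend each whisker to the most extreme value that lies within 1.5 IQRs of the box, that is, inside the fences at $Q_1 - 1.5 \times IQR$ and $Q_3 + 1.5 \times IQR$. Plot any value beyond the fences individually as an outlier.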

  • Comparisons: Boxplots are useful for comparing distributions across groups.

Boxplot and histogram of earthquake magnitudes
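The numbers behind a boxplot can be computed without drawing anything. The sketch below is an illustration under assumptions: it uses the median-of-halves quartile convention (median excluded from both halves when n is odd), and the data set is hypothetical.

```python
from statistics import median


def boxplot_stats(values):
    """Five-number summary, fences, whisker ends, and outliers for a boxplot."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    q1, q3 = median(lower), median(upper)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in ordered if low_fence <= v <= high_fence]
    return {
        "five_number": (ordered[0], q1, median(ordered), q3, ordered[-1]),
        "fences": (low_fence, high_fence),
        "whiskers": (inside[0], inside[-1]),  # most extreme values inside the fences
        "outliers": [v for v in ordered if v < low_fence or v > high_fence],
    }


# Hypothetical data, not taken from the chapter.
data = [3, 5, 7, 8, 12, 13, 14, 18, 21, 45]

stats = boxplot_stats(data)
print(stats["five_number"])  # (3, 7, 12.5, 18, 45)
print(stats["fences"])       # (-9.5, 34.5)
print(stats["whiskers"])     # (3, 21)
print(stats["outliers"])     # [45]
```

The upper whisker stops at 21 rather than at the upper fence of 34.5, because whiskers end at actual data values.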

The Mean and Symmetric Distributions

For symmetric, unimodal distributions, the mean is a reliable measure of centre. The mean is also the balancing point of the histogram.

  • Mean Formula: $\bar{y} = \frac{\sum y_i}{n}$

  • Balancing Point: The mean is the point at which the histogram would balance if placed on a fulcrum.

  • Mean vs. Median: In symmetric distributions, mean ≈ median. In skewed distributions, the mean is pulled toward the tail.

Histogram and boxplot showing mean as balancing point
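One way to see the balancing-point idea is that the deviations from the mean always sum to zero: the distances below the mean cancel the distances above it. The sketch below checks this on a hypothetical right-skewed data set and also shows the mean being pulled toward the long tail.

```python
from statistics import mean, median

# Hypothetical right-skewed data (not from the chapter): one large value forms the tail.
values = [1, 2, 2, 3, 3, 3, 4, 20]

y_bar = mean(values)
deviations = [y - y_bar for y in values]
balance = sum(deviations)  # negative and positive deviations cancel

print(y_bar)           # 4.75
print(median(values))  # 3.0
print(balance)         # 0.0
```

The mean (4.75) sits above the median (3.0) because the single value of 20 drags it toward the right tail.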

Standard Deviation and Spread

The standard deviation measures the typical distance of data values from the mean. Because it takes every value into account, it is a more comprehensive measure of spread than the range or IQR, and it is most appropriate for symmetric distributions.

  • Variance: $s^2 = \frac{\sum (y_i - \bar{y})^2}{n - 1}$

  • Standard Deviation: $s = \sqrt{\frac{\sum (y_i - \bar{y})^2}{n - 1}}$

  • Empirical Rule: For unimodal, symmetric (bell-shaped) distributions:

    • ~68% of data within 1 standard deviation of the mean

    • ~95% within 2 standard deviations

    • ~99.7% within 3 standard deviations

  • Effect of Outliers: Outliers increase the standard deviation but may not substantially affect the IQR.
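The sketch below computes the sample variance and standard deviation with the $n - 1$ divisor and then compares how one outlier affects the standard deviation and the IQR. The data sets, the helper names, and the median-of-halves quartile convention are assumptions made for illustration.

```python
from math import sqrt
from statistics import median


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    y_bar = sum(values) / n
    return sum((y - y_bar) ** 2 for y in values) / (n - 1)


def sample_sd(values):
    """Standard deviation: the square root of the sample variance."""
    return sqrt(sample_variance(values))


def iqr(values):
    """Q3 - Q1 using medians of the lower and upper halves (one common convention)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if len(ordered) % 2 == 1 else ordered[half:]
    return median(upper) - median(lower)


def fraction_within(values, k):
    """Fraction of values within k standard deviations of the mean."""
    y_bar = sum(values) / len(values)
    s = sample_sd(values)
    return sum(abs(y - y_bar) <= k * s for y in values) / len(values)


# Hypothetical data: the second list replaces the largest value with an outlier.
base = [2, 4, 4, 4, 5, 5, 7, 9]
with_outlier = [2, 4, 4, 4, 5, 5, 7, 90]

print(sample_variance(base))                     # 32 / 7, about 4.57
print(sample_sd(base), sample_sd(with_outlier))  # the outlier inflates the SD
print(iqr(base), iqr(with_outlier))              # 2.0 2.0
print(fraction_within(base, 1))                  # 0.75
```

With only eight values the fraction within one standard deviation (0.75) need not match the 68% of the Empirical Rule, which describes large, bell-shaped data sets.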

Summary: What to Tell About a Quantitative Variable

When describing a quantitative variable, always provide:

  • Shape: Unimodal, bimodal, symmetric, skewed, uniform

  • Centre: Mean or median, depending on distribution shape

  • Spread: Range, IQR, and/or standard deviation

  • Unusual Features: Outliers, gaps, multiple modes

Choose the appropriate summary statistics based on the distribution's shape and presence of outliers.

Examples and Applications

Examples throughout the chapter illustrate how to compute and interpret these statistics for real-world data sets, such as earthquake magnitudes, sports statistics, and bird counts.

Histogram of bird species counts in StatCrunch

