Graphical Displays and Tables: Describing Data Visually
Graphical Displays and Tables
1. Categorical Variables
Categorical variables represent data that can be divided into groups or categories. These variables are summarized and displayed using various graphical and tabular methods.
Frequency Tables: Show the count of observations in each category.
Bar Charts: Use rectangles to represent frequencies (bars do not touch).
Pie Charts: Slices are proportional to frequencies.
Frequency (count): The number of observations in each category.
| Level | Frequency | Relative Frequency |
|---|---|---|
| Freshman | 5 | 0.26 |
| Sophomore | 6 | 0.32 |
| Junior | 3 | 0.16 |
| Senior | 5 | 0.26 |
Sum of frequencies = sample size
Relative frequency / proportion / percent: The frequency of a category divided by the sample size, so it tells how large the category is relative to the total. Sum of relative frequencies = 1 (or 100%).
Example: If there are 19 students, and 5 are freshmen, the relative frequency for freshmen is 5/19 ≈ 0.26 (about 26%).
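The frequency table above can be reproduced with a few lines of Python. This is a minimal sketch: the list `levels` is rebuilt from the counts in the table, and the helper name `frequency_table` is hypothetical, chosen only for illustration.

```python
from collections import Counter

# Class levels of the 19 students, rebuilt from the counts in the table.
levels = ["Freshman"] * 5 + ["Sophomore"] * 6 + ["Junior"] * 3 + ["Senior"] * 5


def frequency_table(observations):
    """Return {category: (frequency, relative frequency)}."""
    counts = Counter(observations)
    n = len(observations)  # sample size = sum of all frequencies
    return {category: (count, count / n) for category, count in counts.items()}


table = frequency_table(levels)
for category, (freq, rel_freq) in table.items():
    print(f"{category:<10} {freq:>3} {rel_freq:.2f}")

# Sanity checks that match the rules stated in the notes.
total_frequency = sum(freq for freq, _ in table.values())
total_relative = sum(rel for _, rel in table.values())
```

Here `total_frequency` equals the sample size (19), and `total_relative` equals 1 up to floating-point rounding.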
2. Quantitative Variables
Quantitative variables are numerical and can be displayed using histograms.
Histogram
Choose bins (intervals or classes) that cover the entire range of data.
Count the number of observations in each interval (frequency).
Draw rectangles: Height = frequency of the interval. For a relative frequency histogram, height = relative frequency of the interval.
Understanding Classes (Bins) and Boundaries:
A class (bin) is a range of values used to group data.
Classes must be written clearly to avoid overlap or ambiguity.
| Recommended Format |
|---|
| a to less than b |
| 0 to less than 4 |
| 4 to less than 8 |
| 8 to less than 12 |
The lower boundary is included; the upper boundary is excluded.
Placing Observations on Boundaries: An observation equal to a class boundary is placed in the higher class. This applies to whole numbers and decimals.
Example:
3.99 → 0 to < 4
4.00 → 4 to < 8
7.99 → 4 to < 8
8.00 → 8 to < 12
This ensures:
No overlap between classes
Every observation belongs to exactly one class
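The boundary rule can be sketched in Python as follows. The function names `class_index`, `class_label`, and `histogram_counts` are hypothetical, and the sketch assumes equal-width classes that start at a known value `start`. The four data values are the ones from the example above.

```python
import math


def class_index(x, start, width):
    """Index i of the class [start + i*width, start + (i+1)*width) that contains x.

    The lower boundary is included and the upper boundary is excluded,
    so a value equal to a boundary lands in the higher class.
    """
    return math.floor((x - start) / width)


def class_label(x, start, width):
    """Label of the class containing x, in the 'a to < b' format."""
    lower = start + class_index(x, start, width) * width
    return f"{lower} to < {lower + width}"


def histogram_counts(data, start, width, n_classes):
    """Frequency of each class; these are the bar heights of a frequency histogram."""
    counts = [0] * n_classes
    for x in data:
        counts[class_index(x, start, width)] += 1
    return counts


values = [3.99, 4.00, 7.99, 8.00]
for value in values:
    print(value, "->", class_label(value, start=0, width=4))

counts = histogram_counts(values, start=0, width=4, n_classes=3)
relative = [c / len(values) for c in counts]  # heights of a relative frequency histogram
```

Because each value maps to exactly one index, no observation can be counted twice or left out, provided the classes cover the whole range of the data.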
Notes on Histogram Intervals
Intervals must be non-overlapping and contiguous (rectangles touch each other).
Intervals must have equal width (e.g., every interval has width 5).
Choose "nice" boundaries (e.g., ending in 5 or 0).
Can represent frequency or relative frequency.
Choosing Bin Size / Number of Bins
The goal is to summarize the data well and reveal important features of the distribution.
Too few bins: "pancake" graph (too flat, hides structure).
Too many bins: "skyscraper" graph (too spiky, noisy).
Typical number of bins: 5–20
Simplified 2 to the k Rule (easy version):
Find the smallest k such that 2^k ≥ n, where n = sample size.
Use roughly k bins for the histogram.
Example: n = 19 observations. 2^4 = 16 < 19 (too small); 2^5 = 32 ≥ 19 (okay), so use 5 bins.
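The 2 to the k rule is easy to automate. The sketch below assumes the rule is read as "smallest k with 2^k ≥ n"; some textbooks state it with a strict inequality, which only changes the answer when n is exactly a power of 2. The function name `bins_by_2k_rule` is hypothetical.

```python
def bins_by_2k_rule(n):
    """Smallest k such that 2**k >= n (assumed reading of the rule)."""
    if n < 1:
        raise ValueError("sample size n must be at least 1")
    k = 0
    while 2 ** k < n:  # 2**k is still too small, so try the next k
        k += 1
    return k


suggested_bins = bins_by_2k_rule(19)
print(suggested_bins)
```

For n = 19 this suggests 5 bins, matching the worked example. Treat the result as a starting point and adjust it if the histogram looks too flat or too spiky.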
3. Describing a Histogram
Shape
Modes: Peaks of the distribution (unimodal = 1 peak, bimodal = 2 peaks, multimodal = 3+ peaks)
Symmetry: Right and left sides approximately mirror each other
Skewness: Right-skewed (long tail to the right), Left-skewed (long tail to the left)
Bell-shaped: Special case of symmetric distribution
Location
Measure of central tendency (mean, median)
Where is the "middle" of the sample?
Can be difficult to estimate if distribution is skewed
Spread
Range: all observations fall between a (the smallest value) and b (the largest value); the range is b − a
Look for outliers (unusually high or low values)
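Location and spread can also be checked numerically. The sketch below uses Python's standard `statistics` module on a small made-up sample (hypothetical data, not from the notes) in which one unusually high value creates a long right tail.

```python
import statistics

# Hypothetical right-skewed sample: most values are small, one is unusually large.
sample = [2, 3, 3, 4, 4, 5, 6, 20]

mean = statistics.mean(sample)      # pulled toward the long right tail
median = statistics.median(sample)  # middle value, less affected by the tail

smallest, largest = min(sample), max(sample)
data_range = largest - smallest     # spread: distance from a to b

print(f"mean = {mean}, median = {median}")
print(f"observations fall between {smallest} and {largest} (range = {data_range})")
```

In this sample the mean is larger than the median because the single high value pulls it to the right, which illustrates why the "middle" is harder to pin down for a skewed distribution.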
4. Shape of Distributions
Unimodal / Multimodal: Histogram has one or multiple peaks
Symmetric: Right half mirrors left half
Bell-shaped: Special symmetric distribution
Right-skewed: Long tail to the right
Left-skewed: Long tail to the left
5. Key Takeaways (Homework / Quiz / Test Preparation)
Categorical data:
Know the difference between frequency and relative frequency
Sum of frequencies = sample size; sum of relative frequencies = 1
Recognize when to use bar charts vs. pie charts
Quantitative data:
Understand histogram construction and rules for intervals/bins
Be able to describe histogram shape, location, and spread (including symmetry, skewness, and modes)
Bins:
Number of bins usually between 5–20
"Nice" boundaries for clarity
Apply the 2^k rule (2 to the k rule) to estimate the number of bins
General:
Label every graph and its axes
Interpret both count and relative frequency histograms
Watch for outliers and unusual observations