Exploring Data with Tables and Graphs: Elementary Statistics Study Notes
Chapter 2: Exploring Data with Tables and Graphs
Introduction
This chapter introduces essential graphical and tabular methods for organizing, summarizing, and interpreting statistical data. Understanding these visual tools is crucial for effective data analysis and communication in statistics.
Frequency Distributions
Definition and Purpose
A frequency distribution is a table that lists classes (intervals or categories) of data together with the frequency (count) of data values in each class.
It helps organize large data sets and reveals patterns such as central tendency, spread, and shape.
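As a rough illustration of how such a table is built, the sketch below counts integer values into equal-width classes. The function name `frequency_distribution`, its parameters, and the pulse-rate values are hypothetical and chosen only for this example; they do not come from the chapter's data.

```python
from collections import Counter

def frequency_distribution(values, lower, width, n_classes):
    """Count integer values into classes [start, start + width - 1]."""
    counts = Counter()
    for v in values:
        idx = (v - lower) // width          # index of the class containing v
        if 0 <= idx < n_classes:            # ignore values outside all classes
            counts[idx] += 1
    table = []
    for i in range(n_classes):
        lo = lower + i * width              # lower class limit
        hi = lo + width - 1                 # upper class limit
        table.append((lo, hi, counts[i]))
    return table

# Hypothetical pulse rates (beats per minute), invented for illustration.
pulse_rates = [62, 64, 68, 70, 70, 72, 74, 76, 78, 80, 84, 88, 96]
freq_table = frequency_distribution(pulse_rates, lower=60, width=10, n_classes=4)
for lo, hi, f in freq_table:
    print(f"{lo}-{hi}: {f}")
```

The frequencies add up to the number of data values whenever every value falls inside one of the classes, which is a quick check on a hand-built table.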
Graphs that Enlighten: Dotplots
Dotplots
A dotplot is a graph of quantitative data in which each data value is plotted as a point (dot) above a horizontal scale of values.
Dots representing equal values are stacked vertically.
Features of a Dotplot
Displays the shape of the distribution of data.
It is usually possible to recreate the original list of data values from the plot.
Example
A dotplot of pulse rates for males shows the distribution and frequency of different pulse rate values.
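The stacking rule can be sketched as a small text-based dotplot. This is only an illustration under assumptions: the helper `text_dotplot` and the pulse values are hypothetical, and the values are assumed to be integers.

```python
from collections import Counter

def text_dotplot(values):
    """Return the rows of a text dotplot (top row first) and the scale ends."""
    counts = Counter(values)
    lo, hi = min(values), max(values)
    tallest = max(counts.values())
    rows = []
    for level in range(tallest, 0, -1):
        # One column per integer on the scale; draw a dot if the stack reaches this level.
        row = "".join("o" if counts[v] >= level else " " for v in range(lo, hi + 1))
        rows.append(row.rstrip())
    return rows, lo, hi

# Hypothetical pulse rates, invented for illustration.
pulse = [60, 62, 62, 64, 64, 64, 66, 68]
rows, lo, hi = text_dotplot(pulse)
print("\n".join(rows))
print(f"scale: {lo} to {hi}")

# The stack heights are enough to rebuild the sorted list of data values.
recovered = sorted(Counter(pulse).elements())
```

Because each dot stands for one value, reading the stack heights off the plot recovers the original data (up to order), which is the second feature listed above.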
Graphs that Enlighten: Stemplots
Stemplots (Stem-and-Leaf Plots)
A stemplot (or stem-and-leaf plot) represents quantitative data by separating each value into two parts: the stem (such as the leftmost digit) and the leaf (such as the rightmost digit).
Allows for quick visualization of data distribution and retention of original data values.
Features of a Stemplot
Shows the shape of the distribution of the data.
Retains the original data values.
The sample data are sorted (arranged in order).
Example
A stemplot of pulse rates might use the tens digits as stems and the units digits as leaves. For example, a row with stem 6 and leaves 0 2 4 6 represents the values 60, 62, 64, and 66.
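The split into stems and leaves can be sketched in a few lines. The function `stemplot` and the data are hypothetical, and the sketch assumes non-negative integers with the tens digit(s) as the stem and the units digit as the leaf.

```python
from collections import defaultdict

def stemplot(values):
    """Map each stem (value // 10) to its sorted leaves (value % 10)."""
    stems = defaultdict(list)
    for v in sorted(values):                # sorting first keeps leaves in order
        stems[v // 10].append(v % 10)
    return dict(stems)

# Hypothetical pulse rates, invented for illustration.
pulse = [72, 60, 84, 64, 96, 62, 78, 70, 88, 66]
plot = stemplot(pulse)
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")

# Stems and leaves together retain the original values.
rebuilt = [stem * 10 + leaf for stem, leaves in plot.items() for leaf in leaves]
```

Printing every stem in the range, including stems with no leaves, keeps the vertical scale even so the shape of the distribution is not distorted.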
Time-Series Graphs
Definition and Application
A time-series graph is a graph of time-series data, which are quantitative values collected at different points in time (e.g., monthly or yearly).
Used to analyze trends, cycles, and patterns over time.
Feature of a Time-Series Graph
Reveals information about trends over time, such as increases, decreases, or periodic fluctuations.
Example
A time-series graph of law enforcement fatalities from 1985 to 2015 shows how the number changes year by year.
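One simple way to read a trend numerically is to compute the change from each time point to the next. The sketch below uses hypothetical yearly counts (they are not the law enforcement figures mentioned above), and the helper name `year_over_year_changes` is an assumption for illustration.

```python
def year_over_year_changes(series):
    """Given (year, value) pairs, return (year, change from the previous year)."""
    ordered = sorted(series)                # time-series data must be in time order
    return [(y2, v2 - v1) for (y1, v1), (y2, v2) in zip(ordered, ordered[1:])]

# Hypothetical yearly counts, invented for illustration.
counts_by_year = [(2011, 40), (2012, 46), (2013, 43), (2014, 43), (2015, 50)]
changes = year_over_year_changes(counts_by_year)
for year, change in changes:
    print(f"{year}: {change:+d}")
```

Positive changes mark increases, negative changes mark decreases, and a repeating sign pattern can hint at a cycle.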
Bar Graphs
Definition and Features
A bar graph uses bars of equal width to show frequencies of categories of categorical (qualitative) data.
Bars may or may not be separated by small gaps.
Feature of a Bar Graph
Shows the relative distribution of categorical data, making it easier to compare different categories.
Pareto Charts
Definition and Features
A Pareto chart is a bar graph for categorical data, with bars arranged in descending order according to frequencies.
Bars decrease in height from left to right, highlighting the most significant categories.
Features of a Pareto Chart
Shows the relative distribution of categorical data for easy comparison.
Draws attention to the more important categories.
Example Table: Causes of Fatal Plane Crashes (Pareto Chart)
| Cause | Frequency |
|---|---|
| Pilot Error | Highest |
| Mechanical Sabotage | Medium |
| Weather | Lower |
| Other | Lowest |
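The only step that distinguishes a Pareto chart from an ordinary bar graph is the ordering of the bars. A minimal sketch, assuming hypothetical category labels and counts (the table above gives no numeric frequencies):

```python
def pareto_order(freqs):
    """Return (category, frequency) pairs sorted from most to least frequent."""
    return sorted(freqs.items(), key=lambda item: item[1], reverse=True)

# Hypothetical categories and counts, invented for illustration.
counts = {"A": 12, "B": 50, "C": 8, "D": 30}
ordered = pareto_order(counts)
for category, freq in ordered:
    # Text bars: one '#' per two observations.
    print(f"{category:<3}{'#' * (freq // 2)} {freq}")
```

Plotting the categories in this order puts the tallest bar at the left, so the most frequent categories are seen first.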
Pie Charts
Definition and Features
A pie chart depicts categorical data as slices of a circle, with the size of each slice proportional to the frequency count for the category.
Feature of a Pie Chart
Shows the distribution of categorical data in a commonly used, visually intuitive format.
Example Table: Causes of Fatal Plane Crashes (Pie Chart)
| Cause | Proportion |
|---|---|
| Pilot Error | Largest slice |
| Mechanical Sabotage | Medium slice |
| Weather | Smaller slice |
| Other | Smallest slice |
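Since each slice is proportional to its category's frequency, the central angle of a slice is 360 degrees times the category's share of the total. A short sketch with hypothetical labels and counts (the table above gives no numeric values):

```python
def slice_angles(freqs):
    """Central angle in degrees for each category: 360 * frequency / total."""
    total = sum(freqs.values())
    return {category: 360 * f / total for category, f in freqs.items()}

# Hypothetical categories and counts, invented for illustration.
counts = {"A": 12, "B": 50, "C": 8, "D": 30}
angles = slice_angles(counts)
for category, angle in angles.items():
    print(f"{category}: {angle:.1f} degrees")
```

The angles always sum to 360 degrees, which is a convenient check when drawing a pie chart by hand.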
Frequency Polygons
Definition and Features
A frequency polygon is a graph using line segments connected to points located directly above class midpoint values.
Similar to a histogram, but uses line segments instead of bars.
Relative Frequency Polygon
A variation that uses relative frequencies (proportions or percentages) for the vertical scale.
Example Table: Frequency Polygon of Commute Times
| Commute Time (min) | Frequency |
|---|---|
| 0-9 | 2 |
| 10-19 | 5 |
| 20-29 | 7 |
| 30-39 | 3 |
Additional info: Table values inferred for illustration.
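The points of a frequency polygon can be computed directly from a frequency table. The sketch below uses the illustrative commute-time frequencies and assumes the classes are read as the non-overlapping intervals 0-9, 10-19, 20-29, and 30-39; the variable names are chosen for this example only.

```python
# Class limits (assumed non-overlapping) and the illustrative frequencies.
classes = [(0, 9), (10, 19), (20, 29), (30, 39)]
frequencies = [2, 5, 7, 3]

# Each point sits above the class midpoint: (lower limit + upper limit) / 2.
midpoints = [(lo + hi) / 2 for lo, hi in classes]

# A relative frequency polygon uses frequency / total on the vertical scale.
total = sum(frequencies)
relative = [f / total for f in frequencies]

polygon_points = list(zip(midpoints, frequencies))
for (x, f), r in zip(polygon_points, relative):
    print(f"midpoint {x}: frequency {f}, relative frequency {r:.3f}")
```

Connecting consecutive points with line segments gives the polygon; the relative frequency version has the same shape, only with a rescaled vertical axis.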
Graphs That Deceive
Nonzero Vertical Axis
Using a vertical scale that starts at a value greater than zero can exaggerate differences between groups.
Always examine a graph carefully to see whether the vertical axis begins at zero; otherwise, differences may be misleading.
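The size of the distortion can be quantified by comparing the ratio of the drawn bar heights with the ratio of the actual values. The numbers and the helper `apparent_ratio` below are hypothetical, chosen only to show the arithmetic.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of drawn bar heights when the vertical axis starts at `baseline`."""
    return (a - baseline) / (b - baseline)

# Hypothetical values for two groups, invented for illustration.
true_ratio = apparent_ratio(80, 70)                  # axis starts at 0
drawn_ratio = apparent_ratio(80, 70, baseline=60)    # axis starts at 60

print(f"true ratio:  {true_ratio:.2f}")
print(f"drawn ratio: {drawn_ratio:.2f}")
```

With the axis starting at 60, the first bar is drawn twice as tall as the second even though the underlying value is only about 14% larger.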
Pictographs
Pictographs use drawings of objects to represent data, which can be misleading if the data are one-dimensional but depicted with two- or three-dimensional objects.
Doubling the sides of a square increases its area by a factor of four, not two; doubling the sides of a cube increases its volume by a factor of eight, not two.
Such representations can grossly distort differences in the data.
Example Table: Pictograph of NSA Collected Phone Records
| Year | Records Collected (millions) |
|---|---|
| Year 1 | 151 |
| Year 2 | 534 |
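The scaling argument can be checked with the two values in the table. The sketch assumes a pictograph in which every side of the drawn object is scaled by the ratio of the data values; the variable names are chosen for this example.

```python
small, large = 151, 534      # records collected (millions), from the table above

linear = large / small       # true one-dimensional ratio of the data
area = linear ** 2           # apparent ratio if width and height are both scaled
volume = linear ** 3         # apparent ratio if a 3-D object is scaled on all sides

print(f"data ratio:          {linear:.2f}")
print(f"area ratio (2-D):    {area:.2f}")
print(f"volume ratio (3-D):  {volume:.2f}")
```

The data differ by a factor of roughly 3.5, but a two-dimensional picture scaled this way looks roughly 12.5 times larger and a three-dimensional one roughly 44 times larger.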
Best Practices for Graphical Display
Principles for Effective Graphs
For small data sets (20 values or fewer), use a table instead of a graph.
A graph should focus on the true nature of the data, not on distracting design features.
Do not distort data; construct graphs to reveal the true nature of the data.
Most of the ink in a graph should be used for the data, not for other design elements.
Additional info: Principles adapted from Edward Tufte's guidelines for data visualization.
Summary Table: Types of Graphs and Their Uses
| Graph Type | Data Type | Main Use |
|---|---|---|
| Dotplot | Quantitative | Distribution shape, individual values |
| Stemplot | Quantitative | Distribution shape, retains data |
| Time-Series Graph | Quantitative (over time) | Trends and patterns |
| Bar Graph | Categorical | Compare categories |
| Pareto Chart | Categorical | Highlight most important categories |
| Pie Chart | Categorical | Show proportions |
| Frequency Polygon | Quantitative | Distribution shape |
Key Formulas
Relative Frequency
Relative frequency of a class: $\text{relative frequency} = \dfrac{\text{frequency of the class}}{\text{sum of all frequencies}}$
Class Midpoint
Class midpoint: $\text{class midpoint} = \dfrac{\text{lower class limit} + \text{upper class limit}}{2}$
Conclusion
Effective use of tables and graphs is fundamental in statistics for summarizing, analyzing, and communicating data. Understanding both enlightening and deceptive graphical techniques is essential for accurate interpretation and presentation of statistical information.