Descriptive Statistics: Graphical Representation of Data
Study Guide - Smart Notes
Descriptive Statistics
Overview
Descriptive statistics involves methods for organizing, displaying, and summarizing data. This section focuses on graphical techniques for representing both quantitative and qualitative data, as well as paired data sets.
Quantitative data: Numerical values that can be measured or counted.
Qualitative data: Categorical data representing characteristics or attributes.
Paired data: Two related sets of quantitative data in which each entry of one set corresponds to an entry of the other, so the two are analyzed together.
Graphing Quantitative Data Sets
Stem-and-Leaf Plots
A stem-and-leaf plot is a method for displaying quantitative data where each number is separated into a stem (all but the final digit) and a leaf (the final digit). This plot is similar to a histogram but retains the original data values, making it useful for sorting and identifying patterns.
Stem: Represents the leading digit(s) of each data value.
Leaf: Represents the last digit of each data value.
Provides a quick visual of data distribution and retains actual data values.
Example: For the data set: 21, 25, 25, 26, 27, 28, 30, 36, 36, 45, the stem-and-leaf plot would be:
| Stem | Leaf |
|---|---|
| 2 | 1 5 5 6 7 8 |
| 3 | 0 6 6 |
| 4 | 5 |
Additional info: The plot shows that most values cluster in the 20s (six values), with three values in the 30s and a single value in the 40s.
Constructing a Stem-and-Leaf Plot: Step-by-Step
Identify the stem (all but the last digit) and the leaf (last digit) of each data value. For example, 104 has stem 10 and leaf 4.
List stems in a vertical column and write each leaf to the right of its stem.
Example data set (number of text messages sent):
| Number of Text Messages Sent |
|---|
| 49, 104, 59, 88 |
| 75, 109, 68, 81 |
| 80, 78, 69, 55 |
| 114, 98, 73, 18 |
| 84, 46, 52, 25 |
| 26, 33, 25, 20 |
| 24, 43, 17, 49 |
| 32, 29, 29, 40 |
| 33, 30, 41, 35 |
| 36, 54, 30, 148 |
After organizing, the stem-and-leaf plot reveals that half of the users listed here (20 of 40) sent between 20 and 49 messages.
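The construction steps can be sketched in a few lines of Python. This is a minimal illustration, assuming non-negative whole-number data; the names `stem_and_leaf` and `render` are made up for this example and do not come from any library.

```python
from collections import defaultdict

# Text-message counts copied from the table above (40 values).
MESSAGES = [
    49, 104, 59, 88, 75, 109, 68, 81, 80, 78, 69, 55,
    114, 98, 73, 18, 84, 46, 52, 25, 26, 33, 25, 20,
    24, 43, 17, 49, 32, 29, 29, 40, 33, 30, 41, 35,
    36, 54, 30, 148,
]

def stem_and_leaf(values):
    """Map each stem (all but the last digit) to its sorted leaves (last digits)."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # e.g. 104 -> stem 10, leaf 4
        plot[stem].append(leaf)
    return dict(plot)

def render(plot):
    """Return one text row per stem, keeping stems that have no leaves."""
    rows = []
    for stem in range(min(plot), max(plot) + 1):
        leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
        rows.append(f"{stem:>2} | {leaves}".rstrip())
    return "\n".join(rows)

plot = stem_and_leaf(MESSAGES)
print(render(plot))

# Count the users who sent 20 to 49 messages (stems 2, 3, and 4).
in_range = sum(len(plot[s]) for s in (2, 3, 4))
print(in_range, len(MESSAGES))  # 20 40
```

Keeping the empty stems (12 and 13 here) in the output is a deliberate choice: the blank rows make the gap before 148 visible.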
Variations of Stem-and-Leaf Plots
Sometimes, each stem is split into two rows to provide more detail. For example, one row for leaves 0-4 and another for leaves 5-9. This helps to better visualize data distribution within each stem.
First row: leaves 0-4
Second row: leaves 5-9
Additional info: This method is useful for larger data sets with many values per stem.
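As a rough sketch of the split-stem variation, the snippet below places leaves 0-4 on a stem's first row and leaves 5-9 on its second. The function name `split_stem_rows` is hypothetical, and the sketch assumes non-negative whole numbers. It also skips any half-row that has no leaves, which is a simplification.

```python
def split_stem_rows(values):
    """Return (stem, half, leaves) rows; half 0 holds leaves 0-4, half 1 holds 5-9."""
    rows = {}
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        half = 0 if leaf <= 4 else 1
        rows.setdefault((stem, half), []).append(leaf)
    return [(stem, half, leaves) for (stem, half), leaves in sorted(rows.items())]

# Data set from the first stem-and-leaf example in this guide.
DATA = [21, 25, 25, 26, 27, 28, 30, 36, 36, 45]

for stem, half, leaves in split_stem_rows(DATA):
    print(f"{stem} | {' '.join(map(str, leaves))}")
```

With this data, stem 2 splits into a row holding only the leaf 1 and a row holding 5 5 6 7 8, which shows that the values in the 20s sit mostly in the upper half of that range.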
Dot Plots
A dot plot is a simple way to display quantitative data. Each data value is represented by a dot above a number line. Multiple dots stacked above a value indicate frequency.
Easy to construct and interpret for small data sets.
Shows clusters, gaps, and outliers.
Example: For the text-message data set above, most entries fall between 20 and 80 and only four exceed 100. The value 148 sits well apart from the rest of the data, which suggests it is an outlier.
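A text-only dot plot can be sketched by counting how often each value occurs and printing one mark per occurrence. The helper names `dot_plot_rows` and `largest_gap` are invented for this illustration, and the plot is drawn sideways (one value per row) to keep the code short.

```python
from collections import Counter

# Text-message counts from the stem-and-leaf example (40 values).
MESSAGES = [
    49, 104, 59, 88, 75, 109, 68, 81, 80, 78, 69, 55,
    114, 98, 73, 18, 84, 46, 52, 25, 26, 33, 25, 20,
    24, 43, 17, 49, 32, 29, 29, 40, 33, 30, 41, 35,
    36, 54, 30, 148,
]

def dot_plot_rows(values):
    """Return one row per distinct value, with one '*' per occurrence."""
    counts = Counter(values)
    return [f"{value:>3} {'*' * counts[value]}" for value in sorted(counts)]

def largest_gap(values):
    """Return (gap, low, high) for the widest gap between neighbouring values."""
    ordered = sorted(set(values))
    return max((high - low, low, high) for low, high in zip(ordered, ordered[1:]))

print("\n".join(dot_plot_rows(MESSAGES)))
print(largest_gap(MESSAGES))  # (34, 114, 148)
```

The widest gap, between 114 and 148, is one simple way to see why 148 stands out. It is a visual cue rather than a formal outlier test.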
Graphing Qualitative Data Sets
Pie Charts
A pie chart visually represents categorical data as sectors of a circle, with each sector's area proportional to the category's frequency or percentage of the whole.
Useful for showing relative proportions of categories.
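Each sector's central angle is calculated as $\text{central angle} = \dfrac{f}{n} \times 360^\circ$, where $f$ is the category's frequency and $n$ is the sum of all frequencies.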
Example: Degrees conferred in 2019:
| Type of Degree | Number (thousands) |
|---|---|
| Associate's | 1037 |
| Bachelor's | 2013 |
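Central angle for associate's degree: $\dfrac{1037}{n} \times 360^\circ$, where $n$ is the total number of degrees (in thousands) across all categories in the data set.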
Additional info: Pie charts are best for data with few categories.
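The central-angle calculation (relative frequency times 360 degrees) can be sketched as follows. The function name `central_angles` is hypothetical. The example treats the two rows listed in the table as the entire data set, which is an assumption made only for illustration; if the full data set has more categories, the angles would be smaller.

```python
def central_angles(frequencies):
    """Map each category to its central angle in degrees: (f / n) * 360."""
    total = sum(frequencies.values())
    return {name: freq / total * 360 for name, freq in frequencies.items()}

# Only the two rows listed in the table above (thousands of degrees).
# ASSUMPTION: these two categories are treated as the whole data set.
DEGREES = {"Associate's": 1037, "Bachelor's": 2013}

angles = central_angles(DEGREES)
for name, angle in angles.items():
    print(f"{name}: {angle:.1f} degrees")
```

Whatever the categories are, the angles should add up to 360 degrees, which is a quick check on the arithmetic.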
Pareto Charts
A Pareto chart is a vertical bar graph where bars represent frequencies or relative frequencies of categories, arranged in decreasing order from left to right. The tallest bar is on the left.
Highlights the most significant categories.
Useful for identifying major contributors in categorical data.
Example: Leading causes of death in the U.S. (2019):
| Cause | Number of Deaths |
|---|---|
| Heart disease | 659,041 |
| Cancer | 599,601 |
| Accidents | 173,040 |
| Chronic lower respiratory disease | 156,979 |
| Stroke | 150,005 |
The chart shows heart disease as the leading cause, followed by cancer.
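The defining step of a Pareto chart is sorting the categories from largest to smallest frequency. The sketch below does that for the table above and also tracks a cumulative share, which is an optional extra that some Pareto charts include and is not described in these notes. The function name `pareto_order` is hypothetical, and the shares are relative to the five listed causes only, not to all deaths.

```python
# Frequencies copied from the table above.
DEATHS = {
    "Stroke": 150_005,
    "Cancer": 599_601,
    "Accidents": 173_040,
    "Heart disease": 659_041,
    "Chronic lower respiratory disease": 156_979,
}

def pareto_order(frequencies):
    """Return (category, frequency, cumulative share) rows, tallest bar first."""
    total = sum(frequencies.values())
    rows, running = [], 0
    by_size = sorted(frequencies.items(), key=lambda item: item[1], reverse=True)
    for name, freq in by_size:
        running += freq
        rows.append((name, freq, running / total))
    return rows

for name, freq, share in pareto_order(DEATHS):
    print(f"{name:<35} {freq:>8,} {share:6.1%}")
```

The input dictionary is deliberately out of order to show that the sort, not the order of entry, determines the position of each bar.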
Graphing Paired Data Sets
Scatter Plots
A scatter plot displays paired quantitative data as points on a coordinate plane, with each point representing an ordered pair. Scatter plots are used to examine relationships between two variables.
Each axis represents one variable.
Patterns may indicate correlation (positive, negative, or none).
Example: A scatter plot of petal length vs. petal width from Fisher's Iris data set. As petal length increases, petal width tends to increase, indicating a positive correlation.
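One common way to put a number on the direction of a scatter plot's pattern is the Pearson correlation coefficient, which these notes do not cover; it is shown here only as a sketch. The function name `pearson_r` is made up for this example, and the paired values below are hypothetical numbers chosen for illustration, not measurements from the Iris data set.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# HYPOTHETICAL paired data for illustration only (not Iris measurements).
lengths = [1.0, 2.0, 3.0, 4.0, 5.0]
widths = [0.4, 0.9, 1.1, 1.8, 2.1]

r = pearson_r(lengths, widths)
print(f"r = {r:.2f}")  # a value near +1 means a strong positive relationship
```

A value of r above 0 matches an upward-sloping cloud of points, a value below 0 matches a downward-sloping one, and a value near 0 suggests no linear relationship.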
Time Series Charts
A time series chart graphs quantitative data collected at regular intervals over time. The horizontal axis represents time, and the vertical axis represents the measured variable.
Useful for identifying trends, cycles, and patterns over time.
Data points are connected by line segments.
Example: Number of motor vehicle thefts and burglaries in the U.S. from 2009 to 2019. The chart shows burglaries remained steady until 2011, then decreased.
Summary Table: Graph Types and Their Uses
| Graph Type | Data Type | Main Purpose |
|---|---|---|
| Stem-and-leaf plot | Quantitative | Sort and display original data values |
| Dot plot | Quantitative | Show frequency and distribution |
| Pie chart | Qualitative | Show proportions of categories |
| Pareto chart | Qualitative | Highlight most significant categories |
| Scatter plot | Paired Quantitative | Show relationship between two variables |
| Time series chart | Quantitative over time | Show trends and changes over time |