Chapter 3: Constructing Graphical and Tabular Displays of Data
Constructing Graphical and Tabular Displays of Data (3.3)
Variables
In statistics, variables are classified based on the type of values they can assume. Understanding the distinction between discrete and continuous variables is fundamental for selecting appropriate graphical and tabular methods for data display.
Discrete variable: A variable with gaps between successive possible values, so it cannot take a value that falls inside a gap. For example, the number of times a person has traveled to a location can only be 0, 1, 2, 3, and so on.
Continuous variable: A variable that can take on any value between two possible values. For example, the time (in seconds) it takes a person to run 100 meters can be any real number within a range (e.g., 10.59329 seconds).
Example: Identifying Variable Types
Continuous: Time to run 100 meters (can be any value within a range).
Discrete: Number of trips to the Grand Canyon (can only be whole numbers).
Dotplots
A dotplot is a simple graphical display for numerical data, where each data value is represented by a dot above its position on a number line. Dotplots are useful for visualizing the distribution, frequency, and outliers in small datasets.
Construction: Draw a number line that spans the range of the data, then stack one dot above each value for every occurrence of that value in the dataset.
Interpretation: The number of dots above a value indicates its frequency. Outliers and clusters can be easily identified.
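The construction steps above can be sketched in a few lines of Python. This is a minimal illustration, not a standard library routine: the function name `text_dotplot` and the sample scores are made up for the example, and the sketch assumes whole-number data so that each value gets its own column on the number line.

```python
from collections import Counter

def text_dotplot(values):
    """Return a text dotplot for whole-number data (hypothetical helper)."""
    counts = Counter(values)                       # frequency of each value
    axis = list(range(min(values), max(values) + 1))  # number line over the range
    width = max(len(str(v)) for v in axis) + 1     # column width for the labels
    rows = []
    # Build the plot from the tallest stack down to the number line.
    for level in range(max(counts.values()), 0, -1):
        row = "".join(("*" if counts[v] >= level else " ").rjust(width) for v in axis)
        rows.append(row.rstrip())
    rows.append("".join(str(v).rjust(width) for v in axis))
    return "\n".join(rows)

# Made-up quiz scores, for illustration only
print(text_dotplot([4, 7, 8, 8, 9, 9, 9, 10]))
```

The height of each stack of `*` characters is the frequency of that value, and an isolated column far from the others (here, 4) stands out as a possible outlier.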
Example: Interpreting a Dotplot
Most frequent observation: The value with the most dots (e.g., 85 points occurred most frequently).
Pass/Fail cutoff: Count dots to the left of a threshold (e.g., 7 students scored at most 69 points and did not pass).
Proportion calculation: Proportion of students earning an 'A' (e.g., 13 out of 35 students scored at least 90 points, so the proportion is 13/35 ≈ 0.37).
Outliers: Observations much lower or higher than the rest (e.g., 38 points is an outlier).
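The same readings can be computed directly from the raw scores. The sketch below is only an illustration: `summarize_scores` is a hypothetical helper, the score list is made up (it is not the 35-student dataset from the example), and the cutoffs of 69 and 90 are taken from the example above.

```python
from collections import Counter

def summarize_scores(scores, fail_max=69, a_min=90):
    """Summarize what a dotplot of scores would show (hypothetical helper)."""
    counts = Counter(scores)
    top = max(counts.values())
    # Most frequent observation(s): the value(s) with the tallest stack of dots.
    modes = sorted(v for v, c in counts.items() if c == top)
    # Dots at or to the left of the pass/fail cutoff.
    not_passed = sum(1 for s in scores if s <= fail_max)
    # Dots at or to the right of the 'A' cutoff.
    a_count = sum(1 for s in scores if s >= a_min)
    return {
        "modes": modes,
        "not_passed": not_passed,
        "a_count": a_count,
        "a_proportion": a_count / len(scores),
    }

# Made-up scores, for illustration only
summary = summarize_scores([38, 55, 69, 72, 85, 85, 85, 90, 93, 100])
print(summary)
```

With the figures quoted in the example (13 of 35 students at 90 or above), the same proportion calculation gives 13/35 ≈ 0.37.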
Frequency and Frequency Distribution
Frequency: The number of times an observation occurs in the dataset.
Frequency distribution: A summary of all observations and their corresponding frequencies.
Outliers and Percentiles
Outlier: An observation that is significantly smaller or larger than the rest of the data.
Percentile: The kth percentile is a value greater than or equal to approximately k% of the observations and less than approximately (100 – k)% of the observations.
Example: Calculating Percentiles
23rd percentile: If 8 out of 35 scores are less than or equal to 70, then 70 is at the 23rd percentile (8/35 ≈ 0.229, or about 23%).
50th percentile (median): The value at the middle position when data is ordered (e.g., 85 points is the 18th out of 35 scores).
15th percentile: The value at position 0.15 × 35 = 5.25 in the ordered data (round to the 5th score), e.g., 65 points.
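Both directions of the percentile calculation can be sketched in Python. The function names are hypothetical, the 35 scores are made up, and the rounding rule (nearest position, halves rounded up) is an assumption chosen to agree with the examples above; other textbooks and software use slightly different conventions.

```python
import math

def percentile_rank(data, value):
    """Percent of observations less than or equal to `value`, rounded."""
    at_or_below = sum(1 for x in data if x <= value)
    return round(100 * at_or_below / len(data))

def value_at_percentile(data, k):
    """Value at the k-th percentile using position k% of n (assumed rule)."""
    ordered = sorted(data)
    n = len(ordered)
    position = math.floor(k * n / 100 + 0.5)   # round to nearest, halves up
    position = min(max(position, 1), n)        # keep the position inside 1..n
    return ordered[position - 1]               # positions are 1-based

# 35 made-up scores (51, 52, ..., 85), for illustration only
scores = list(range(51, 86))
print(percentile_rank(scores, 58))       # 8 of 35 at or below 58 → 23
print(value_at_percentile(scores, 15))   # position 5.25 → 5th score → 55
print(value_at_percentile(scores, 50))   # position 17.5 → 18th score → 68
```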
Stemplots (Stem-and-Leaf Plots)
A stemplot is a graphical method that splits each data value into a 'stem' (all but the rightmost digit) and a 'leaf' (the rightmost digit). Stemplots are useful for visualizing the shape and distribution of small datasets.
Construction: List stems in a column and attach leaves in rows, ordered from smallest to largest.
Example: For the value 375, stem = 37, leaf = 5.
| Stem (tens) | Leaf (ones) |
|---|---|
| 3 | 8 |
| 4 | 1 |
| 5 | 0 2 5 5 8 8 |
| 6 | 0 3 5 5 8 |
| 7 | 0 3 5 5 8 8 |
| 8 | 0 0 3 5 5 5 8 8 8 |
| 9 | 0 0 3 3 3 5 5 8 8 |
| 10 | 0 0 0 |
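As a rough sketch of the construction, the following Python function splits each value into stem and leaf with integer division. The name `stemplot` and the sample values are made up for illustration, the sketch assumes non-negative whole numbers, and listing stems that have no leaves is a common convention rather than something the notes require.

```python
from collections import defaultdict

def stemplot(values):
    """Return a text stem-and-leaf plot (hypothetical helper)."""
    groups = defaultdict(list)
    for v in sorted(values):            # sorting keeps leaves in increasing order
        stem, leaf = divmod(v, 10)      # stem: all but the last digit; leaf: last digit
        groups[stem].append(leaf)
    lines = []
    # Include every stem in the range so gaps in the data stay visible.
    for stem in range(min(groups), max(groups) + 1):
        leaves = " ".join(str(leaf) for leaf in groups.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}".rstrip())
    return "\n".join(lines)

# Made-up values, for illustration only
print(stemplot([38, 41, 50, 52, 55, 63, 65, 100]))
```

For a value such as 375, `divmod(375, 10)` gives stem 37 and leaf 5, matching the example above.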
Time-Series Plots
A time-series plot displays data points in chronological order, with time on the horizontal axis and the variable of interest on the vertical axis. Successive points are connected by line segments to show trends over time.
Purpose: To visualize changes in a variable over time, identify patterns, and detect trends or cycles.
Example: Plotting annual revenue of products over several years.
Key Steps in Constructing a Time-Series Plot
Label the horizontal axis with time intervals (e.g., years).
Label the vertical axis with the variable being measured.
Plot each data point at the intersection of time and value.
Connect successive points with line segments.
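The steps above can be sketched in Python. The helper names and the revenue figures are made up for illustration. The first function only orders the points and pairs up successive ones; the second assumes the third-party matplotlib library is installed and uses its `subplots`, `plot`, `set_xlabel`, and `set_ylabel` calls to draw the chart.

```python
def time_series_segments(times, values):
    """Order points chronologically and pair successive points (hypothetical helper)."""
    points = sorted(zip(times, values))        # (time, value) pairs in time order
    segments = list(zip(points, points[1:]))   # each segment joins neighbouring points
    return points, segments

def plot_time_series(times, values, xlabel="Year", ylabel="Value"):
    """Draw the time-series plot; assumes matplotlib is available."""
    import matplotlib.pyplot as plt            # third-party dependency
    points, _ = time_series_segments(times, values)
    xs, ys = zip(*points)
    fig, ax = plt.subplots()
    ax.plot(xs, ys, marker="o")                # markers at points, lines between them
    ax.set_xlabel(xlabel)                      # horizontal axis: time
    ax.set_ylabel(ylabel)                      # vertical axis: variable measured
    return fig

# Made-up annual revenue figures, for illustration only
years = [2021, 2019, 2020, 2022]
revenue = [30, 10, 20, 25]
points, segments = time_series_segments(years, revenue)
print(points)
print(segments)
```

Sorting by time before connecting the points matters: joining them in the order they were recorded, rather than chronological order, would draw misleading line segments.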
Additional info: These graphical and tabular methods are foundational for descriptive statistics and are essential for summarizing, visualizing, and interpreting data distributions in introductory statistics courses.