Describing Data with Tables, Graphs, and Numerical Measures
Describing Data with Tables and Graphs
Class Intervals, Frequency, and Relative Frequency
Organizing raw data into tables is a foundational step in statistical analysis. Frequency tables summarize how often each value or range of values occurs in a dataset.
Class Limits: The lower and upper boundaries for each interval.
Frequency: The number of data points within each class interval.
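Relative Frequency: The proportion of data points in each class interval, calculated as $\text{relative frequency} = \frac{f}{n}$, where $f$ is the class frequency and $n$ is the total number of data points.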
| Class Limits | Frequency | Relative Frequency |
|---|---|---|
| 1-8 | 14 | 0.25 |
| 9-16 | 21 | 0.38 |
| 17-24 | 11 | 0.20 |
| 25-32 | 4 | 0.07 |
| 33-40 | 4 | 0.07 |
| 41-48 | 1 | 0.02 |
Additional info: Relative frequencies are rounded to two decimal places. The unrounded values sum to exactly 1; because of rounding, the values shown here (with n = 55) sum to 0.99.
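The relative frequency column can be reproduced with a few lines of Python. This is a minimal sketch using the class limits and frequencies from the table above; the variable names are chosen here for illustration.

```python
# Class limits and frequencies copied from the table above.
classes = [(1, 8), (9, 16), (17, 24), (25, 32), (33, 40), (41, 48)]
frequencies = [14, 21, 11, 4, 4, 1]

n = sum(frequencies)  # total number of data points

# Relative frequency = class frequency / total, rounded to two decimals.
relative = [round(f / n, 2) for f in frequencies]

for (lower, upper), f, rf in zip(classes, frequencies, relative):
    print(f"{lower}-{upper}: frequency={f}, relative frequency={rf:.2f}")

# The rounded values need not sum to exactly 1.
print("n =", n)
print("sum of rounded relative frequencies =", round(sum(relative), 2))
```

Running it shows n = 55 and a rounded total of 0.99, which is the usual small discrepancy introduced by rounding each class separately.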
Class Midpoints
The midpoint of a class interval is used for graphical displays and calculations.
Class Midpoint Formula: $\text{midpoint} = \frac{\text{lower class limit} + \text{upper class limit}}{2}$
| Class Limits | Class Midpoint |
|---|---|
| 1-8 | 4.5 |
| 9-16 | 12.5 |
| 17-24 | 20.5 |
| 25-32 | 28.5 |
| 33-40 | 36.5 |
| 41-48 | 44.5 |
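As a quick check of the midpoint column, the sketch below averages the lower and upper limit of each class. The list of class limits is taken from the table; the variable names are illustrative.

```python
# Class limits from the table above, as (lower, upper) pairs.
classes = [(1, 8), (9, 16), (17, 24), (25, 32), (33, 40), (41, 48)]

# Midpoint = (lower class limit + upper class limit) / 2
midpoints = [(lower + upper) / 2 for lower, upper in classes]

print(midpoints)  # [4.5, 12.5, 20.5, 28.5, 36.5, 44.5]
```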
Class Boundaries
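Class boundaries are used to avoid gaps between intervals when drawing histograms. With whole-number class limits, consecutive classes are separated by a gap of 1, so each boundary sits halfway across that gap.
To find upper class boundaries, add 0.5 to the upper class limit (for example, 8 becomes 8.5).
To find lower class boundaries, subtract 0.5 from the lower class limit (for example, 9 becomes 8.5).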
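The two rules above can be sketched as follows, assuming whole-number class limits like those in the earlier tables (for data recorded to a different precision, the 0.5 would change). The final check confirms that adjacent classes now share an edge.

```python
# Class limits from the earlier tables, as (lower, upper) pairs.
classes = [(1, 8), (9, 16), (17, 24), (25, 32), (33, 40), (41, 48)]

# Subtract 0.5 from each lower limit and add 0.5 to each upper limit.
boundaries = [(lower - 0.5, upper + 0.5) for lower, upper in classes]

for (lower, upper), (lb, ub) in zip(classes, boundaries):
    print(f"limits {lower}-{upper} -> boundaries {lb}-{ub}")

# With boundaries, the upper edge of one class equals the lower edge of the next.
no_gaps = all(
    boundaries[i][1] == boundaries[i + 1][0] for i in range(len(boundaries) - 1)
)
print("no gaps:", no_gaps)  # no gaps: True
```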
Histograms and Frequency Polygons
Histograms visually represent the distribution of data using bars. Frequency polygons use points connected by lines to show frequencies.
Place class boundaries on the horizontal axis.
Plot frequencies or relative frequencies on the vertical axis.
Bars or points represent the frequency for each class interval.
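To see the shape these steps produce without a plotting library, here is a rough text-only sketch using the frequencies from the first table. It prints the histogram sideways (classes down the page rather than along the horizontal axis), and the function name `text_histogram` is made up for this example. For the frequency polygon it assumes the common convention of placing each point above the class midpoint.

```python
# Class boundaries and frequencies derived from the first table.
boundaries = [0.5, 8.5, 16.5, 24.5, 32.5, 40.5, 48.5]
frequencies = [14, 21, 11, 4, 4, 1]

def text_histogram(edges, counts):
    """Return one line per class: its boundaries and a bar of '#' marks."""
    lines = []
    for left, right, count in zip(edges, edges[1:], counts):
        lines.append(f"{left:>4.1f} to {right:>4.1f} | {'#' * count} ({count})")
    return lines

for line in text_histogram(boundaries, frequencies):
    print(line)

# Frequency polygon points: (class midpoint, frequency), to be joined by lines.
polygon_points = [
    ((left + right) / 2, count)
    for left, right, count in zip(boundaries, boundaries[1:], frequencies)
]
print(polygon_points)
```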
Cumulative Frequency
Cumulative frequency is the running total of frequencies through the classes.
| Frequency | Cumulative Frequency |
|---|---|
| 43 | 43 |
| 23 | 66 |
Additional info: Add each frequency to the sum of previous frequencies.
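A running total like this is what `itertools.accumulate` from the Python standard library computes. The sketch below applies it to the two frequencies shown above and, as a second illustration, to the frequencies from the first table.

```python
from itertools import accumulate

# Frequencies from the cumulative frequency table above.
frequencies = [43, 23]
cumulative = list(accumulate(frequencies))
print(cumulative)  # [43, 66]

# The same running total for the frequencies in the first table.
table_frequencies = [14, 21, 11, 4, 4, 1]
table_cumulative = list(accumulate(table_frequencies))
print(table_cumulative)  # [14, 35, 46, 50, 54, 55]
```

The last cumulative frequency always equals the total number of data points.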
Describing Data Numerically
Stem-and-Leaf Displays
Stem-and-leaf plots organize data to show its shape and distribution while retaining original values.
Split each number into a stem (left part) and leaf (right part).
List stems in a column, leaves in rows.
Order leaves from smallest to largest.
Label the plot to indicate the value of stems and leaves.
Example: Weights of carry-on luggage in pounds are displayed using a stem-and-leaf plot.
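The steps for building a stem-and-leaf plot can be sketched in Python as below. The weights are hypothetical values made up for illustration, not the luggage data from the original example, and the sketch assumes non-negative whole numbers with the ones digit as the leaf.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (tens and above) to its leaves (ones digit), in order."""
    plot = defaultdict(list)
    for value in sorted(values):       # sorting first keeps leaves ordered
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical whole-number weights in pounds (illustrative only).
weights = [12, 27, 18, 30, 21, 25, 18, 33, 29, 22]

plot = stem_and_leaf(weights)
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")
print("Key: 1 | 2 means 12 pounds")
```

Looping over every stem from the smallest to the largest, including any with no leaves, keeps gaps in the data visible in the printed plot.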
Measures of Central Tendency
Mean
The mean is the arithmetic average of a dataset.
Sum all data values: $\sum x$
Divide by the number of data values: $\bar{x} = \frac{\sum x}{n}$, where $n$ is the number of data values.
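The two steps translate directly into code. The data values below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Hypothetical data values (illustrative only).
data = [12, 27, 18, 30, 21]

total = sum(data)          # step 1: sum all data values
mean = total / len(data)   # step 2: divide by the number of data values

print(total)  # 108
print(mean)   # 21.6
```

Python's standard library also provides `statistics.mean`, which performs the same calculation.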
Median
The median is the middle value in an ordered dataset.
Order data from smallest to largest.
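If the number of values is odd, the median is the middle value.
If the number of values is even, the median is the average of the two middle values: $\text{median} = \frac{x_{(n/2)} + x_{(n/2+1)}}{2}$, where $x_{(k)}$ is the $k$-th value in the ordered data.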
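Both cases can be handled in one short function. This is a minimal sketch that assumes a non-empty list of numbers; the example values are hypothetical.

```python
def median(values):
    """Return the middle value, or the average of the two middle values."""
    ordered = sorted(values)   # order data from smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]    # odd count: the single middle value
    # even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([7, 3, 9]))     # 7
print(median([7, 3, 9, 5]))  # 6.0
```

The standard library function `statistics.median` follows the same rule.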
Mode
The mode is the value that occurs most frequently in a dataset.
If no value repeats, the dataset has no mode.
If multiple values repeat with the same highest frequency, the dataset is multimodal.
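The three cases (one mode, several modes, no mode) can be sketched with `collections.Counter`. The function name `modes` and the sample lists are illustrative, and the function assumes a non-empty list. It follows the convention stated above that a dataset with no repeated value has no mode, which differs from `statistics.multimode`, where every value is returned in that situation.

```python
from collections import Counter

def modes(values):
    """Return all values tied for the highest count, or [] if nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # no value repeats, so there is no mode
    return sorted(value for value, count in counts.items() if count == highest)

print(modes([1, 2, 2, 3]))     # [2]
print(modes([1, 1, 2, 2, 3]))  # [1, 2]  (multimodal)
print(modes([1, 2, 3]))        # []      (no mode)
```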
Trimmed Mean
A trimmed mean removes a specified percentage of the smallest and largest values before calculating the mean, reducing the effect of outliers.
Order data from smallest to largest.
Remove the lowest and highest values according to the trimming percentage.
Calculate the mean of the remaining data.
Example: For a 5% trimmed mean, remove the lowest 5% and the highest 5% of the ordered values, then compute the mean of the values that remain.
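The procedure can be sketched as a small function. The data are hypothetical, and the sketch makes two assumptions that textbooks handle differently: the number of values dropped from each end is rounded down, and the trimming percentage is below 50.

```python
def trimmed_mean(values, percent):
    """Mean after dropping `percent`% of the values from each end."""
    ordered = sorted(values)                 # order data from smallest to largest
    k = int(len(ordered) * percent / 100)    # values to drop per end (rounded down)
    kept = ordered[k:len(ordered) - k]       # remove the k lowest and k highest
    return sum(kept) / len(kept)

# 20 hypothetical values with one extreme outlier (illustrative only).
data = list(range(1, 20)) + [1000]

print(sum(data) / len(data))   # 59.5  (ordinary mean, pulled up by the outlier)
print(trimmed_mean(data, 5))   # 10.5  (one value dropped from each end)
```

With 20 values, 5% corresponds to one value per end, so the outlier 1000 and the smallest value 1 are both removed before averaging.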
Additional info:
These notes cover foundational techniques for organizing and summarizing data, which are essential for further statistical analysis.
Examples and tables are based on sample data such as airline carry-on luggage weights.