Summarizing and Graphing Data (Introductory Statistics, Ch. 2)
Summarizing and Graphing Data
Why We Graph Data
Graphing is a core part of statistics because it helps organize, communicate, and interpret data sets. Graphs reveal patterns, trends, and unusual values in the data.
Organization: Graphs structure data for easier analysis.
Communication: Visual representations make data more accessible.
Revelation: Graphs can highlight behaviors and outliers in the data.
Key Terms
Frequency: The number of times a value occurs in a data set.
Variation: The degree to which data values differ from each other.
Distribution: The pattern of data values over the possible range.
Outliers/Unusual Values: Data points that lie far from the bulk of the other values.
Frequency Distributions
Frequency Distribution Table
A frequency distribution partitions a data set into several classes and lists the number of values in each class. This method does not retain the original data values but provides a summary for analysis.
| Units Produced | Frequency |
|---|---|
| 1-10 | 3 |
| 11-20 | 34 |
| 21-30 | 45 |
| 31-40 | 48 |
| 41-50 | 56 |
| 51-60 | 59 |
| 61-70 | 52 |
| 71-80 | 42 |
| 81-90 | 31 |
Example: Pulse Rates
Given pulse rates (beats per minute) of 40 females, a frequency distribution can be constructed to summarize the data:
| Pulse Rate | Frequency |
|---|---|
| 60-69 | 12 |
| 70-79 | 14 |
| 80-89 | 11 |
| 90-99 | 1 |
| 100-109 | 1 |
| 110-119 | 0 |
| 120-129 | 1 |
Parts of a Frequency Distribution
Frequency: Number of values in each class.
Lower Class Limits: Smallest value in each class.
Upper Class Limits: Largest value in each class.
Class Boundaries
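Class boundaries are the values that separate classes without the gaps left by class limits. Each boundary lies midway between the upper limit of one class and the lower limit of the next. For the pulse rates, the boundaries are 59.5, 69.5, 79.5, and so on up to 129.5.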
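The boundary rule can be sketched in a few lines of Python. This is a minimal illustration, not a standard library routine: the helper name `class_boundaries` is hypothetical, and it assumes every pair of neighboring classes is separated by the same gap (1 for the whole-number pulse rates).

```python
def class_boundaries(limits):
    """Return class boundaries for consecutive (lower, upper) class limits.

    Assumes the gap between one class's upper limit and the next class's
    lower limit is the same everywhere (an assumption of this sketch).
    """
    gap = limits[1][0] - limits[0][1]       # e.g. 70 - 69 = 1
    boundaries = [limits[0][0] - gap / 2]   # boundary below the first class
    for _lower, upper in limits:
        boundaries.append(upper + gap / 2)  # boundary above each class
    return boundaries


# Class limits taken from the pulse-rate table.
pulse_limits = [(60, 69), (70, 79), (80, 89), (90, 99),
                (100, 109), (110, 119), (120, 129)]
pulse_boundaries = class_boundaries(pulse_limits)
print(pulse_boundaries)
# [59.5, 69.5, 79.5, 89.5, 99.5, 109.5, 119.5, 129.5]
```

Seven classes produce eight boundaries, one more than the number of classes.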
Class Midpoints
The midpoint of a class is the average of its lower and upper class limits. For example, the midpoint of 60-69 is $\frac{60 + 69}{2} = 64.5$.
| Pulse Rate | Midpoint | Frequency |
|---|---|---|
| 60-69 | 64.5 | 12 |
| 70-79 | 74.5 | 14 |
| 80-89 | 84.5 | 11 |
| 90-99 | 94.5 | 1 |
| 100-109 | 104.5 | 1 |
| 110-119 | 114.5 | 0 |
| 120-129 | 124.5 | 1 |
Class Width
Class width is the difference between two consecutive lower class limits (or two consecutive upper class limits). For example, the pulse-rate classes have width $70 - 60 = 10$.
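As a quick check of the midpoint and class-width definitions, the sketch below recomputes both from the pulse-rate class limits. The variable names are illustrative only.

```python
# Class limits taken from the pulse-rate table.
pulse_limits = [(60, 69), (70, 79), (80, 89), (90, 99),
                (100, 109), (110, 119), (120, 129)]

# Midpoint: average of the lower and upper class limits.
midpoints = [(lower + upper) / 2 for lower, upper in pulse_limits]

# Class width: difference between consecutive lower class limits.
lower_limits = [lower for lower, _upper in pulse_limits]
widths = [b - a for a, b in zip(lower_limits, lower_limits[1:])]

print(midpoints)  # [64.5, 74.5, 84.5, 94.5, 104.5, 114.5, 124.5]
print(widths)     # [10, 10, 10, 10, 10, 10]
```

Note that the width is 10, not 69 - 60 = 9: subtracting the limits of a single class understates the width for whole-number data.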
Constructing a Frequency Distribution
Sort the data and determine the number of classes (typically 5-20).
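Calculate class width: $\text{class width} \approx \frac{\text{maximum value} - \text{minimum value}}{\text{number of classes}}$. If the result is not a whole number, round up.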
Choose the minimum value as the first lower class limit.
List all lower and upper class limits using the class width.
Count and enter the frequency for each class.
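The steps above can be sketched as a small Python function. This is one possible reading of the procedure, under stated assumptions: the data are whole numbers, the function name `frequency_distribution` and the sample values are hypothetical (they are not the pulse or bear data), and the width is bumped by one when plain rounding up would leave the maximum outside the last class.

```python
import math


def frequency_distribution(data, num_classes):
    """Return (lower limit, upper limit, frequency) rows for whole-number data."""
    low, high = min(data), max(data)
    # Step 2: class width = (max - min) / number of classes, rounded up.
    width = math.ceil((high - low) / num_classes)
    # Assumption of this sketch: widen by 1 if the maximum would not fit.
    if low + num_classes * width <= high:
        width += 1
    rows = []
    for i in range(num_classes):
        lower = low + i * width    # Step 3: first lower limit is the minimum
        upper = lower + width - 1  # Step 4: limits follow from the width
        count = sum(1 for x in data if lower <= x <= upper)  # Step 5
        rows.append((lower, upper, count))
    return rows


# Hypothetical sample values, for illustration only.
sample = [12, 15, 21, 22, 22, 28, 30, 34, 35, 41, 47, 49]
for lower, upper, count in frequency_distribution(sample, 4):
    print(f"{lower}-{upper}: {count}")
# 12-21: 3
# 22-31: 4
# 32-41: 3
# 42-51: 2
```

Here the range is 49 - 12 = 37, so 37 / 4 = 9.25 rounds up to a class width of 10.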
Example: Frequency Distribution Construction
Given bear weights, construct a frequency distribution with 6 classes and include midpoints.
Relative and Cumulative Frequency Distributions
Relative Frequency Distribution
Relative frequency is the proportion of the total frequency for each class.
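Calculated as: $\text{relative frequency} = \frac{\text{class frequency}}{\text{sum of all frequencies}}$
Example: 12 out of 40 is $\frac{12}{40} = 0.30$, or 30%.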
| Pulse Rate | Relative Frequency |
|---|---|
| 60-69 | 30% |
| 70-79 | 35% |
| 80-89 | 27.5% |
| 90-99 | 2.5% |
| 100-109 | 2.5% |
| 110-119 | 0% |
| 120-129 | 2.5% |
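The relative frequencies in this table can be reproduced from the pulse-rate counts. The snippet below is a minimal sketch; the dictionary layout is just one convenient way to hold the table.

```python
# Frequencies taken from the pulse-rate frequency distribution.
pulse_freq = {
    "60-69": 12, "70-79": 14, "80-89": 11, "90-99": 1,
    "100-109": 1, "110-119": 0, "120-129": 1,
}

total = sum(pulse_freq.values())  # 40 pulse rates in all

# Relative frequency = class frequency / sum of all frequencies.
relative = {label: count / total for label, count in pulse_freq.items()}

for label, share in relative.items():
    print(f"{label}: {share:.1%}")
```

A useful sanity check is that the relative frequencies add up to 1 (100%), apart from rounding.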
Cumulative Frequency Distribution
Cumulative frequency is the sum of the current and all previous frequencies.
| Pulse Rate | Cumulative Frequency |
|---|---|
| Less than 70 | 12 |
| Less than 80 | 26 |
| Less than 90 | 37 |
| Less than 100 | 38 |
| Less than 110 | 39 |
| Less than 120 | 39 |
| Less than 130 | 40 |
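Cumulative frequencies are running totals, which Python's standard `itertools.accumulate` computes directly. The sketch below uses the pulse-rate frequencies from the earlier table.

```python
from itertools import accumulate

# Frequencies for the classes 60-69, 70-79, ..., 120-129.
frequencies = [12, 14, 11, 1, 1, 0, 1]

# Each entry is the sum of the current and all previous frequencies.
cumulative = list(accumulate(frequencies))
print(cumulative)  # [12, 26, 37, 38, 39, 39, 40]
```

The last cumulative frequency always equals the total number of data values, here 40.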
Data Types and Graphs
Quantitative vs. Categorical Data
Quantitative Data: Numerical values representing counts or measurements.
Categorical Data: Names or labels that do not represent counts or measurements.
Frequency Distributions for Both Data Types
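Quantitative example (classes are ranges of numbers):

| Weight (lbs) of Wild Bears | Frequency |
|---|---|
| 26-95 | 5 |
| 96-165 | 6 |
| 166-235 | 7 |
| 236-305 | 1 |
| 306-375 | 4 |
| 376-445 | 2 |

Categorical example (classes are labels):

| Color | Frequency |
|---|---|
| Red | 23 |
| Orange | 12 |
| Yellow | 7 |
| Green | 19 |
| Blue | 26 |
| Purple | 1 |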
Dot Plots
A dot plot displays each data value as a dot along a scale. Dots representing equal values are stacked. Dot plots can be used for both quantitative and categorical data.
Graphs and Charts for Quantitative Data
Histogram
A histogram uses adjacent bars of equal width to represent frequencies of quantitative data classes. The horizontal axis shows classes or midpoints, and the vertical axis shows frequencies. Histograms do not maintain original data values.
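To see the idea without a plotting library, the sketch below prints a rough text version of the pulse-rate histogram. It is a simplification: the bars run horizontally rather than vertically, and the helper name `text_histogram` is hypothetical.

```python
# Classes and frequencies taken from the pulse-rate table.
pulse_rows = [("60-69", 12), ("70-79", 14), ("80-89", 11), ("90-99", 1),
              ("100-109", 1), ("110-119", 0), ("120-129", 1)]


def text_histogram(rows, symbol="#"):
    """Return one line per class, with a bar whose length is the frequency."""
    label_width = max(len(label) for label, _count in rows)
    return [f"{label:>{label_width}} | {symbol * count} ({count})"
            for label, count in rows]


for line in text_histogram(pulse_rows):
    print(line)
```

Even in this crude form the shape is visible: most values fall in the first three classes, with a thin tail toward the higher pulse rates.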
Analyzing Histograms
Histograms reveal the shape of the data distribution.
Distributions may be symmetric or skewed.
Skewed vs. Symmetric Distributions
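Symmetric: The left and right halves of the histogram are approximate mirror images, as if the graph were folded along a vertical line through its center.
Skewed: The distribution is not symmetric; it extends farther to one side than the other.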
Left and Right Skew
Skewed Left (Negative): The longer tail is on the left side.
Skewed Right (Positive): The longer tail is on the right side.
Relative Frequency Histogram
Similar to a histogram, but the vertical axis shows relative frequencies instead of raw counts.
Frequency Polygon
A frequency polygon connects points above class midpoints with line segments. A relative frequency polygon uses relative frequencies for the vertical axis.
Ogive
An ogive is a line graph that depicts cumulative frequencies, useful for understanding how frequencies accumulate across classes.
Stemplot
A stemplot separates each data value into a stem (leftmost digits) and a leaf (rightmost digit), providing a quick visual of data distribution.
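A stemplot is simple enough to build by hand or in code. The sketch below is a minimal illustration that assumes non-negative whole numbers, uses the last digit as the leaf, and runs on hypothetical values (not data from these notes); the function name `stemplot` is also just illustrative.

```python
from collections import defaultdict


def stemplot(values):
    """Return stemplot lines: stem = leading digits, leaf = last digit."""
    stems = defaultdict(list)
    for value in sorted(values):
        stems[value // 10].append(value % 10)
    lines = []
    # Include empty stems so gaps in the data stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines


# Hypothetical values, for illustration only.
sample = [64, 68, 72, 72, 76, 80, 88, 104]
for line in stemplot(sample):
    print(line)
```

Unlike a histogram, the stemplot keeps every original value: the row for stem 7 with leaves 2, 2, 6 can be read back as 72, 72, 76.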
Graphs and Charts for Categorical Data
Bar Graph
Bar graphs use bars of equal width with gaps between them. The horizontal axis identifies categories, and the vertical axis shows frequencies. Multiple bar graphs compare two or more data sets.
Histogram vs. Bar Graph
Histogram: No gaps between bars; used for quantitative data.
Bar Graph: Gaps between bars; used for categorical data.
Pareto Chart
A Pareto chart is a bar graph whose bars are arranged in descending order of frequency, so the most frequent categories stand out first.
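The ordering step behind a Pareto chart is just a sort by frequency. The sketch below applies it to the color frequencies from the categorical table earlier in these notes.

```python
# Frequencies taken from the color frequency table.
color_freq = {"Red": 23, "Orange": 12, "Yellow": 7,
              "Green": 19, "Blue": 26, "Purple": 1}

# Sort categories from most to least frequent.
pareto_order = sorted(color_freq.items(), key=lambda item: item[1], reverse=True)

for color, count in pareto_order:
    print(f"{color}: {count}")
# Blue: 26
# Red: 23
# Green: 19
# Orange: 12
# Yellow: 7
# Purple: 1
```

Drawing the bars in this order, left to right, gives the Pareto chart.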
Pie Chart
Pie charts represent categorical data as slices of a circle, with slice size proportional to frequency.
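Because slice size is proportional to frequency, each slice's central angle is its share of the total times 360 degrees. The sketch below works this out for the color frequencies from the categorical table; the variable names are illustrative.

```python
# Frequencies taken from the color frequency table.
color_freq = {"Red": 23, "Orange": 12, "Yellow": 7,
              "Green": 19, "Blue": 26, "Purple": 1}

total = sum(color_freq.values())  # 88 observations in all

# Central angle of each slice, in degrees.
angles = {color: count / total * 360 for color, count in color_freq.items()}

for color, angle in angles.items():
    print(f"{color}: {angle:.1f} degrees")
```

The angles add up to 360 degrees, and Blue, the most frequent color, gets the largest slice.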
Best Practices for Graphs
Include a clear title.
Label axes and categories.
Ensure accuracy and clarity.
Keep the data itself the focus, and avoid decoration that distracts from interpretation.
Example Applications
Constructing frequency distributions for animal weights.
Analyzing pulse rates using histograms and polygons.
Comparing categorical preferences with bar and pie charts.