Organizing and Displaying Data in Statistics
Study Guide - Smart Notes
Organizing Qualitative Data
Introduction
Organizing qualitative data is a fundamental step in statistical analysis. Qualitative data, also known as categorical data, represent characteristics or attributes that can be grouped into categories. Proper organization allows for effective summarization and visualization, making it easier to interpret and analyze the data.
Frequency Distributions
Frequency Distribution: Lists each category of data and the number of occurrences for each category.
Relative Frequency: The proportion (or percent) of observations within a category, calculated as: $\text{relative frequency} = \dfrac{\text{frequency}}{\text{sum of all frequencies}}$
Relative Frequency Distribution: Lists each category of data together with its relative frequency.
Example: Organizing Qualitative Data into a Frequency Distribution
A survey asked individuals about their favorite day of the week. The data can be organized into a frequency and relative frequency distribution table.
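As a minimal sketch of the example above, the following Python snippet builds a frequency and relative frequency distribution. The `responses` list and the `frequency_table` helper are hypothetical, invented for illustration; they are not the survey data the guide refers to.

```python
from collections import Counter

# Hypothetical survey responses (illustrative only, not the guide's data).
responses = [
    "Saturday", "Friday", "Saturday", "Sunday", "Friday",
    "Saturday", "Thursday", "Friday", "Saturday", "Sunday",
]

def frequency_table(data):
    """Return {category: (frequency, relative frequency)}."""
    counts = Counter(data)
    total = sum(counts.values())  # sum of all frequencies
    return {category: (count, count / total) for category, count in counts.items()}

table = frequency_table(responses)
for category, (freq, rel_freq) in table.items():
    print(f"{category:<10} {freq:>3} {rel_freq:>6.2f}")
```

The relative frequencies of all categories should add up to 1 (or 100%), which is a quick check on any table built this way.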
Constructing Bar Graphs
Bar Graph: Constructed by labeling each category of data on one axis and the frequency or relative frequency on the other. Rectangles (bars) of equal width are drawn for each category, with heights representing the frequencies or relative frequencies.
Pareto Chart: A bar graph where bars are drawn in decreasing order of frequency or relative frequency.
Example: Constructing a Frequency and Relative Frequency Bar Graph
Use the frequency distribution of survey data (e.g., best day of the week) to construct bar graphs and Pareto charts.
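The only difference between a bar graph and a Pareto chart is the order of the bars. The sketch below, using a hypothetical frequency distribution and hypothetical helper names (`pareto_order`, `text_bar_graph`), sorts categories by decreasing frequency and renders a rough text version of the chart.

```python
# Hypothetical frequency distribution (illustrative values only).
frequencies = {"Thursday": 1, "Friday": 3, "Saturday": 4, "Sunday": 2}

def pareto_order(freqs):
    """Return (category, frequency) pairs sorted by decreasing frequency."""
    return sorted(freqs.items(), key=lambda item: item[1], reverse=True)

def text_bar_graph(pairs):
    """Render one line per category; the bar length equals the frequency."""
    return [f"{category:<10} {'#' * freq} ({freq})" for category, freq in pairs]

for line in text_bar_graph(pareto_order(frequencies)):
    print(line)
```

Passing `frequencies.items()` directly to `text_bar_graph` would give an ordinary bar graph in the original category order.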
Constructing Side-by-Side Bar Graphs
Side-by-Side Bar Graph: Used to compare two or more groups for the same categories. Each group is represented by a different color or pattern within each category.
Relative frequencies are often used to allow for comparison between groups of different sizes.
Example: Children Under 18 Living with One Parent
Compare the proportion of children living with only their father or mother across different age groups using a side-by-side bar graph.
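To illustrate why relative frequencies matter when groups differ in size, here is a small sketch. The counts, the age categories, and the `relative_frequencies` helper are assumptions made up for this example, not the data behind the guide's graph.

```python
# Hypothetical counts for two groups of different sizes (not the guide's data).
group_counts = {
    "Father only": {"Under 6": 20, "6-11": 30, "12-17": 50},
    "Mother only": {"Under 6": 120, "6-11": 130, "12-17": 150},
}

def relative_frequencies(counts):
    """Convert one group's counts to proportions of that group's total."""
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}

# One relative frequency distribution per group, ready to plot side by side.
side_by_side = {group: relative_frequencies(c) for group, c in group_counts.items()}

for category in ["Under 6", "6-11", "12-17"]:
    row = "  ".join(
        f"{group}: {side_by_side[group][category]:.3f}" for group in side_by_side
    )
    print(f"{category:<8} {row}")
```

In these made-up numbers the second group has larger counts in every category, yet its proportion in the oldest age category is smaller, which the raw frequencies alone would hide.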
Pie Charts
Pie Chart: A circular chart divided into sectors, each representing a category. The area of each sector is proportional to the frequency of the category.
Example: Drawing a Pie Chart
Construct a pie chart for the best day of the week or generation from survey data.
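When drawing a pie chart by hand, each sector's central angle is its relative frequency times 360 degrees. A minimal sketch, assuming a hypothetical frequency distribution and a hypothetical helper `sector_angles`:

```python
def sector_angles(freqs):
    """Return the central angle in degrees for each category's sector."""
    total = sum(freqs.values())
    return {category: 360 * count / total for category, count in freqs.items()}

# Hypothetical frequencies (illustrative values only).
frequencies = {"Thursday": 1, "Friday": 3, "Saturday": 4, "Sunday": 2}
angles = sector_angles(frequencies)

for category, angle in angles.items():
    print(f"{category:<10} {angle:6.1f} degrees")
```

The angles should total 360 degrees, just as the relative frequencies total 1.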
Graph Comparisons
Bar graphs are preferred when comparing the frequency of categories.
Pie charts are useful for showing the proportion of each category relative to the whole.
Bar graphs remain usable when there are many categories or when negative values are present; pie charts cannot display negative values.
Organizing Quantitative Data
Introduction
Quantitative data represent numerical values and can be either discrete (countable) or continuous (measurable). Organizing quantitative data involves grouping values into classes or intervals and summarizing them in tables or graphs.
Organizing Discrete Data in Tables
Discrete Data: Data that can take on only specific, separate values (e.g., number of siblings).
Frequency and relative frequency distributions can be constructed for discrete data.
Example: Frequency Distribution of Number of Siblings
Survey data on the number of siblings can be organized into a frequency and relative frequency table.
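For discrete data, each distinct value can serve as its own class. The sketch below uses hypothetical sibling counts and a hypothetical helper `discrete_table`; it lists every integer value from the minimum to the maximum, including values that never occur, so that gaps stay visible.

```python
from collections import Counter

# Hypothetical survey answers for "number of siblings" (illustrative only).
siblings = [0, 1, 1, 2, 2, 2, 3, 1, 0, 5, 2, 1]

def discrete_table(values):
    """Return rows of (value, frequency, relative frequency), ordered by value."""
    counts = Counter(values)
    n = len(values)
    return [
        (v, counts[v], counts[v] / n)
        for v in range(min(values), max(values) + 1)
    ]

sibling_table = discrete_table(siblings)
for value, freq, rel_freq in sibling_table:
    print(f"{value:>2} {freq:>3} {rel_freq:>6.3f}")
```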
Histograms for Discrete Data
Histogram: A graphical representation of the frequency distribution of quantitative data. A rectangle is drawn for each class (for discrete data, typically each distinct value), with the height representing the frequency or relative frequency. The rectangles all have the same width and touch each other.
Example: Drawing a Histogram for Discrete Data
Construct a histogram for the number of siblings or hours worked using survey data.
Organizing Continuous Data in Tables
Continuous Data: Data that can take on any value within a range.
Classes: Categories into which data are grouped, defined by lower and upper class limits.
Classes must not overlap. A class is open-ended when it has no lower or no upper class limit (e.g., "55 and older" in the table below).
Example Table: Educational Attainment by Age
| Age | Total | Percent with High School Diploma | Percent with Some College | Percent with Associate's Degree | Percent with Bachelor's Degree | Percent with Master's Degree | Percent with Doctoral Degree |
|---|---|---|---|---|---|---|---|
| 25-34 | 44,521 | 89.4 | 23.3 | 13.4 | 10.3 | 25.1 | 9.8 |
| 35-54 | 48,831 | 23.9 | 15.7 | 10.8 | 25.1 | 12.5 | 9.8 |
| 55 and older | 39,871 | 28.9 | 15.6 | 10.6 | 19.4 | 9.2 | 9.2 |
Histograms for Continuous Data
Histograms for continuous data use intervals (classes) on the horizontal axis and frequencies on the vertical axis.
All rectangles have the same width, and they touch each other to indicate continuity.
Example: Drawing a Histogram for Unemployment Data
Construct a histogram for unemployment rates by state, using appropriate class intervals.
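One way to group continuous data into equal-width, non-overlapping classes is sketched below. The rates, the first lower class limit, and the class width are all hypothetical choices for illustration, as is the helper name `class_frequencies`. Each class is treated as the interval from its lower limit up to, but not including, the next lower limit.

```python
def class_frequencies(values, lower_limit, class_width, num_classes):
    """Count the values in each class [lower, lower + class_width)."""
    counts = [0] * num_classes
    for value in values:
        index = int((value - lower_limit) // class_width)
        if not 0 <= index < num_classes:
            raise ValueError(f"{value} falls outside every class")
        counts[index] += 1
    classes = []
    for i, count in enumerate(counts):
        lower = lower_limit + i * class_width
        classes.append((lower, lower + class_width, count))
    return classes

# Hypothetical unemployment rates in percent (not real state data).
rates = [2.6, 3.1, 3.4, 3.9, 4.2, 4.4, 4.8, 5.0, 5.3, 6.7]
histogram = class_frequencies(rates, lower_limit=2, class_width=1, num_classes=5)

for lower, upper, count in histogram:
    print(f"[{lower}, {upper})  {'#' * count}")
```

Integer limits and widths are used here to sidestep floating-point rounding at class boundaries; fractional widths would need more care.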
Dot Plots
Dot Plot: A simple graph where each observation is plotted as a dot above its value on a number line. Useful for small data sets.
Example: Drawing a Dot Plot
Draw a dot plot for the number of siblings or age from survey data.
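A dot plot can be approximated in plain text by stacking one mark per observation above each value. This sketch assumes small, single-digit, non-negative integer data so the columns line up; the data and the helper name `dot_plot` are hypothetical.

```python
from collections import Counter

def dot_plot(values):
    """Return text rows: stacked '*' marks above a number line of integer values."""
    counts = Counter(values)
    axis = range(min(values), max(values) + 1)
    tallest = max(counts.values())
    rows = []
    # Build the plot from the top level down to the level just above the axis.
    for level in range(tallest, 0, -1):
        row = " ".join("*" if counts[v] >= level else " " for v in axis)
        rows.append(row.rstrip())
    rows.append(" ".join(str(v) for v in axis))
    return rows

# Hypothetical "number of siblings" answers (illustrative only).
siblings = [0, 1, 1, 2, 2, 2, 3, 1]
for row in dot_plot(siblings):
    print(row)
```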
Identifying the Shape of a Distribution
Uniform Distribution: All values occur with approximately the same frequency.
Bell-Shaped Distribution: Most values cluster around a central peak, with frequencies tapering off symmetrically.
Skewed Right: The right tail (higher values) is longer than the left.
Skewed Left: The left tail (lower values) is longer than the right.
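Caution: Do not describe qualitative data as skewed left, skewed right, or uniform. The order of the categories in a bar graph is arbitrary, so shape is meaningful only for quantitative data.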
Example: Identifying Distribution Shape
Use a histogram to determine if the data are uniform, bell-shaped, or skewed.
Additional Graphical Methods
Stem-and-Leaf Plots
Stem-and-Leaf Plot: Represents quantitative data by splitting each value into a "stem" (all but the final digit) and a "leaf" (the final digit).
Steps to construct:
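Split each value into a stem (all digits except the rightmost) and a leaf (the rightmost digit). For values with one decimal place, such as 5.3, the integer portion is the stem and the decimal digit is the leaf.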
Write stems in ascending order and draw a vertical line to the right.
Write leaves corresponding to each stem.
Order leaves in increasing order for each stem.
Example: Creating a Stem-and-Leaf Plot
Construct a stem-and-leaf plot for unemployment data or hours worked from survey data.
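The construction steps above can be sketched in Python as follows. The sketch assumes non-negative values recorded to one decimal place, so the integer part is the stem and the tenths digit is the leaf; the data and the helper name `stem_and_leaf` are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return rows like '3 | 149' for non-negative values with one decimal place."""
    leaves_by_stem = defaultdict(list)
    for value in values:
        tenths = round(value * 10)       # 4.2 -> 42, avoiding float noise
        stem, leaf = divmod(tenths, 10)  # 42 -> stem 4, leaf 2
        leaves_by_stem[stem].append(leaf)
    rows = []
    # Stems in ascending order, including stems that have no leaves.
    for stem in range(min(leaves_by_stem), max(leaves_by_stem) + 1):
        leaves = sorted(leaves_by_stem.get(stem, []))
        rows.append(f"{stem} | {''.join(str(leaf) for leaf in leaves)}")
    return rows

# Hypothetical unemployment rates in percent (not real data).
rates = [4.2, 3.9, 5.0, 4.4, 3.1, 6.7, 4.8, 3.4, 5.3, 2.6]
plot_rows = stem_and_leaf(rates)
for row in plot_rows:
    print(row)
```

Because every leaf is kept, the original values can be read back from the plot, which a histogram does not allow.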
Frequency Polygons and Time-Series Graphs
Frequency Polygon: A line graph that connects the midpoints of the tops of the bars of a histogram.
Time-Series Graph: Plots data points in chronological order, useful for displaying trends over time.
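For a frequency polygon, the points to connect sit above each class midpoint at the height of the class frequency. A minimal sketch, assuming equal-width classes described by hypothetical lower class limits and frequencies (the helper name `polygon_points` is also an assumption):

```python
def polygon_points(lower_limits, class_width, frequencies):
    """Return (class midpoint, frequency) pairs for a frequency polygon."""
    return [
        (lower + class_width / 2, freq)
        for lower, freq in zip(lower_limits, frequencies)
    ]

# Hypothetical classes 2-3, 3-4, ..., 6-7 with made-up frequencies.
lower_limits = [2, 3, 4, 5, 6]
frequencies = [1, 3, 3, 2, 1]
points = polygon_points(lower_limits, class_width=1, frequencies=frequencies)

for midpoint, freq in points:
    print(f"({midpoint}, {freq})")
```

Connecting these points in order with line segments gives the polygon.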
Cumulative Frequency and Relative Frequency Tables
Cumulative Frequency: The sum of the frequencies for all classes up to a certain class.
Cumulative Relative Frequency: The sum of the relative frequencies for all classes up to a certain class.
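Both cumulative columns are running totals, which the following sketch computes for a hypothetical list of class frequencies. Here the cumulative relative frequency is obtained by dividing each cumulative frequency by the total, which gives the same result as summing relative frequencies but avoids accumulating rounding error.

```python
from itertools import accumulate

# Hypothetical class frequencies, listed in class order (illustrative only).
frequencies = [1, 3, 3, 2, 1]

cumulative = list(accumulate(frequencies))  # running total of frequencies
total = cumulative[-1]                      # last entry equals the sample size
cumulative_relative = [c / total for c in cumulative]

for freq, cum, cum_rel in zip(frequencies, cumulative, cumulative_relative):
    print(f"{freq:>3} {cum:>4} {cum_rel:>6.2f}")
```

The last cumulative frequency should equal the number of observations, and the last cumulative relative frequency should equal 1.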
Summary Table: Types of Graphs and Their Uses
| Graph Type | Data Type | Main Use |
|---|---|---|
| Bar Graph | Qualitative | Compare frequencies of categories |
| Pareto Chart | Qualitative | Highlight most frequent categories |
| Pie Chart | Qualitative | Show proportion of each category |
| Histogram | Quantitative | Show distribution of data |
| Dot Plot | Quantitative | Display individual data points |
| Stem-and-Leaf Plot | Quantitative | Show data distribution and retain original values |
| Frequency Polygon | Quantitative | Compare distributions |
| Time-Series Graph | Quantitative (over time) | Show trends over time |
Additional info: This guide covers the foundational methods for organizing and displaying both qualitative and quantitative data in statistics, including definitions, examples, and the construction of various graphs and tables.