Describing Data: Graphs, Charts, and Tables
Frequency Distributions
Frequency distributions are essential tools in statistics for summarizing and organizing data.
Definition: A frequency distribution is a summary of a set of data that displays the number of observations in each of the distribution’s distinct categories or classes.
Purpose: To provide a clear overview of how data are distributed across different values or intervals.
Discrete Data
Discrete data consist of values that can be counted and typically take on whole numbers.
Definition: Data that can take on a countable number of possible values (e.g., number of customers, number of products sold).
Frequency Distributions for Discrete Distributions
To construct a frequency distribution for discrete data, follow these steps:
List the possible values of the variable.
Count the number of occurrences at each value.
Calculate the proportion of total observations that are in each category (relative frequency).
Relative Frequency
Relative frequency expresses the frequency of a category as a proportion of the total number of observations.
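Formula: $\text{Relative frequency} = \dfrac{f_i}{n}$, where $f_i$ is the frequency of the $i$th category and $n$ is the total number of observations.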
Relative frequencies should always sum to 1 (or 100%).
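As a minimal sketch of the relative-frequency calculation, the Python below divides each category's count by the total number of observations. The `counts` dictionary and its categories are hypothetical illustration data, not taken from the notes.

```python
def relative_frequencies(counts):
    """Map each category to its share of the total number of observations."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one observation")
    return {category: count / total for category, count in counts.items()}


# Hypothetical counts of customers by payment method (illustration only).
counts = {"cash": 12, "card": 30, "mobile": 8}
rel = relative_frequencies(counts)

for category, share in rel.items():
    print(f"{category}: {share:.2f}")

# The shares should add up to 1, apart from floating-point rounding.
print(round(sum(rel.values()), 10))
```

Rounding the sum guards against small floating-point error. The underlying proportions always total 1.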
Frequency Distributions for Continuous Distributions
Continuous data can take on any value within a range and are grouped into intervals or classes for frequency distributions. The classes must satisfy three requirements:
Mutually Exclusive: Each data point can belong to only one class; classes do not overlap.
All-Inclusive: All data points must be included in the classes, leaving nothing out.
Equal Width: Each class must have the same width (interval size).
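To construct a frequency distribution for continuous data, follow these steps:
Determine the number of groups or classes. The number of classes should be between 5 and 20. A common rule of thumb is to choose the smallest $k$ such that $2^k \geq n$ (where $k$ is the number of classes and $n$ is the number of data points).
Find the minimum class width: $W = \dfrac{\text{largest value} - \text{smallest value}}{k}$.
Round the class width up to a convenient whole number.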
Determine the class boundaries for each class.
Count the frequency for each class and fill out the frequency table.
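The steps above can be sketched in Python as follows. This is one possible reading, not a definitive procedure: it assumes the $2^k \geq n$ rule of thumb for the number of classes, rounds the width up to a whole number, and treats each class as including its lower boundary but not its upper one. The function name `build_classes` and the sample data are hypothetical.

```python
import math


def build_classes(data):
    """Group numeric data into equal-width, non-overlapping classes.

    Returns a list of (lower, upper, frequency) tuples, where a value x
    falls in a class when lower <= x < upper.
    """
    n = len(data)
    k = 1
    while 2 ** k < n:  # assumed rule of thumb: smallest k with 2^k >= n
        k += 1
    low, high = min(data), max(data)
    # Minimum class width, rounded up to a whole number (at least 1).
    width = math.ceil((high - low) / k) or 1
    classes = []
    lower = low
    while lower <= high:  # keep adding classes until the maximum is covered
        upper = lower + width
        freq = sum(1 for x in data if lower <= x < upper)
        classes.append((lower, upper, freq))
        lower = upper
    return classes


# Hypothetical sample of 12 observations (illustration only).
data = [12, 15, 21, 22, 27, 30, 33, 35, 38, 41, 44, 49]
classes = build_classes(data)
for lower, upper, freq in classes:
    print(f"{lower} to under {upper}: {freq}")
```

With only 12 observations the rule of thumb gives 4 classes, below the suggested range of 5 to 20. Larger samples yield more classes. When the range divides evenly by the width, the loop adds one extra class so that the maximum value is still counted.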
Cumulative Frequency Distributions
Cumulative frequency distributions provide information about the number or proportion of observations below a certain value.
Cumulative Frequency Distribution: A summary table that displays the number of observations with values less than or equal to the upper limit of each class.
Cumulative Relative Frequency Distribution: A summary table that displays the proportion of observations with values less than or equal to the upper limit of each class. The final cumulative relative frequency should always equal 1 (or 100%).
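A short sketch of both cumulative tables, using running totals over class frequencies. The upper limits and frequencies here are hypothetical illustration values.

```python
from itertools import accumulate

# Hypothetical class upper limits and frequencies, in class order.
upper_limits = [10, 20, 30, 40]
frequencies = [4, 7, 6, 3]

# Running totals: observations at or below each upper limit.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]

# Running proportions: the last entry is always 1.
cumulative_relative = [c / total for c in cumulative]

for limit, cum, rel in zip(upper_limits, cumulative, cumulative_relative):
    print(f"<= {limit}: {cum:2d}  {rel:.2f}")
```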
Frequency Histograms
A frequency histogram is a graphical representation of a frequency distribution for quantitative data.
Horizontal axis: Contains the possible outcomes (class intervals) for the variable of interest.
Vertical axis: Shows the frequency for each possible outcome.
Bars: There are no gaps between bars in a histogram, indicating continuous data.
Displays: The center, spread, and shape of the data distribution.
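As a rough illustration of how class frequencies become bars, the sketch below prints a text histogram. It is only an approximation of a real chart: the bars run horizontally, so the class intervals appear down the page rather than along a horizontal axis. The function name `text_histogram` and the class data are hypothetical.

```python
def text_histogram(classes):
    """Return one line per class: the interval label, then one '#' per observation."""
    lines = []
    for lower, upper, freq in classes:
        lines.append(f"{lower:>3}-{upper:<3} | {'#' * freq}")
    return lines


# Hypothetical classes as (lower, upper, frequency), illustration only.
classes = [(0, 10, 2), (10, 20, 5), (20, 30, 3)]
lines = text_histogram(classes)
print("\n".join(lines))
```

Consecutive classes share a boundary, which is why a drawn histogram has no gaps between adjacent bars.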
Joint Frequency Tables
Joint frequency tables are used when data are characterized by more than one variable. They can be constructed for both qualitative (categorical) and quantitative variables.
Obtain the data.
Construct the rows and columns of the joint frequency table.
Count the number of joint occurrences at each row and column level for all combinations, and place these frequencies in the appropriate cells.
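The counting step can be sketched with a `Counter` keyed on (row, column) pairs. The two variables used here, region and product, and all of the observations are hypothetical.

```python
from collections import Counter

# Hypothetical observations as (region, product) pairs, illustration only.
observations = [
    ("East", "A"), ("East", "B"), ("West", "A"),
    ("East", "A"), ("West", "B"), ("West", "B"),
]

# Count joint occurrences of each (row, column) combination.
joint = Counter(observations)
rows = sorted({r for r, _ in observations})
cols = sorted({c for _, c in observations})

# Fill the table cell by cell; combinations never observed count as 0.
table = {r: {c: joint[(r, c)] for c in cols} for r in rows}

print("     ", cols)
for r in rows:
    print(r, [table[r][c] for c in cols])
```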
Types of Data and Corresponding Graphs
Different types of data require different graphical representations. The main types are:
Categorical Data | Quantitative Data |
|---|---|
Bar Charts | Stem and Leaf Diagrams |
Pie Charts | Line Charts |
Pareto Charts | Scatter Diagrams |
Bar Charts: Used for categorical data to compare frequencies across categories.
Pie Charts: Used for categorical data to show proportions of a whole.
Stem and Leaf Diagrams: Used for quantitative data to display the distribution while retaining original data values.
Line Charts: Used for quantitative data, especially time series, to show trends over time.
Scatter Diagrams: Used to display the relationship between two quantitative variables.
Pareto Charts: A special type of bar chart where categories are ordered by frequency, often used in quality control.
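To illustrate the ordering behind a Pareto chart, the sketch below sorts categories from most to least frequent. The cumulative percentage column is a common companion to Pareto charts but is an addition here, not something the notes specify. The defect categories and counts are hypothetical.

```python
from itertools import accumulate

# Hypothetical defect counts by cause (illustration only).
defects = {"scratch": 5, "dent": 12, "misalignment": 3, "discoloration": 20}

# Order categories from most to least frequent, as a Pareto chart does.
ordered = sorted(defects.items(), key=lambda item: item[1], reverse=True)
total = sum(defects.values())

# Running totals in Pareto order (assumed extra, for cumulative percentages).
cumulative = list(accumulate(count for _, count in ordered))

for (cause, count), running in zip(ordered, cumulative):
    print(f"{cause:<14} {count:>3}  {100 * running / total:5.1f}%")
```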
Example: Constructing a Frequency Distribution
Suppose you have the following data on the number of products sold in 10 days: 5, 7, 8, 5, 6, 7, 8, 9, 5, 6.
Step 1: List possible values (5, 6, 7, 8, 9).
Step 2: Count occurrences for each value.
Step 3: Calculate relative frequencies.
Number Sold | Frequency | Relative Frequency |
|---|---|---|
5 | 3 | 0.3 |
6 | 2 | 0.2 |
7 | 2 | 0.2 |
8 | 2 | 0.2 |
9 | 1 | 0.1 |
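The table above can be reproduced with a few lines of Python. This sketch uses the ten sales figures from the example. The variable names are illustrative.

```python
from collections import Counter

# Number of products sold on each of the 10 days (from the example).
sold = [5, 7, 8, 5, 6, 7, 8, 9, 5, 6]

freq = Counter(sold)  # Step 2: count occurrences of each value
n = len(sold)

# Steps 1 and 3: list the values in order with their relative frequencies.
table = [(value, freq[value], freq[value] / n) for value in sorted(freq)]

for value, count, rel in table:
    print(f"{value} | {count} | {rel:.1f} |")
```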