Frequency Distributions, Graphs, and Correlations: Study Notes for Statistics

Frequency Distributions

Definition and Purpose

A frequency distribution is a statistical tool that organizes data into categories or classes, showing how data values are partitioned among these groups. It lists each category (or class) along with the number (frequency) of data values in each.

  • Class: A range of values into which data are grouped.

  • Category: A label or name for a class, often used for qualitative data.

  • Frequency: The count of data values within each class.

Example: If a dataset contains test scores, a frequency distribution might show how many students scored within each score range (e.g., 50-69, 70-89, etc.).

Constructing a Frequency Distribution

To create a frequency distribution, follow these steps:

  1. Select the number of classes: Typically between 5 and 20, depending on the dataset size and convenience.

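  2. Calculate class width: Use the formula $\text{class width} \approx \dfrac{\text{maximum data value} - \text{minimum data value}}{\text{number of classes}}$. Round up to a convenient number if necessary.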

  3. Choose the first lower class limit: Start with the minimum value or a convenient value below it.

  4. Determine subsequent lower class limits: Add the class width to the previous lower class limit to get the next one.

  5. List lower and upper class limits: Arrange the lower class limits vertically and identify the corresponding upper class limits.

  6. Tally data values: For each data value, place a tally mark in the appropriate class. Sum the tallies to find the frequency for each class.

Example: For a dataset with values ranging from 50 to 139 and 5 classes, the class width is calculated as $\dfrac{139 - 50}{5} = 17.8$ (rounded up to 20 for convenience).

The lower class limits would be 50, 70, 90, 110, and 130.
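The construction steps can be sketched in Python. This is a minimal illustration, not a prescribed method: it assumes the usual rule of dividing the range (maximum minus minimum) by the number of classes, and the function names, the `round_to` parameter, and the sample scores are hypothetical.

```python
import math

def class_width(minimum, maximum, num_classes, round_to=1):
    """Raw width (max - min) / num_classes, rounded up to a multiple of round_to."""
    raw = (maximum - minimum) / num_classes
    return math.ceil(raw / round_to) * round_to

def lower_class_limits(first_lower, width, num_classes):
    """Add the class width repeatedly to get each lower class limit."""
    return [first_lower + i * width for i in range(num_classes)]

def tally(values, lower_limits, width):
    """Count the values in each class [lower, lower + width).

    For whole-number data this matches classes written as 50-69, 70-89, ...
    Values outside every class are ignored.
    """
    counts = [0] * len(lower_limits)
    for v in values:
        index = int((v - lower_limits[0]) // width)
        if 0 <= index < len(counts):
            counts[index] += 1
    return counts

width = class_width(50, 139, 5, round_to=10)  # raw width 17.8, rounded up to 20
limits = lower_class_limits(50, width, 5)      # 50, 70, 90, 110, 130

# Hypothetical scores, for illustration only
scores = [52, 68, 70, 89, 90, 109, 130]
frequencies = tally(scores, limits, width)
```

Rounding up rather than to the nearest value keeps the last class wide enough to contain the maximum data value.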

Table: Frequency Distribution Example

The following table shows a sample frequency distribution for a group:

| Class Interval | Frequency |
|----------------|-----------|
| 50-69          | 2         |
| 70-89          | 33        |
| 90-109         | 35        |
| 110-129        | 7         |
| 130-149        | 1         |

Additional info: The class intervals are determined by the lower and upper class limits, and the frequencies represent the count of data values in each interval.

Graphs and Visual Representations

Types of Graphs

  • Histograms: Bar graphs representing frequency distributions for quantitative data.

  • Dotplots: Each data value is shown as a dot above a number line; stacked dots indicate repeated values.

  • Stem-and-leaf plots: Data values are split into a "stem" (leftmost digit(s)) and a "leaf" (rightmost digit), preserving original data values and showing distribution shape.

  • Time-series graphs: Display data collected over time (e.g., monthly, yearly) to show trends.

  • Pareto charts: Bar charts for categorical data, arranged in descending order of frequency.

  • Pie charts: Circular charts where each slice represents a category's proportion of the total.

Example: A dotplot of pulse rates might show two dots above "50" if two individuals have a pulse rate of 50.
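The stem-and-leaf split described above can be sketched as a short Python function. This is only an illustration: it assumes non-negative whole-number data with the last digit as the leaf, and the function name and the pulse rates are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: [leaves]}, using the last digit of each value as the leaf."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 68 -> stem 6, leaf 8
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical pulse rates, for illustration only
pulse_rates = [50, 50, 62, 64, 68, 71, 75, 75, 83]

for stem, leaves in stem_and_leaf(pulse_rates).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Because every leaf is kept, the original data values can be read back from the plot, and the row lengths show the shape of the distribution. The repeated leaves in a row (such as the two 0s on stem 5) correspond to the stacked dots a dotplot would show.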

Relative and Cumulative Frequency Distributions

Relative Frequency Distribution

A relative frequency distribution shows the proportion or percentage of data values in each class.

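  • Formula: $\text{relative frequency} = \dfrac{\text{class frequency}}{\text{sum of all frequencies}}$

  • Percentage frequency: $\text{percentage frequency} = \dfrac{\text{class frequency}}{\text{sum of all frequencies}} \times 100\%$

Example: If a class has a frequency of 33 and the total frequency is 78, the relative frequency is $\dfrac{33}{78} \approx 0.423$, or 42.3%.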
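As a quick check of this arithmetic, the sketch below divides each class frequency by the total, using the frequencies from the example table earlier in these notes. The function and variable names are illustrative.

```python
def relative_frequencies(frequencies):
    """Divide each class frequency by the sum of all frequencies."""
    total = sum(frequencies)
    return [f / total for f in frequencies]

frequencies = [2, 33, 35, 7, 1]  # classes 50-69 through 130-149; total is 78
rel = relative_frequencies(frequencies)
percentages = [round(100 * r, 1) for r in rel]  # one decimal place
```

The relative frequencies should sum to 1 (and the percentages to about 100%, up to rounding), which is a useful sanity check on any relative frequency table.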

Cumulative Frequency Distribution

A cumulative frequency distribution shows the sum of frequencies for a class and all previous classes.

  • Useful for determining how many data values fall below a certain threshold.

  • Helps in identifying percentiles and medians.

Example: If the cumulative frequency for "less than 110" is 70, then 70 data values are less than 110.
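A running total is all that is needed here. The sketch below uses `itertools.accumulate` from the Python standard library on the frequencies from the example table; the "less than" thresholds are the lower limits of the following classes, and the variable names are illustrative.

```python
from itertools import accumulate

frequencies = [2, 33, 35, 7, 1]            # classes 50-69 through 130-149
thresholds = [70, 90, 110, 130, 150]       # "less than" boundaries
cumulative = list(accumulate(frequencies)) # running sum of the frequencies

less_than = dict(zip(thresholds, cumulative))
```

The last cumulative frequency always equals the total number of data values (78 here), and `less_than[110]` reproduces the count of 70 from the example.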

Shapes of Distributions

Normal and Skewed Distributions

Understanding the shape of a distribution is crucial for interpreting data.

  • Normal distribution: Symmetrical, bell-shaped curve; mean, median, and mode are equal.

  • Skewed distribution: Asymmetrical; can be skewed to the right (positively skewed, longer right tail) or to the left (negatively skewed, longer left tail).

Example: Annual incomes are often right-skewed; human life spans may be left-skewed.
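One rough way to see skew numerically is to compare the mean with the median. This comparison is a common rule of thumb, not something stated in the notes above and not a guarantee: a long right tail tends to pull the mean above the median. The income figures below are hypothetical.

```python
from statistics import mean, median

# Hypothetical annual incomes in thousands of dollars (illustration only)
incomes = [30, 35, 40, 45, 50, 60, 250]

income_mean = mean(incomes)      # pulled upward by the single large value
income_median = median(incomes)  # middle value, unaffected by the extreme

if income_mean > income_median:
    shape_hint = "possibly right-skewed"
elif income_mean < income_median:
    shape_hint = "possibly left-skewed"
else:
    shape_hint = "no skew suggested by this check"
```

A histogram or dotplot remains the more reliable way to judge shape; this check only flags data worth a closer look.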

Correlation and Scatter Plots

Correlation

Correlation describes the relationship between two variables. If the values of one variable are associated with the values of another, a correlation exists.

  • Positive correlation: As one variable increases, the other tends to increase.

  • Negative correlation: As one variable increases, the other tends to decrease.

  • No correlation: No discernible pattern between the variables.

  • Important: Correlation does not imply causation.

Example: Hours spent studying and grades may be positively correlated, but the correlation alone does not show that studying causes higher grades; other factors may influence both.

Scatter Plots

A scatter plot is a graph of paired data values, with one variable on each axis. The pattern of points can reveal the type and strength of correlation.

  • Linear correlation: Points approximate a straight line.

  • No correlation: Points are scattered randomly.

Example: A scatter plot of waist and arm circumferences may show a distinct straight-line pattern, indicating correlation. A scatter plot of weights and pulse rates may show no pattern, indicating no correlation.
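The notes describe correlation visually. A standard numerical companion, added here as a supplement, is the linear correlation coefficient $r$, which ranges from -1 to 1. The sketch below computes it from its definition; the function name and the measurements are hypothetical, chosen to lie exactly on a line.

```python
import math

def pearson_r(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements in centimeters (illustration only)
waist = [70, 80, 90, 100]
arm = [25, 28, 31, 34]

r = pearson_r(waist, arm)  # points lie on a straight line, so r is 1 up to rounding
```

Values of $r$ near 1 or -1 correspond to points that hug a rising or falling straight line, and values near 0 correspond to no linear pattern. As with the scatter plot itself, a strong $r$ does not establish causation.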

Summary Table: Types of Frequency Distributions and Graphs

| Type | Description | Example |
|------|-------------|---------|
| Frequency Distribution | Counts of data values in each class | Test scores grouped by ranges |
| Relative Frequency Distribution | Proportion or percentage in each class | Percentage of students in each score range |
| Cumulative Frequency Distribution | Sum of frequencies up to each class | Number of students scoring below a threshold |
| Histogram | Bar graph for quantitative data | Distribution of heights |
| Dotplot | Dots above a number line for each value | Pulse rates of individuals |
| Stem-and-leaf plot | Data split into stems and leaves | Sorted pulse rates |
| Pareto chart | Bar chart for categorical data, descending order | Causes of accidental deaths |
| Pie chart | Circular chart showing proportions | Distribution of accident causes |
| Scatter plot | Graph of paired data values | Waist vs. arm circumference |

Additional info: This table summarizes the main types of frequency distributions and graphs used in introductory statistics.
