Contingency Tables, Marginal Distributions, and Conditional Distributions in Statistics

Contingency Tables

Contingency tables are a fundamental tool in statistics for summarizing the relationship between two categorical variables. They display the frequency (or count) of observations that fall into each combination of categories.

  • Definition: A contingency table (also called a cross-tabulation or crosstab) is a table that shows the joint frequency distribution of two categorical variables, with one cell for each combination of categories.

  • Structure: Rows typically represent one categorical variable, while columns represent another.

  • Purpose: Used to analyze the association between variables and to compute marginal and conditional distributions.

  • Example: A table showing the number of drivers who use cell phones and have speeding violations.
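
As a minimal sketch of how such a table is built from raw data, the snippet below counts category pairs using only the Python standard library. The observations and the helper name `contingency_table` are hypothetical and chosen purely for illustration; they are not data from this guide.

```python
from collections import Counter

# Hypothetical raw observations: one (phone use, violation) pair per driver.
observations = [
    ("uses phone", "violation"),
    ("uses phone", "no violation"),
    ("no phone", "no violation"),
    ("no phone", "violation"),
    ("uses phone", "no violation"),
    ("no phone", "no violation"),
]

def contingency_table(pairs):
    """Count how many observations fall into each (row, column) combination."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Nested dict: table[row_category][column_category] -> frequency.
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

table = contingency_table(observations)
for row_label, cells in table.items():
    print(row_label, cells)
```

Each inner dictionary is one row of the table, and each value is a cell frequency.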

Marginal Distribution

Marginal distributions describe the totals for each category of a single variable, ignoring the other variable. They are obtained by summing across rows or columns of a contingency table.

  • Definition: The marginal distribution of a variable is the distribution of totals for that variable, found by summing across the other variable.

  • Calculation: Add the frequencies in each row or column to get the marginal totals.

  • Relative Marginal Distribution: Divide each marginal total by the overall total to get relative frequencies.

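  • Formula: $\text{Relative marginal frequency} = \dfrac{\text{row (or column) total}}{\text{grand total}}$

  • Example: In a table of drivers, the relative marginal frequency for 'Uses cell phone while driving' is the total number of such drivers divided by the overall total.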
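
The calculation steps above can be sketched in a few lines of Python. The counts are taken from the cell phone and speeding table in Example 1 of this guide; the nested-dictionary layout and the helper name `marginals` are assumptions made for illustration.

```python
# Counts from the cell phone / speeding violation table (Example 1).
table = {
    "uses phone": {"violation": 25, "no violation": 280},
    "no phone": {"violation": 45, "no violation": 405},
}

def marginals(table):
    """Return (row_totals, column_totals, grand_total) for a nested-dict table."""
    row_totals = {r: sum(cells.values()) for r, cells in table.items()}
    col_totals = {}
    for cells in table.values():
        for c, n in cells.items():
            col_totals[c] = col_totals.get(c, 0) + n
    grand_total = sum(row_totals.values())
    return row_totals, col_totals, grand_total

row_totals, col_totals, grand_total = marginals(table)

# Relative marginal distributions: each marginal total divided by the grand total.
row_marginal = {r: n / grand_total for r, n in row_totals.items()}
col_marginal = {c: n / grand_total for c, n in col_totals.items()}

print(row_totals, col_totals, grand_total)
print({r: round(p, 3) for r, p in row_marginal.items()})
print({c: round(p, 3) for c, p in col_marginal.items()})
```

The row totals give the marginal distribution of cell phone use, and the column totals give the marginal distribution of speeding violations. Each relative marginal distribution sums to 1.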

Conditional Distribution

Conditional distributions show the distribution of one variable for a fixed value of the other variable. They are useful for understanding how the distribution of one variable changes depending on the category of another.

  • Definition: The conditional distribution of a variable is the distribution of that variable for a specific value of the other variable.

  • Calculation: Divide each cell frequency by the total for the relevant row or column (depending on which variable is conditioned upon).

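  • Formula: $\text{Conditional relative frequency} = \dfrac{\text{cell frequency}}{\text{total of the row (or column) being conditioned on}}$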

  • Example: The proportion of drivers with speeding violations among those who use cell phones while driving.
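
A minimal sketch of both directions of conditioning, using the Example 1 counts from this guide. The function names `conditional_on_rows` and `conditional_on_columns` are hypothetical helpers, not part of any library.

```python
# Counts from the cell phone / speeding violation table (Example 1).
table = {
    "uses phone": {"violation": 25, "no violation": 280},
    "no phone": {"violation": 45, "no violation": 405},
}

def conditional_on_rows(table):
    """For each row, divide every cell by that row's total."""
    result = {}
    for r, cells in table.items():
        total = sum(cells.values())
        result[r] = {c: n / total for c, n in cells.items()}
    return result

def conditional_on_columns(table):
    """For each column, divide every cell by that column's total."""
    col_totals = {}
    for cells in table.values():
        for c, n in cells.items():
            col_totals[c] = col_totals.get(c, 0) + n
    return {
        c: {r: table[r][c] / col_totals[c] for r in table}
        for c in col_totals
    }

by_row = conditional_on_rows(table)     # violation status given phone use
by_col = conditional_on_columns(table)  # phone use given violation status

print(round(by_row["uses phone"]["violation"], 3))
print(round(by_col["violation"]["uses phone"], 3))
```

The two directions answer different questions: `by_row` gives the share of cell phone users who had a violation (25 out of 305), while `by_col` gives the share of drivers with a violation who use a cell phone (25 out of 70).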

Worked Examples and Tables

Example 1: Cell Phone Use and Speeding Violations

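|                                       | Speeding Violation in Last Year | No Speeding Violation in Last Year | Frequency (Total) |
|---------------------------------------|---------------------------------|------------------------------------|-------------------|
| Uses cell phone while driving         | 25                              | 280                                | 305               |
| Does not use cell phone while driving | 45                              | 405                                | 450               |
| Frequency (Total)                     | 70                              | 685                                | 755               |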

Marginal Distribution (Relative Frequency)

  • Uses cell phone while driving: $\frac{305}{755} \approx 0.404$

  • Does not use cell phone while driving: $\frac{450}{755} \approx 0.596$

  • Total: $\frac{755}{755} = 1$

Conditional Distribution (Speeding Violation)

  • Among those who use cell phones while driving: $\frac{25}{305} \approx 0.082$ (Speeding Violation)

  • Among those who do not use cell phones while driving: $\frac{45}{450} = 0.100$ (Speeding Violation)

  • Additional info: Conditional distributions can be calculated for either rows or columns, depending on the context.

Example 2: Handedness and Gender

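|                   | Right-handed | Left-handed | Frequency (Total) |
|-------------------|--------------|-------------|-------------------|
| Males             | 66           | 15          | 81                |
| Females           | 45           | 9           | 54                |
| Frequency (Total) | 111          | 24          | 135               |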

Marginal Distribution (Relative Frequency)

  • Right-handed: $\frac{111}{135} \approx 0.822$

  • Left-handed: $\frac{24}{135} \approx 0.178$

  • Total: $\frac{135}{135} = 1$

Conditional Distribution (Gender)

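  • Among males: $\frac{66}{81} \approx 0.815$ (Right-handed), $\frac{15}{81} \approx 0.185$ (Left-handed)

  • Among females: $\frac{45}{54} \approx 0.833$ (Right-handed), $\frac{9}{54} \approx 0.167$ (Left-handed)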
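
One informal way to read this example is to set each gender's conditional distribution of handedness next to the overall marginal distribution. The sketch below does that with the Example 2 counts. The "gap" it prints is only a descriptive difference, an illustrative assumption of this sketch rather than a formal test.

```python
# Counts from the handedness / gender table (Example 2).
table = {
    "Males": {"Right-handed": 66, "Left-handed": 15},
    "Females": {"Right-handed": 45, "Left-handed": 9},
}

grand_total = sum(sum(cells.values()) for cells in table.values())
categories = list(next(iter(table.values())))

# Marginal distribution of handedness: column total / grand total.
marginal = {
    c: sum(table[r][c] for r in table) / grand_total for c in categories
}

# Conditional distribution of handedness within each gender: cell / row total.
conditional = {
    r: {c: n / sum(cells.values()) for c, n in cells.items()}
    for r, cells in table.items()
}

for r in table:
    for c in categories:
        gap = conditional[r][c] - marginal[c]
        print(f"{r:8s} {c:13s} conditional={conditional[r][c]:.3f} "
              f"marginal={marginal[c]:.3f} gap={gap:+.3f}")
```

When the conditional distributions are close to the marginal distribution in every group, as they are here, the table shows little evidence that the two variables are associated.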

Example 3: Opinions by Ethnicity

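|                      | Good thing | Bad thing | Good and Bad | No Opinion | Relative Frequency |
|----------------------|------------|-----------|--------------|------------|--------------------|
| White Respondents    | 160        | 110       | 9            | 9          |                    |
| Black Respondents    | 152        | 35        | 12           | 9          |                    |
| Hispanic Respondents | 144        | 33        | 12           | 10         |                    |
| Total                | 456        | 178       | 33           | 28         |                    |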

Marginal Distribution (Relative Frequency)

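  • Good thing: $\frac{456}{695} \approx 0.656$

  • Bad thing: $\frac{178}{695} \approx 0.256$

  • Good and Bad: $\frac{33}{695} \approx 0.047$

  • No Opinion: $\frac{28}{695} \approx 0.040$

  • Total: $\frac{695}{695} = 1$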

Conditional Distribution (by Ethnicity)

  • For each ethnic group, divide each cell by the row total to get the conditional distribution.

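  • Example for White Respondents (row total 288): $\frac{160}{288} \approx 0.556$ (Good thing), $\frac{110}{288} \approx 0.382$ (Bad thing), $\frac{9}{288} \approx 0.031$ (Good and Bad), $\frac{9}{288} \approx 0.031$ (No Opinion).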
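
The same row-by-row division can be sketched for all three groups at once. The counts come from the Example 3 table; the variable names and the list-per-row layout are illustrative assumptions.

```python
# Counts from Example 3 (rows: respondent group, columns: opinion).
opinions = ["Good thing", "Bad thing", "Good and Bad", "No Opinion"]
counts = {
    "White Respondents": [160, 110, 9, 9],
    "Black Respondents": [152, 35, 12, 9],
    "Hispanic Respondents": [144, 33, 12, 10],
}

conditional = {}
row_totals = {}
for group, row in counts.items():
    total = sum(row)
    row_totals[group] = total
    # Conditional distribution of opinion given the group: cell / row total.
    conditional[group] = {op: n / total for op, n in zip(opinions, row)}

for group, dist in conditional.items():
    shares = ", ".join(f"{op}: {p:.3f}" for op, p in dist.items())
    print(f"{group} (n={row_totals[group]}): {shares}")
```

Because each row is divided by its own total, the groups can be compared directly even though they contain different numbers of respondents.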

Summary Table: Types of Distributions in Contingency Tables

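| Type                     | Definition                                                | Calculation                       | Purpose                                              |
|--------------------------|-----------------------------------------------------------|-----------------------------------|------------------------------------------------------|
| Marginal Distribution    | Distribution of totals for one variable                   | Sum across rows or columns        | Describes overall distribution of a variable         |
| Conditional Distribution | Distribution of one variable for a fixed value of another | Divide cell by row or column total | Shows how distribution changes with another variable |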

Key Points

  • Contingency tables organize data for two categorical variables.

  • Marginal distributions summarize totals for each variable.

  • Conditional distributions show the distribution of one variable given a specific value of another.

  • Relative marginal frequencies are calculated by dividing marginal totals by the grand total.

  • Conditional frequencies are calculated by dividing counts by the relevant row or column total.

Additional info: These concepts are foundational for understanding statistical association and independence, and for performing chi-square tests of independence.
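
As a rough sketch of how marginal totals feed into a chi-square test of independence (this guide does not work through the test itself), the snippet below computes expected counts as row total times column total divided by the grand total, then sums (observed - expected)^2 / expected over all cells. The counts are from Example 1; the variable names are illustrative, and no p-value is computed.

```python
# Observed counts from the cell phone / speeding violation table (Example 1).
observed = {
    "uses phone": {"violation": 25, "no violation": 280},
    "no phone": {"violation": 45, "no violation": 405},
}

row_totals = {r: sum(cells.values()) for r, cells in observed.items()}
col_totals = {}
for cells in observed.values():
    for c, n in cells.items():
        col_totals[c] = col_totals.get(c, 0) + n
grand_total = sum(row_totals.values())

# Expected count under independence: (row total * column total) / grand total.
expected = {
    r: {c: row_totals[r] * col_totals[c] / grand_total for c in col_totals}
    for r in observed
}

# Chi-square statistic: sum over cells of (observed - expected)^2 / expected.
chi_square = sum(
    (observed[r][c] - expected[r][c]) ** 2 / expected[r][c]
    for r in observed
    for c in col_totals
)
print(f"chi-square statistic: {chi_square:.3f}")
```

The expected counts keep the same marginal totals as the observed table. The statistic measures how far the observed cells sit from what independence would predict.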
