Contingency Tables, Marginal Distributions, and Conditional Distributions in Statistics
Study Guide - Smart Notes
Contingency Tables, Marginal Distributions, and Conditional Distributions
Contingency Tables
Contingency tables are a fundamental tool in statistics for summarizing the relationship between two categorical variables. They display the frequency (or count) of observations that fall into each combination of categories.
Definition: A contingency table (also called a cross-tabulation or crosstab) is a table whose cells show the joint frequency distribution of two categorical variables.
Structure: Rows typically represent one categorical variable, while columns represent another.
Purpose: Used to analyze the association between variables and to compute marginal and conditional distributions.
Example: A table showing the number of drivers who use cell phones and have speeding violations.
Marginal Distribution
Marginal distributions describe the totals for each category of a single variable, ignoring the other variable. They are obtained by summing across rows or columns of a contingency table.
Definition: The marginal distribution of a variable is the distribution of totals for that variable, found by summing across the other variable.
Calculation: Add the frequencies in each row or column to get the marginal totals.
Relative Marginal Distribution: Divide each marginal total by the overall total to get relative frequencies.
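Formula: $\text{Relative marginal frequency} = \dfrac{\text{marginal total (row or column)}}{\text{grand total}}$
Example: In a table of drivers, the relative marginal frequency for 'Uses cell phone while driving' is the total number of such drivers divided by the overall total (305/755 ≈ 0.404 using the counts in Example 1).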
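The calculation steps above can be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the function names `marginal_totals` and `relative` are made up for this sketch, and the counts are those of Example 1 in this guide (rows are cell phone use, columns are speeding violation).

```python
# Counts from Example 1 (rows: cell phone use, columns: speeding violation).
counts = [
    [25, 280],   # uses cell phone while driving
    [45, 405],   # does not use cell phone while driving
]

def marginal_totals(table):
    """Return (row_totals, column_totals, grand_total) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]  # zip(*table) walks the columns
    return row_totals, col_totals, sum(row_totals)

def relative(totals, grand_total):
    """Divide each marginal total by the grand total."""
    return [t / grand_total for t in totals]

row_totals, col_totals, grand_total = marginal_totals(counts)
row_marginal = relative(row_totals, grand_total)
col_marginal = relative(col_totals, grand_total)

print(row_totals, col_totals, grand_total)    # [305, 450] [70, 685] 755
print([round(p, 3) for p in row_marginal])    # [0.404, 0.596]
print([round(p, 3) for p in col_marginal])    # [0.093, 0.907]
```

Each relative marginal distribution sums to 1, which is a quick check that no cell was missed.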
Conditional Distribution
Conditional distributions show the distribution of one variable for a fixed value of the other variable. They are useful for understanding how the distribution of one variable changes depending on the category of another.
Definition: The conditional distribution of a variable is the distribution of that variable for a specific value of the other variable.
Calculation: Divide each cell frequency by the total for the relevant row or column (depending on which variable is conditioned upon).
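Formula: $\text{Conditional relative frequency} = \dfrac{\text{cell frequency}}{\text{total of the row or column being conditioned on}}$
Example: The proportion of drivers with speeding violations among those who use cell phones while driving (25/305 ≈ 0.082 using the counts in Example 1).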
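As a minimal sketch of this calculation, the hypothetical helper below conditions on one row of a table of counts. The function name `conditional_distribution` is an assumption made for illustration, and the counts come from Example 1 in this guide.

```python
# Counts from Example 1 (rows: cell phone use, columns: speeding violation).
counts = [
    [25, 280],   # uses cell phone while driving
    [45, 405],   # does not use cell phone while driving
]

def conditional_distribution(table, row_index):
    """Distribution of the column variable among observations in one row."""
    row = table[row_index]
    row_total = sum(row)
    if row_total == 0:
        raise ValueError("cannot condition on a row with no observations")
    return [count / row_total for count in row]

users = conditional_distribution(counts, 0)
non_users = conditional_distribution(counts, 1)

print([round(p, 3) for p in users])      # [0.082, 0.918]
print([round(p, 3) for p in non_users])  # [0.1, 0.9]
```

The denominator is the row total (305 or 450), not the grand total (755); that is the only difference from a marginal relative frequency.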
Worked Examples and Tables
Example 1: Cell Phone Use and Speeding Violations
| | Speeding Violation in Last Year | No Speeding Violation in Last Year | Frequency (Total) |
|---|---|---|---|
| Uses cell phone while driving | 25 | 280 | 305 |
| Does not use cell phone while driving | 45 | 405 | 450 |
| Frequency (Total) | 70 | 685 | 755 |
Marginal Distribution (Relative Frequency)
Uses cell phone while driving: 305/755 ≈ 0.404
Does not use cell phone while driving: 450/755 ≈ 0.596
Total: 755/755 = 1
Conditional Distribution (Speeding Violation)
Among those who use cell phones while driving: 25/305 ≈ 0.082 (Speeding Violation)
Among those who do not use cell phones while driving: 45/450 = 0.100 (Speeding Violation)
Additional info: Conditional distributions can be calculated for either rows or columns, depending on the context.
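To make the row-versus-column point concrete, the sketch below conditions the Example 1 counts both ways. It uses `fractions.Fraction` from the Python standard library so the proportions stay exact; the helper names `condition_on_rows` and `condition_on_columns` are illustrative, not from the source.

```python
from fractions import Fraction

# Counts from Example 1 (rows: cell phone use, columns: speeding violation).
counts = [
    [25, 280],   # uses cell phone while driving
    [45, 405],   # does not use cell phone while driving
]

def condition_on_rows(table):
    """Divide each cell by its row total."""
    return [[Fraction(c, sum(row)) for c in row] for row in table]

def condition_on_columns(table):
    """Divide each cell by its column total."""
    col_totals = [sum(col) for col in zip(*table)]
    return [[Fraction(c, t) for c, t in zip(row, col_totals)] for row in table]

by_row = condition_on_rows(counts)
by_col = condition_on_columns(counts)

# Same cell (25), two different questions:
print(by_row[0][0], round(float(by_row[0][0]), 3))  # 5/61 0.082  violation, given phone use
print(by_col[0][0], round(float(by_col[0][0]), 3))  # 5/14 0.357  phone use, given violation
```

The two results answer different questions, so it helps to state which variable is being conditioned on before dividing.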
Example 2: Handedness and Gender
| | Right-handed | Left-handed | Frequency (Total) |
|---|---|---|---|
| Males | 66 | 15 | 81 |
| Females | 45 | 9 | 54 |
| Frequency (Total) | 111 | 24 | 135 |
Marginal Distribution (Relative Frequency)
Right-handed: 111/135 ≈ 0.822
Left-handed: 24/135 ≈ 0.178
Total: 135/135 = 1
Conditional Distribution (Gender)
Among males: 66/81 ≈ 0.815 (Right-handed), 15/81 ≈ 0.185 (Left-handed)
Among females: 45/54 ≈ 0.833 (Right-handed), 9/54 ≈ 0.167 (Left-handed)
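When the categories have names, a nested dictionary keeps the labels attached to the proportions. The sketch below applies that idea to the Example 2 counts; the variable and function names (`handedness`, `conditional_by_group`) are assumptions chosen for illustration.

```python
# Counts from Example 2, keyed by gender and then by handedness.
handedness = {
    "Males": {"Right-handed": 66, "Left-handed": 15},
    "Females": {"Right-handed": 45, "Left-handed": 9},
}

def conditional_by_group(table):
    """For each group (row), divide every count by that group's total."""
    result = {}
    for group, row in table.items():
        total = sum(row.values())
        result[group] = {category: count / total for category, count in row.items()}
    return result

conditional = conditional_by_group(handedness)
for group, dist in conditional.items():
    print(group, {category: round(p, 3) for category, p in dist.items()})
# Males {'Right-handed': 0.815, 'Left-handed': 0.185}
# Females {'Right-handed': 0.833, 'Left-handed': 0.167}
```

Placing the two conditional distributions side by side is how they are usually compared: here they are close (about 81% versus 83% right-handed), so in this sample handedness varies little with gender.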
Example 3: Opinions by Ethnicity
| | Good thing | Bad thing | Good and Bad | No Opinion | Relative Frequency |
|---|---|---|---|---|---|
| White Respondents | 160 | 110 | 9 | 9 | |
| Black Respondents | 152 | 35 | 12 | 9 | |
| Hispanic Respondents | 144 | 33 | 12 | 10 | |
| Total | 456 | 178 | 33 | 28 | |
Marginal Distribution (Relative Frequency)
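Good thing: 456/695 ≈ 0.656
Bad thing: 178/695 ≈ 0.256
Good and Bad: 33/695 ≈ 0.047
No Opinion: 28/695 ≈ 0.040
Total: 695/695 = 1 (the rounded values above sum to 0.999 because of rounding)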
Conditional Distribution (by Ethnicity)
For each ethnic group, divide each cell by the row total to get the conditional distribution.
Example for White Respondents (row total 160 + 110 + 9 + 9 = 288): 160/288 ≈ 0.556 (Good thing), 110/288 ≈ 0.382 (Bad thing), etc.
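With three groups and four response categories, doing every division by hand becomes tedious. The sketch below repeats the row-total division for each group in Example 3. The variable names are illustrative, and the row totals are computed from the cell counts because the table does not list them.

```python
# Counts from Example 3; each list follows the order of `opinions`.
opinions = ["Good thing", "Bad thing", "Good and Bad", "No Opinion"]
counts = {
    "White Respondents": [160, 110, 9, 9],
    "Black Respondents": [152, 35, 12, 9],
    "Hispanic Respondents": [144, 33, 12, 10],
}

# Row totals are the denominators for the conditional distributions.
row_totals = {group: sum(row) for group, row in counts.items()}

# Conditional distribution of opinion within each group.
conditional = {
    group: [c / row_totals[group] for c in row]
    for group, row in counts.items()
}

for group, dist in conditional.items():
    cells = ", ".join(f"{name}: {p:.3f}" for name, p in zip(opinions, dist))
    print(f"{group} (n = {row_totals[group]}): {cells}")
```

Each group's proportions sum to 1, so the rows can be compared directly even though the groups differ in size (288, 208, and 199 respondents).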
Summary Table: Types of Distributions in Contingency Tables
| Type | Definition | Calculation | Purpose |
|---|---|---|---|
| Marginal Distribution | Distribution of totals for one variable | Sum across rows or columns | Describes overall distribution of a variable |
| Conditional Distribution | Distribution of one variable for a fixed value of another | Divide cell by row or column total | Shows how distribution changes with another variable |
Key Points
Contingency tables organize data for two categorical variables.
Marginal distributions summarize totals for each variable.
Conditional distributions show the distribution of one variable given a specific value of another.
Marginal relative frequencies are calculated by dividing marginal totals by the grand total.
Conditional relative frequencies are calculated by dividing cell counts by the relevant row or column total.
Additional info: These concepts are foundational for understanding statistical association, independence, and for performing chi-square tests of independence.