Association Between Two Categorical Variables: Contingency Tables and Proportions

Chapter 3: Association – Contingency, Correlation, and Regression

Section 3.1: Exploring the Association Between Two Categorical Variables

This section introduces foundational concepts for analyzing the association between two categorical variables in statistics. It covers variable types, the definition of association, contingency tables, and the calculation of proportions and conditional proportions.

Learning Objectives

  • Identify variable type: Response or Explanatory

  • Define Association

  • Understand and construct Contingency Tables

  • Calculate Proportions and Conditional Proportions

Response and Explanatory Variables

In statistical analysis, it is crucial to distinguish between the response and explanatory variables when comparing groups or outcomes.

  • Response variable (Dependent Variable): The outcome variable on which comparisons are made.

  • Explanatory variable (Independent Variable): The variable that defines the groups to be compared with respect to values on the response variable.

  • Examples:

    • Blood alcohol level (response) / Number of beers consumed (explanatory)

    • Grade on test (response) / Amount of study time (explanatory)

    • Yield of corn in bushels per acre (response) / Amount of rainfall (explanatory)

Association

The main purpose of data analysis with two variables is to investigate whether there is an association and to describe that association.

  • Association: An association exists between two variables if a particular value for one variable is more likely to occur with certain values of the other variable.

  • Detecting association helps in understanding relationships and dependencies between variables.

Contingency Tables

A contingency table is a tabular display that shows how many observations fall into each combination of categories of two categorical variables. It is a fundamental tool for summarizing the relationship between those variables.

  • Displays two categorical variables.

  • The rows list the categories of one variable.

  • The columns list the categories of the other variable.

  • Entries in the table are frequencies (counts).

Example Table: Frequencies for Food Type and Pesticide Status

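| Food Type    | Pesticide Present | Pesticide Not Present | Total  |
|--------------|-------------------|-----------------------|--------|
| Organic      | 29                | 98                    | 127    |
| Conventional | 19,485            | 7,086                 | 26,571 |
| Total        | 19,514            | 7,184                 | 26,698 |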

Key Questions:

  • What is the response variable? Pesticide Status

  • What is the explanatory variable? Food Type
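
As a minimal sketch, the example table can be stored as a nested dictionary and its row, column, and grand totals recomputed from the four cell counts. The variable names (`counts`, `row_totals`, `col_totals`, `grand_total`) are illustrative choices, not part of the source material.

```python
# Cell counts from the example table.
# Rows: food type (explanatory). Columns: pesticide status (response).
counts = {
    "Organic": {"Present": 29, "Not Present": 98},
    "Conventional": {"Present": 19485, "Not Present": 7086},
}

# Row totals: number of sampled items of each food type.
row_totals = {food: sum(cells.values()) for food, cells in counts.items()}

# Column totals: number of sampled items with each pesticide status.
col_totals = {}
for cells in counts.values():
    for status, n in cells.items():
        col_totals[status] = col_totals.get(status, 0) + n

# Grand total: every sampled item, regardless of category.
grand_total = sum(row_totals.values())

print(row_totals)   # {'Organic': 127, 'Conventional': 26571}
print(col_totals)   # {'Present': 19514, 'Not Present': 7184}
print(grand_total)  # 26698
```

The recomputed totals match the Total row and column of the table, which is a quick way to check that the cell counts were entered correctly.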

Calculating Proportions and Conditional Proportions

Proportions and conditional proportions are used to summarize the data in contingency tables and to compare groups.

  • Proportion: The fraction of items in a category out of the total number of items.

  • Conditional Proportion: The proportion of items in a category, given a specific value of another variable.

Example Calculations:

  • Proportion of organic foods containing pesticides: 29 / 127 ≈ 0.228

  • Proportion of conventional foods containing pesticides: 19,485 / 26,571 ≈ 0.733

  • Proportion of all sampled items containing pesticide residues: 19,514 / 26,698 ≈ 0.731

Conditional proportions allow for comparison between groups, such as organic vs. conventional foods, with respect to pesticide presence.
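
The three proportions listed above can be computed directly from the table counts. The sketch below assumes the counts are stored as a nested dictionary; the helper name `conditional_proportions` is hypothetical and chosen only for this illustration.

```python
def conditional_proportions(counts):
    """For each row (group), divide every cell count by that row's total."""
    result = {}
    for group, cells in counts.items():
        total = sum(cells.values())
        result[group] = {category: n / total for category, n in cells.items()}
    return result


# Cell counts from the food type / pesticide status table.
counts = {
    "Organic": {"Present": 29, "Not Present": 98},
    "Conventional": {"Present": 19485, "Not Present": 7086},
}

# Conditional proportions: pesticide status given food type.
cond = conditional_proportions(counts)

# Overall (marginal) proportion of items with pesticide present.
all_present = sum(cells["Present"] for cells in counts.values())
all_items = sum(sum(cells.values()) for cells in counts.values())
overall_present = all_present / all_items

for food, props in cond.items():
    print(f"{food}: {props['Present']:.3f}")
print(f"All items: {overall_present:.3f}")
# Organic: 0.228
# Conventional: 0.733
# All items: 0.731
```

Each conditional proportion uses its own row total as the denominator, while the overall proportion uses the grand total. That difference in denominators is what separates a conditional proportion from a plain proportion.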

Visualizing Conditional Proportions

Side-by-side bar charts are commonly used to display conditional proportions, so that the categories of the explanatory variable can be compared easily with respect to the response variable.

  • If there is no association between the variables, the proportions for the response variable categories will be the same for each food type.
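
The "no association" condition can be illustrated with a small check that compares the conditional proportions across groups. This is only a sketch: the function names and the `hypothetical` table are invented for illustration, and the exact-equality test (within a floating-point tolerance) describes the idealized case, since proportions from sampled data rarely match exactly.

```python
import math


def row_proportions(cells):
    """Divide each cell count in one row by the row total."""
    total = sum(cells.values())
    return {category: n / total for category, n in cells.items()}


def proportions_match(counts, tol=1e-9):
    """Return True if every row has the same conditional proportions."""
    rows = [row_proportions(cells) for cells in counts.values()]
    first = rows[0]
    return all(
        math.isclose(row[category], first[category], abs_tol=tol)
        for row in rows[1:]
        for category in first
    )


# Hypothetical table (not from the source): both groups are 20% "Yes".
hypothetical = {
    "Group A": {"Yes": 20, "No": 80},
    "Group B": {"Yes": 50, "No": 200},
}

# The food type / pesticide status table from the example.
food = {
    "Organic": {"Present": 29, "Not Present": 98},
    "Conventional": {"Present": 19485, "Not Present": 7086},
}

print(proportions_match(hypothetical))  # True: no association
print(proportions_match(food))          # False: proportions differ by food type
```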

Summary

  • The response variable is the outcome being compared; the explanatory variable defines the groups across which it is compared.

  • For two categorical variables, association is summarized using a contingency table and proportions.

  • Comparing conditional proportions helps to identify and describe associations between categorical variables.
