Chi-Square Tests and Nonparametric Statistics: A Study Guide

Nonparametric Statistics

Introduction to Nonparametric Statistics

Nonparametric statistics are a special class of hypothesis tests used when the assumptions required for parametric tests are not met. These tests are particularly useful when the data are nominal or ordinal, or when the sample size is small and the underlying population distribution is not normal.

Nonparametric tests do not require the dependent variable (DV) to be measured on a scale (interval or ratio).
They help distinguish between patterns and chance in observational data without a scale DV.

Additional info: Parametric tests typically assume normality, homogeneity of variance, and interval/ratio data.

When to Use Nonparametric Tests

When the DV is nominal (categorical, e.g., gender, color).
When the DV is ordinal (ranked, e.g., 1st, 2nd, 3rd).
When the sample size is small and the population is not normal.

Limitations of Nonparametric Tests

Confidence intervals and effect-size measures are often unavailable for nominal or ordinal data, although some tests do have one (for example, Cramér's V for the chi-square test for independence).
Nonparametric tests tend to have less statistical power than parametric tests, making Type II errors (failing to detect a true effect) more likely.

Additional info: Type I error is rejecting a true null hypothesis; Type II error is failing to reject a false null hypothesis.

Chi-Square Tests

Overview of Chi-Square Tests

Chi-square tests are the most common nonparametric tests for categorical data. They are used to analyze frequencies and proportions in nominal variables.

Chi-square test for goodness of fit: Used with one nominal variable to compare observed frequencies to expected frequencies.
Chi-square test for independence: Used with two nominal variables to determine if there is an association between them.

Chi-Square Test for Goodness of Fit

Steps of Hypothesis Testing

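Identify populations, distribution, and assumptions: There are always two populations, one matching the expected frequencies and one reflected in the observed frequencies. The comparison distribution is the chi-square distribution. Assumptions: the variable is nominal, observations are independent, participants are randomly selected, and each cell meets a minimum expected frequency (a common rule of thumb is at least 5).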
State the hypotheses: Null hypothesis (H0): Observed frequencies match expected frequencies. Alternative hypothesis (HA): Observed frequencies differ from expected frequencies.
Determine characteristics of the comparison distribution: The comparison distribution is the chi-square distribution. Degrees of freedom (df) are calculated as:

$df_{\chi^2} = k - 1$, where $k$ is the number of categories.

Determine critical values: Use the chi-square table to find the critical value for the chosen alpha level and degrees of freedom.
Calculate the test statistic: The chi-square statistic is calculated as:

$\chi^2 = \sum \frac{(O - E)^2}{E}$, summed over all categories, where:

O = observed frequency
E = expected frequency

Make a decision: Compare the calculated chi-square value to the critical value. If it exceeds the critical value, reject the null hypothesis.

Example Table: Chi-Square Calculations

Category	Observed (O)	Expected (E)	O-E	(O-E)^2	(O-E)^2/E
First 3 months	52	28	24	576	20.57
Last 3 months	4	28	-24	576	20.57
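
The goodness-of-fit steps can be sketched in a few lines of Python using the counts from the example table. This is a minimal illustration, not a definitive implementation: the function name `chi_square_gof` is hypothetical, and the alpha level of 0.05 (with its table critical value of 3.841 for df = 1) is an assumption, since the notes do not state which alpha was used.

```python
# Observed and expected counts copied from the example table.
observed = {"First 3 months": 52, "Last 3 months": 4}
expected = {"First 3 months": 28, "Last 3 months": 28}


def chi_square_gof(observed, expected):
    """Return (chi-square statistic, degrees of freedom) for one nominal variable."""
    chi_sq = sum((observed[c] - expected[c]) ** 2 / expected[c] for c in observed)
    df = len(observed) - 1  # number of categories minus 1
    return chi_sq, df


chi_sq, df = chi_square_gof(observed, expected)

# Assumed alpha = 0.05; 3.841 is the chi-square table value for df = 1.
CRITICAL_VALUE = 3.841
reject_null = chi_sq > CRITICAL_VALUE

print(f"chi-square = {chi_sq:.2f}, df = {df}, reject H0: {reject_null}")
```

The statistic is the sum of the last column of the table (20.57 + 20.57, about 41.14), which is far above the assumed critical value, so the null hypothesis would be rejected.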

Chi-Square Test for Independence

Steps of Hypothesis Testing

Identify populations, distribution, and assumptions: Use chi-square distribution; test is for independence between two nominal variables.
State the hypotheses: Null hypothesis (H0): The two variables are independent. Alternative hypothesis (HA): The two variables are associated.
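Determine characteristics of the comparison distribution: Degrees of freedom are calculated as:

$df_{\chi^2} = (k_{row} - 1)(k_{column} - 1)$, where $k_{row}$ and $k_{column}$ are the numbers of rows and columns in the table of observed frequencies.
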
Determine critical values: Use the chi-square table for the appropriate alpha level and degrees of freedom.
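Calculate the test statistic: Calculate the expected frequency for each cell as $E = \frac{(\text{row total})(\text{column total})}{N}$, where $N$ is the total number of observations, then use:

$\chi^2 = \sum \frac{(O - E)^2}{E}$, summed over all cells.
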
Make a decision: Compare the calculated value to the critical value. If it exceeds the critical value, reject the null hypothesis; otherwise, fail to reject it.

Example Table: Observed and Expected Frequencies

	Observed: Recycling	Observed: Trash	Expected: Recycling	Expected: Trash
Correctly spelled name	25	28	17.331	35.669
Incorrectly spelled name	13	40	17.331	35.669
No name	14	39	17.331	35.669
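
As a rough sketch, the expected frequencies in the table can be reproduced from the observed counts alone, using expected = (row total × column total) / grand total. The function name `expected_frequencies` and the dictionary layout are illustrative choices, not part of the original notes.

```python
# Observed counts from the table: rows are name conditions, columns are bin choices.
observed = {
    "Correctly spelled name": {"Recycling": 25, "Trash": 28},
    "Incorrectly spelled name": {"Recycling": 13, "Trash": 40},
    "No name": {"Recycling": 14, "Trash": 39},
}


def expected_frequencies(table):
    """Expected count per cell = (row total * column total) / grand total."""
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    columns = list(next(iter(table.values())))
    col_totals = {col: sum(table[row][col] for row in table) for col in columns}
    n = sum(row_totals.values())
    return {
        row: {col: row_totals[row] * col_totals[col] / n for col in columns}
        for row in table
    }


expected = expected_frequencies(observed)
for row, cells in expected.items():
    print(row, {col: round(value, 3) for col, value in cells.items()})
```

Computed without intermediate rounding, the values come out near 17.333 and 35.667. The table's 17.331 and 35.669 differ slightly, most likely because the column proportions were rounded before multiplying.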

Chi-Square Calculations Table

Category	Observed (O)	Expected (E)	O-E	(O-E)^2	(O-E)^2/E
Correctly spelled name; chose recycling	25	17.331	7.669	58.83	3.395
Correctly spelled name; chose trash	28	35.669	-7.669	58.83	1.649
Incorrectly spelled name; chose recycling	13	17.331	-4.331	18.76	1.083
Incorrectly spelled name; chose trash	40	35.669	4.331	18.76	0.526
No name; chose recycling	14	17.331	-3.331	11.09	0.640
No name; chose trash	39	35.669	3.331	11.09	0.311
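
The test statistic is the sum of the last column of this table. The sketch below recomputes it from the observed and expected values and compares it to a critical value. The alpha level of 0.05 (table critical value 5.991 for df = 2) is an assumption, since the notes do not state the alpha used.

```python
# (observed, expected) pairs copied from the calculations table.
cells = [
    (25, 17.331), (28, 35.669),  # correctly spelled name
    (13, 17.331), (40, 35.669),  # incorrectly spelled name
    (14, 17.331), (39, 35.669),  # no name
]

# Chi-square statistic: sum of (O - E)^2 / E over every cell.
chi_sq = sum((o - e) ** 2 / e for o, e in cells)

# Degrees of freedom: (rows - 1) * (columns - 1) for a 3 x 2 table.
n_rows, n_cols = 3, 2
df = (n_rows - 1) * (n_cols - 1)

# Assumed alpha = 0.05; 5.991 is the chi-square table value for df = 2.
CRITICAL_VALUE = 5.991
reject_null = chi_sq > CRITICAL_VALUE

print(f"chi-square = {chi_sq:.2f}, df = {df}, reject H0: {reject_null}")
```

Under these assumptions the statistic is about 7.60, which exceeds 5.991, so the null hypothesis of independence would be rejected.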

Effect Size: Cramér's V (Phi)

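Cramér's V (also called Cramér's phi) measures the effect size for a chi-square test for independence: $V = \sqrt{\frac{\chi^2}{(N)(df_{row/column})}}$, where $N$ is the total sample size and $df_{row/column}$ is the smaller of the row and column degrees of freedom.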

Effect Size Interpretation Table

Effect Size	When smaller df (row or column) = 1	When smaller df = 2	When smaller df = 3
Small	0.10	0.07	0.06
Medium	0.30	0.21	0.17
Large	0.50	0.35	0.29
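
To connect the interpretation table to the recycling example, here is a minimal sketch of Cramér's V using the standard formula V = sqrt(chi-square / (N × smaller df)). The function name `cramers_v` is hypothetical; the chi-square value is the sum of the last column of the calculations table, and N is the sum of the observed counts.

```python
import math


def cramers_v(chi_sq, n, n_rows, n_cols):
    """Cramér's V: sqrt(chi-square / (N * smaller of row df and column df))."""
    df_smaller = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi_sq / (n * df_smaller))


# Sum of the (O-E)^2/E column from the calculations table.
chi_sq = 3.395 + 1.649 + 1.083 + 0.526 + 0.640 + 0.311

# Total number of observations across all six cells.
n = 25 + 28 + 13 + 40 + 14 + 39

v = cramers_v(chi_sq, n, n_rows=3, n_cols=2)
print(f"Cramér's V = {v:.2f}")
```

For this 3 x 2 table the smaller df is 1, so the result (roughly 0.22) is read against the first column of the interpretation table, where it falls between the small and medium benchmarks.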

Conditional Proportions and Graphing

Conditional Proportions

Conditional proportions show the proportion of cases with a given outcome within a specific condition, computed by dividing each cell frequency by its row total. They are useful for interpreting chi-square results.

	Conditional Proportions: Recycling	Conditional Proportions: Trash
Correctly spelled name	0.472	0.528
Incorrectly spelled name	0.245	0.755
No name	0.264	0.736
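
The proportions in this table can be recovered by dividing each observed count by its row total. The following sketch does this for the recycling data; the variable names are illustrative.

```python
# Observed counts: rows are name conditions, columns are bin choices.
observed = {
    "Correctly spelled name": {"Recycling": 25, "Trash": 28},
    "Incorrectly spelled name": {"Recycling": 13, "Trash": 40},
    "No name": {"Recycling": 14, "Trash": 39},
}

# Conditional proportion = cell count / total for that row (condition).
conditional = {
    row: {col: count / sum(cells.values()) for col, count in cells.items()}
    for row, cells in observed.items()
}

for row, cells in conditional.items():
    print(row, {col: round(p, 3) for col, p in cells.items()})
```

Each row's proportions sum to 1 because they are conditioned on that row's total.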

Graphing Chi-Square Results

Bar graphs are commonly used to display the proportions or frequencies for each group or condition.
Conditional probabilities can be visualized to compare groups directly.
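
As a rough illustration of such a graph, the sketch below draws a plain-text bar chart of the conditional proportions for choosing recycling. A plotting library would normally be used instead; the text rendering and the helper name `text_bar` are assumptions made to keep the example self-contained.

```python
# Conditional proportions of choosing recycling, from the table above.
proportions = {
    "Correctly spelled name": 0.472,
    "Incorrectly spelled name": 0.245,
    "No name": 0.264,
}


def text_bar(proportion, width=40):
    """Return a bar of '#' characters whose length scales with the proportion."""
    return "#" * round(proportion * width)


bars = {label: text_bar(p) for label, p in proportions.items()}
for label, bar in bars.items():
    print(f"{label:<26}{bar} {proportions[label]:.3f}")
```

Placing the bars side by side makes it easy to see that the correctly spelled name condition has the highest proportion of recycling.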

Relative Risk

Definition and Application

Relative risk (or relative likelihood) quantifies the size of an effect in chi-square analysis as the ratio of two conditional proportions.
For example, if one group is three times as likely to show an outcome, the relative risk is 3; if one group is one-third as likely, the relative risk is 1/3.
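
Applied to the recycling data from earlier in these notes, a relative risk can be sketched as follows. The choice of which two conditions to compare is an illustrative assumption, and `relative_risk` is a hypothetical helper name.

```python
def relative_risk(count_a, total_a, count_b, total_b):
    """Ratio of two conditional proportions: (count_a / total_a) / (count_b / total_b)."""
    return (count_a / total_a) / (count_b / total_b)


# Recycling counts and row totals: correctly spelled name vs. incorrectly spelled name.
rr = relative_risk(25, 53, 13, 53)
print(f"relative risk = {rr:.2f}")
```

Under this comparison the ratio is a little under 2, meaning participants in the correctly spelled name condition were nearly twice as likely to choose recycling as those in the incorrectly spelled name condition.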

Additional info: Relative risk is especially useful in epidemiology and behavioral sciences to communicate the practical significance of findings.