Chi-Square Tests and Nonparametric Statistics: A Study Guide
Nonparametric Statistics
Introduction to Nonparametric Statistics
Nonparametric statistics are a special class of hypothesis tests used when the assumptions required for parametric tests are not met. These tests are particularly useful when the data are nominal or ordinal, or when the sample size is small and the underlying population distribution is not normal.
Nonparametric tests do not require the dependent variable (DV) to be measured on a scale (interval or ratio).
They help distinguish between patterns and chance in observational data without a scale DV.
Additional info: Parametric tests typically assume normality, homogeneity of variance, and interval/ratio data.
When to Use Nonparametric Tests
When the DV is nominal (categorical, e.g., gender, color).
When the DV is ordinal (ranked, e.g., 1st, 2nd, 3rd).
When the sample size is small and the population is not normal.
Limitations of Nonparametric Tests
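Confidence intervals and effect-size measures are often unavailable for tests on nominal or ordinal data, although some tests do have one (for example, Cramér's V for the chi-square test for independence).
Nonparametric tests generally have less statistical power than parametric tests when the parametric assumptions are met, making Type II errors (failing to detect a true effect) more likely.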
Additional info: Type I error is rejecting a true null hypothesis; Type II error is failing to reject a false null hypothesis.
Chi-Square Tests
Overview of Chi-Square Tests
Chi-square tests are the most common nonparametric tests for categorical data. They are used to analyze frequencies and proportions in nominal variables.
Chi-square test for goodness of fit: Used with one nominal variable to compare observed frequencies to expected frequencies.
Chi-square test for independence: Used with two nominal variables to determine if there is an association between them.
Chi-Square Test for Goodness of Fit
Steps of Hypothesis Testing
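Identify populations, distribution, and assumptions: There are always two populations, one whose frequencies match those observed and one whose frequencies match those expected under the null hypothesis. The comparison distribution is the chi-square distribution. Assumptions: the variable is nominal, observations are independent, participants are randomly selected, and every cell meets a minimum expected frequency (a common guideline is at least 5).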
State the hypotheses: Null hypothesis (H0): Observed frequencies match expected frequencies. Alternative hypothesis (HA): Observed frequencies differ from expected frequencies.
Determine characteristics of the comparison distribution: The comparison distribution is the chi-square distribution. Degrees of freedom (df) are calculated as $df_{\chi^2} = k - 1$, where $k$ is the number of categories.
Determine critical values: Use the chi-square table to find the critical value for the chosen alpha level and degrees of freedom.
Calculate the test statistic: The chi-square statistic is calculated as $\chi^2 = \sum \frac{(O - E)^2}{E}$, summed over all categories, where:
O = observed frequency
E = expected frequency
Make a decision: Compare the calculated chi-square value to the critical value. If it exceeds the critical value, reject the null hypothesis.
Example Table: Chi-Square Calculations
Category | Observed (O) | Expected (E) | O-E | (O-E)² | (O-E)²/E |
|---|---|---|---|---|---|
First 3 months | 52 | 28 | 24 | 576 | 20.57 |
Last 3 months | 4 | 28 | -24 | 576 | 20.57 |
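As a rough check on the table above, the goodness-of-fit steps can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function name `chi_square_statistic` is hypothetical, and the alpha level of 0.05 (with its table cutoff of 3.841 for df = 1) is an assumption, since the notes do not state which alpha was used.

```python
# Observed and expected counts copied from the example table above.
observed = [52, 4]    # first 3 months, last 3 months
expected = [28, 28]   # expected frequencies as given in the table


def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


chi_square = chi_square_statistic(observed, expected)
df = len(observed) - 1   # number of categories minus 1

# Assumption: alpha = 0.05. The cutoff 3.841 is the standard
# chi-square table value for df = 1 at that alpha level.
critical_value = 3.841
reject_null = chi_square > critical_value

print(f"chi-square = {chi_square:.2f}, df = {df}, reject H0: {reject_null}")
# prints: chi-square = 41.14, df = 1, reject H0: True
```

The statistic is the sum of the last column of the table (20.57 + 20.57, about 41.14), which is far beyond the cutoff, so under these assumptions the null hypothesis would be rejected.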
Chi-Square Test for Independence
Steps of Hypothesis Testing
Identify populations, distribution, and assumptions: Use chi-square distribution; test is for independence between two nominal variables.
State the hypotheses: Null hypothesis (H0): The two variables are independent. Alternative hypothesis (HA): The two variables are associated.
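Determine characteristics of the comparison distribution: Degrees of freedom are calculated as $df_{\chi^2} = (k_{row} - 1)(k_{column} - 1)$, where $k_{row}$ and $k_{column}$ are the numbers of rows and columns in the contingency table.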
Determine critical values: Use the chi-square table for the appropriate alpha level and degrees of freedom.
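Calculate the test statistic: Calculate the expected frequency for each cell as $E = \frac{(\text{row total})(\text{column total})}{N}$, where $N$ is the total number of observations, then use $\chi^2 = \sum \frac{(O - E)^2}{E}$, summed over all cells.
Make a decision: Compare the calculated value to the critical value. If it exceeds the critical value, reject the null hypothesis; otherwise, fail to reject it.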
Example Table: Observed and Expected Frequencies
Condition | Observed: Recycling | Observed: Trash | Expected: Recycling | Expected: Trash |
|---|---|---|---|---|
Correctly spelled name | 25 | 28 | 17.331 | 35.669 |
Incorrectly spelled name | 13 | 40 | 17.331 | 35.669 |
No name | 14 | 39 | 17.331 | 35.669 |
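The expected frequencies in the table can be derived from the observed counts alone. The sketch below assumes the usual rule that a cell's expected count is its row total times its column total, divided by the grand total; the variable names are illustrative only. Computed without intermediate rounding, the expected counts come out as 17.333 and 35.667. The table's 17.331 and 35.669 are consistent with rounding the column proportions to 0.327 and 0.673 before multiplying by the row total of 53.

```python
# Observed counts from the table above: rows are name conditions,
# columns are (recycling, trash).
observed = {
    "Correctly spelled name": [25, 28],
    "Incorrectly spelled name": [13, 40],
    "No name": [14, 39],
}

row_totals = {label: sum(counts) for label, counts in observed.items()}
column_totals = [sum(column) for column in zip(*observed.values())]
grand_total = sum(column_totals)

# Expected count for a cell = (row total * column total) / grand total.
expected = {
    label: [row_totals[label] * col_total / grand_total for col_total in column_totals]
    for label in observed
}

for label, counts in expected.items():
    print(label, [round(value, 3) for value in counts])
```

Because every row has the same total (53), all three rows share the same pair of expected counts.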
Chi-Square Calculations Table
Category | Observed (O) | Expected (E) | O-E | (O-E)² | (O-E)²/E |
|---|---|---|---|---|---|
Correctly spelled name; chose recycling | 25 | 17.331 | 7.669 | 58.83 | 3.395 |
Correctly spelled name; chose trash | 28 | 35.669 | -7.669 | 58.83 | 1.649 |
Incorrectly spelled name; chose recycling | 13 | 17.331 | -4.331 | 18.76 | 1.083 |
Incorrectly spelled name; chose trash | 40 | 35.669 | 4.331 | 18.76 | 0.526 |
No name; chose recycling | 14 | 17.331 | -3.331 | 11.09 | 0.640 |
No name; chose trash | 39 | 35.669 | 3.331 | 11.09 | 0.311 |
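Summing the last column of the calculations table gives the test statistic. The sketch below redoes that sum from the observed and expected values printed in the table, then applies the decision rule. The alpha level of 0.05 and its table cutoff of 5.991 for df = 2 are assumptions, since the notes do not state the alpha level.

```python
# (observed, expected) pairs copied from the calculations table above.
cells = [
    (25, 17.331), (28, 35.669),   # correctly spelled name
    (13, 17.331), (40, 35.669),   # incorrectly spelled name
    (14, 17.331), (39, 35.669),   # no name
]

chi_square = sum((o - e) ** 2 / e for o, e in cells)

n_rows, n_columns = 3, 2
df = (n_rows - 1) * (n_columns - 1)

# Assumption: alpha = 0.05. The cutoff 5.991 is the standard
# chi-square table value for df = 2 at that alpha level.
critical_value = 5.991
reject_null = chi_square > critical_value

print(f"chi-square = {chi_square:.2f}, df = {df}, reject H0: {reject_null}")
# prints: chi-square = 7.60, df = 2, reject H0: True
```

Under these assumptions the statistic of about 7.60 exceeds the cutoff, which would lead to rejecting the null hypothesis that name condition and recycling choice are independent.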
Effect Size: Cramér's V (Phi)
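Cramér's V measures the effect size for a chi-square test for independence: $V = \sqrt{\frac{\chi^2}{N \cdot df_{row/column}}}$, where $N$ is the total sample size and $df_{row/column}$ is the smaller of the row and column degrees of freedom. The df values in the table below refer to this smaller df.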
Effect Size Interpretation Table
Effect Size | When df = 1 | When df = 2 | When df = 3 |
|---|---|---|---|
Small | 0.10 | 0.07 | 0.06 |
Medium | 0.30 | 0.21 | 0.17 |
Large | 0.50 | 0.35 | 0.29 |
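To connect the conventions above to the recycling example, here is a small sketch using the common definition of Cramér's V, the square root of chi-square divided by N times the smaller of the row and column degrees of freedom. The helper name `cramers_v` is hypothetical. The inputs are taken from the earlier tables: chi-square = 7.604 is the sum of the last column of the calculations table, and N = 159 is the total of the observed counts.

```python
import math


def cramers_v(chi_square, n, n_rows, n_columns):
    """Cramér's V = sqrt(chi-square / (N * smaller of row df and column df))."""
    df_smaller = min(n_rows - 1, n_columns - 1)
    return math.sqrt(chi_square / (n * df_smaller))


# Inputs from the recycling example: a 3 x 2 table with 159 observations.
v = cramers_v(chi_square=7.604, n=159, n_rows=3, n_columns=2)
print(f"Cramér's V = {v:.2f}")
# prints: Cramér's V = 0.22
```

For a 3 x 2 table the smaller df is 1, so the "df = 1" column applies; a value of about 0.22 falls between the small (0.10) and medium (0.30) conventions.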
Conditional Proportions and Graphing
Conditional Proportions
Conditional proportions show the probability of an outcome given a specific condition. These are useful for interpreting chi-square results.
Condition | Conditional Proportion: Recycling | Conditional Proportion: Trash |
|---|---|---|
Correctly spelled name | 0.472 | 0.528 |
Incorrectly spelled name | 0.245 | 0.755 |
No name | 0.264 | 0.736 |
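The conditional proportions above are each cell's observed count divided by its row total. A minimal sketch, using the observed counts from the recycling example (the dictionary layout is just one convenient choice):

```python
# Observed counts from the recycling example, keyed by name condition.
observed = {
    "Correctly spelled name": {"Recycling": 25, "Trash": 28},
    "Incorrectly spelled name": {"Recycling": 13, "Trash": 40},
    "No name": {"Recycling": 14, "Trash": 39},
}

# Divide each count by its row total to condition on the row.
conditional = {}
for condition, counts in observed.items():
    row_total = sum(counts.values())
    conditional[condition] = {
        outcome: count / row_total for outcome, count in counts.items()
    }

for condition, proportions in conditional.items():
    formatted = ", ".join(f"{k}: {p:.3f}" for k, p in proportions.items())
    print(f"{condition}: {formatted}")
```

Each row's proportions sum to 1, which makes rows directly comparable even when their totals differ.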
Graphing Chi-Square Results
Bar graphs are commonly used to display the proportions or frequencies for each group or condition.
Conditional proportions can be graphed side by side to compare groups directly.
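As a quick illustration of the idea, the sketch below draws a plain-text bar graph of the recycling proportions from the conditional proportions table. It is only a stand-in for a real plotting tool; the `text_bar` helper and the 40-character bar width are arbitrary choices made for this example.

```python
# Proportion choosing recycling in each condition (from the table above).
recycling_proportion = {
    "Correctly spelled name": 0.472,
    "Incorrectly spelled name": 0.245,
    "No name": 0.264,
}

BAR_WIDTH = 40  # hypothetical display width, in characters, for a proportion of 1.0


def text_bar(proportion, width=BAR_WIDTH):
    """Return a run of '#' characters whose length is proportional to the value."""
    return "#" * round(proportion * width)


for condition, p in recycling_proportion.items():
    print(f"{condition:<26} {text_bar(p)} {p:.3f}")
```

Even in this crude form, the bar for the correctly spelled condition is visibly about twice as long as the other two.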
Relative Risk
Definition and Application
Relative risk (or relative likelihood) quantifies the size of an effect in chi-square analysis as the ratio of two conditional proportions.
For example, if one group is three times as likely to show an outcome, the relative risk is 3; if one group is one-third as likely, the relative risk is 1/3.
Additional info: Relative risk is especially useful in epidemiology and behavioral sciences to communicate the practical significance of findings.
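Applied to the recycling example, relative risk can be sketched as a single division. The function name `relative_risk` is hypothetical, and the proportions are computed from the observed counts (25 of 53 and 13 of 53) rather than from the rounded table values, so the result may differ slightly in the third decimal from a hand calculation using 0.472 and 0.245.

```python
def relative_risk(proportion_a, proportion_b):
    """Ratio of two conditional proportions (group A relative to group B)."""
    return proportion_a / proportion_b


# Proportion choosing recycling in two of the name conditions.
p_correct = 25 / 53     # correctly spelled name
p_incorrect = 13 / 53   # incorrectly spelled name

rr = relative_risk(p_correct, p_incorrect)
print(f"relative risk = {rr:.2f}")
# prints: relative risk = 1.92
```

Read this way, participants in the correctly spelled condition were roughly 1.9 times as likely to choose recycling as those in the incorrectly spelled condition.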