Hypothesis Testing: Concepts, Procedures, and Applications

Hypothesis Testing: Foundations and Procedures

The Language of Hypothesis Testing

Hypothesis testing is a fundamental inferential statistical procedure used to evaluate claims about population parameters based on sample data. It involves formulating two competing statements: the null hypothesis and the alternative hypothesis.

Null Hypothesis (H0): A statement of no effect, no difference, or status quo. It is assumed true until evidence suggests otherwise.
Alternative Hypothesis (H1 or Ha): The statement for which we seek supporting evidence. It represents a change, effect, or difference.

Hypotheses can be structured in three main ways, depending on the research question:

Two-tailed test: H0: parameter = value; H1: parameter ≠ value
Left-tailed test: H0: parameter = value; H1: parameter < value
Right-tailed test: H0: parameter = value; H1: parameter > value

Table of hypothesis structures for population proportion

Additional info: The choice of test direction depends on the research hypothesis (e.g., 'different', 'less than', or 'greater than').

Errors in Hypothesis Testing: Type I and Type II

When making decisions based on sample data, two types of errors can occur:

Type I Error (α): Rejecting the null hypothesis when it is actually true. The probability of this error is denoted by α, the level of significance.
Type II Error (β): Failing to reject the null hypothesis when the alternative hypothesis is true. The probability of this error is denoted by β.

For a fixed sample size, there is a trade-off between α and β: decreasing one typically increases the other.

Drawing Conclusions

After performing a hypothesis test, conclusions are stated in terms of rejecting or not rejecting the null hypothesis. We never "accept" the null hypothesis; instead, we say there is insufficient evidence to reject it.

Testing Hypotheses about a Population Proportion

Logic of Hypothesis Testing for Proportions

To test claims about a population proportion, we use the sampling distribution of the sample proportion (\( \hat{p} \)). If the sample size is large enough (the rule used in these notes is np(1-p) ≥ 10), the distribution of \( \hat{p} \) is approximately normal with:

Mean: \( \mu_{\hat{p}} = p \)
Standard deviation: \( \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} \)
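
As a quick check of these two formulas, here is a minimal Python sketch. The function names are illustrative, and the values p = 0.46 and n = 1010 are borrowed from the media-trust example later in these notes.

```python
import math

def sampling_distribution_of_p_hat(p, n):
    """Return (mean, standard deviation) of the sample proportion."""
    mean = p
    sd = math.sqrt(p * (1 - p) / n)
    return mean, sd

def normal_approximation_ok(p, n):
    """Rule of thumb used in these notes: np(1 - p) >= 10."""
    return n * p * (1 - p) >= 10

mean, sd = sampling_distribution_of_p_hat(0.46, 1010)
print(f"mean = {mean}, standard deviation = {sd:.4f}")
print("normal approximation reasonable:", normal_approximation_ok(0.46, 1010))
```

The standard deviation shrinks as n grows, which is why larger samples make a given difference between \( \hat{p} \) and p easier to detect.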

Sampling distribution of the sample proportion

Statistical significance is determined by how unlikely the observed sample proportion is under the null hypothesis.

Classical Approach: Critical Value Method

We compare the test statistic to critical values to decide whether to reject H0. For a two-tailed test, the critical regions are in both tails of the normal distribution.

Critical regions for two-tailed test
Critical region for left-tailed test
Critical region for right-tailed test

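The test statistic for proportions is:

\( z_0 = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} \)

where \( p_0 \) is the proportion stated in H0 and n is the sample size.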

We reject H0 if the test statistic falls in the critical region.
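
The critical value decision rule can be sketched in Python with the standard library's `statistics.NormalDist`. The function names and the tail labels ("two", "left", "right") are choices made for this sketch, not notation from the notes.

```python
from statistics import NormalDist

def critical_value(alpha, tail):
    """Positive critical value of the standard normal for the given tail."""
    std_normal = NormalDist()
    if tail == "two":
        # alpha is split evenly between the two tails
        return std_normal.inv_cdf(1 - alpha / 2)
    if tail in ("left", "right"):
        return std_normal.inv_cdf(1 - alpha)
    raise ValueError("tail must be 'two', 'left', or 'right'")

def reject_h0(z, alpha, tail):
    """True if the test statistic z falls in the critical region."""
    z_crit = critical_value(alpha, tail)
    if tail == "two":
        return abs(z) > z_crit
    if tail == "left":
        return z < -z_crit
    return z > z_crit

print(round(critical_value(0.05, "right"), 3))
print(round(critical_value(0.05, "two"), 3))
print(reject_h0(3.83, 0.05, "right"))
```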

Decision rules for hypothesis tests

P-Value Approach

The P-value is the probability, computed assuming H0 is true, of obtaining a sample statistic at least as extreme as the one observed. For a left-tailed or right-tailed test, it is the area in that tail beyond the observed value; for a two-tailed test, it is the sum of the areas in both tails.

P-value for two-tailed test
P-value for left-tailed test
P-value for right-tailed test

If the P-value is less than the significance level α, we reject H0.
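
For a z test statistic, the three P-value cases can be sketched as follows. This is a minimal illustration using `statistics.NormalDist`; the function name and tail labels are assumptions of the sketch.

```python
from statistics import NormalDist

def z_p_value(z, tail):
    """P-value for a standard normal test statistic z."""
    cdf = NormalDist().cdf
    if tail == "left":
        return cdf(z)                    # area to the left of z
    if tail == "right":
        return 1 - cdf(z)                # area to the right of z
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))     # both tails beyond |z|
    raise ValueError("tail must be 'two', 'left', or 'right'")

alpha = 0.05
p = z_p_value(2.10, "two")
print(f"P-value = {p:.4f}, reject H0: {p < alpha}")
```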

Example: Large Sample Proportion Test

Suppose that in a sample of 1010 adults, 525 do not trust the media. Test whether the proportion has increased from 0.46 (the 1997 figure) at the α = 0.05 level of significance.

H0: p = 0.46
H1: p > 0.46
Sample proportion: \( \hat{p} = 525/1010 \approx 0.52 \)
Test statistic: \( z_0 = \frac{0.52 - 0.46}{\sqrt{\frac{0.46(1 - 0.46)}{1010}}} \approx 3.83 \)
Critical value (right-tailed, α = 0.05): 1.645
Since 3.83 > 1.645, reject H0.

Conclusion: There is sufficient evidence to conclude the proportion has increased.
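
The example can be reproduced with a few lines of Python. This sketch uses the unrounded sample proportion 525/1010, so it gives z0 of about 3.81 rather than the 3.83 obtained with \( \hat{p} \) rounded to 0.52; the conclusion is the same either way. Variable names are illustrative.

```python
import math
from statistics import NormalDist

n, x = 1010, 525          # sample size and number who do not trust the media
p0, alpha = 0.46, 0.05    # hypothesized proportion and significance level

p_hat = x / n
standard_error = math.sqrt(p0 * (1 - p0) / n)   # computed under H0
z0 = (p_hat - p0) / standard_error

z_crit = NormalDist().inv_cdf(1 - alpha)        # right-tailed critical value
p_value = 1 - NormalDist().cdf(z0)              # area to the right of z0

print(f"p_hat = {p_hat:.4f}, z0 = {z0:.2f}, critical value = {z_crit:.3f}")
print(f"P-value = {p_value:.5f}, reject H0: {z0 > z_crit}")
```

Both approaches agree: z0 exceeds the critical value, and the P-value is far below α.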

Small Sample Proportion Test: Binomial Approach

If the sample size is small, so that np(1-p) < 10, the normal approximation is unreliable. In that case, use the binomial probability distribution with p equal to the value stated in H0 to compute the P-value directly.
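
A minimal sketch of the exact binomial calculation for one-tailed tests is shown below. The data (15 successes in 20 trials, H0: p = 0.5) are hypothetical and chosen only so that np(1-p) = 5 < 10. Two-tailed exact tests follow more than one convention and are left out of this sketch.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_p_value(x, n, p0, tail):
    """Exact one-tailed P-value for x observed successes in n trials under H0: p = p0."""
    if tail == "right":
        # P(X >= x): results at least as large as the one observed
        return sum(binomial_pmf(k, n, p0) for k in range(x, n + 1))
    if tail == "left":
        # P(X <= x): results at least as small as the one observed
        return sum(binomial_pmf(k, n, p0) for k in range(0, x + 1))
    raise ValueError("this sketch covers only 'left' and 'right'")

# Hypothetical data: 15 successes in 20 trials, H1: p > 0.5
p_value = binomial_p_value(15, 20, 0.5, "right")
print(f"P-value = {p_value:.4f}")
```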

Testing Hypotheses about a Population Mean

t-Distribution and Its Properties

When the population standard deviation is unknown, the t-distribution is used for hypothesis testing about means. The t-distribution:

Is symmetric and centered at 0
Has heavier tails than the normal distribution (more variability)
Approaches the standard normal distribution as the sample size, and hence the degrees of freedom (n - 1), increases

Formulating Hypotheses for Means

Table of hypothesis structures for population mean

Test Statistic for Means

The test statistic is:

\( t_0 = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \)

where \( \bar{x} \) is the sample mean, \( \mu_0 \) is the hypothesized mean, s is the sample standard deviation, and n is the sample size.

Critical Value and P-Value Approaches

Critical regions for two-tailed t-test
Critical region for right-tailed t-test
Critical region for left-tailed t-test
P-value for two-tailed t-test
P-value for left-tailed t-test
P-value for right-tailed t-test

Example: Small Sample Mean Test

Suppose a researcher weighs 18 state quarters and finds a sample mean of 5.7022 grams (s = 0.0497). Test if the mean differs from 5.67 grams at α = 0.05.

H0: μ = 5.67
H1: μ ≠ 5.67
t0 = (5.7022 – 5.67) / (0.0497 / √18) = 2.75
Critical values (df = 17): ±2.11
Since 2.75 > 2.11, reject H0.

Conclusion: There is sufficient evidence to conclude the mean weight differs from 5.67 grams.
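
The arithmetic in this example can be checked with a short Python sketch. Python's standard library has no t-distribution, so the critical value 2.110 (df = 17, two-tailed, α = 0.05) is entered by hand from a t-table, matching the ±2.11 used above.

```python
import math

n = 18            # number of quarters weighed
x_bar = 5.7022    # sample mean (grams)
s = 0.0497        # sample standard deviation (grams)
mu0 = 5.67        # hypothesized mean (grams)
t_crit = 2.110    # t-table value for df = 17, two-tailed, alpha = 0.05

df = n - 1
t0 = (x_bar - mu0) / (s / math.sqrt(n))
reject = abs(t0) > t_crit     # two-tailed test: compare |t0| with the critical value

print(f"df = {df}, t0 = {t0:.2f}, reject H0: {reject}")
```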

Boxplot of state quarters weights

Statistical vs. Practical Significance

Statistical significance means the observed effect is unlikely under H0. Practical significance considers whether the effect size is large enough to be meaningful in context. Large samples can yield statistically significant but practically unimportant results.
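
To illustrate the last point, the sketch below runs the same right-tailed proportion test on two hypothetical samples with the same tiny effect (a sample proportion of 0.502 against a hypothesized 0.500) but very different sample sizes. All numbers are invented for illustration.

```python
import math
from statistics import NormalDist

def right_tailed_z_test(p_hat, p0, n):
    """Return (z, P-value) for H0: p = p0 versus H1: p > p0."""
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    return z, 1 - NormalDist().cdf(z)

# Same 0.2 percentage point difference, two hypothetical sample sizes
z_small, p_small = right_tailed_z_test(0.502, 0.500, 1_000)
z_large, p_large = right_tailed_z_test(0.502, 0.500, 1_000_000)

print(f"n = 1,000:     z = {z_small:.2f}, P-value = {p_small:.4f}")
print(f"n = 1,000,000: z = {z_large:.2f}, P-value = {p_large:.6f}")
```

With a million observations the difference is statistically significant at α = 0.05, yet it is still only 0.2 percentage points; whether that matters is a question about the context, not about the test.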

Testing Hypotheses about a Population Standard Deviation

Chi-Square Distribution

The chi-square distribution is used to test hypotheses about a population variance or standard deviation. It is skewed to the right, takes only non-negative values, and its shape depends on the degrees of freedom (n - 1).

Chi-square distributions with different degrees of freedom

Formulating Hypotheses for Standard Deviation

Table of hypothesis structures for population standard deviation

Test Statistic for Variance/Standard Deviation

The test statistic is:

\( \chi_0^2 = \frac{(n-1)s^2}{\sigma_0^2} \)

where \( s^2 \) is the sample variance, \( \sigma_0^2 \) is the hypothesized variance, and n is the sample size.
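
A minimal sketch of this calculation follows. The sample (n = 25, s = 1.2) and the hypothesized σ0 = 1.0 are hypothetical, and the critical value 36.415 (df = 24, right-tailed, α = 0.05) is entered by hand from a chi-square table because the standard library has no chi-square distribution.

```python
def chi_square_statistic(n, s, sigma0):
    """Test statistic (n - 1) * s^2 / sigma0^2 for a test about a standard deviation."""
    return (n - 1) * s**2 / sigma0**2

# Hypothetical data for H0: sigma = 1.0 versus H1: sigma > 1.0
n, s, sigma0 = 25, 1.2, 1.0
chi2_crit = 36.415   # chi-square table value, df = 24, area 0.05 to the right

chi2_0 = chi_square_statistic(n, s, sigma0)
reject = chi2_0 > chi2_crit   # right-tailed test

print(f"chi-square statistic = {chi2_0:.2f}, reject H0: {reject}")
```

Here the statistic falls short of the critical value, so H0 is not rejected even though the sample standard deviation is larger than σ0.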

Critical Value and P-Value Approaches

Critical regions for two-tailed chi-square test
Critical region for left-tailed chi-square test
Critical region for right-tailed chi-square test
Decision rules for chi-square tests
P-value for left-tailed chi-square test
P-value for right-tailed chi-square test

Choosing the Appropriate Hypothesis Test

The choice of test depends on the parameter of interest and the sample size:

Proportion (p): Use the z-test if np(1-p) ≥ 10; otherwise, use the binomial test.
Mean (μ): Use the z-test if σ is known. If σ is unknown, use the t-test, provided the population is normal or n > 30.
Standard deviation (σ) or variance (σ²): Use the chi-square test if the population is normal.
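
The rules above can be written as a small decision function. This is only a sketch of the three rules as stated in these notes; the function name, argument names, and the "not covered by these rules" label are assumptions of the sketch.

```python
def choose_test(parameter, n, p0=None, sigma_known=False, population_normal=False):
    """Pick a test following the three rules in these notes."""
    if parameter == "proportion":
        # Normal approximation is acceptable when np(1 - p) >= 10
        return "z-test" if n * p0 * (1 - p0) >= 10 else "binomial test"
    if parameter == "mean":
        if sigma_known:
            return "z-test"
        if population_normal or n > 30:
            return "t-test"
        return "not covered by these rules"
    if parameter == "standard deviation":
        return "chi-square test" if population_normal else "not covered by these rules"
    raise ValueError("parameter must be 'proportion', 'mean', or 'standard deviation'")

print(choose_test("proportion", 1010, p0=0.46))
print(choose_test("mean", 18, population_normal=True))
print(choose_test("standard deviation", 25, population_normal=True))
```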

Flowchart for choosing hypothesis test

Type II Error and Power of a Test

Probability of Type II Error (β)

Type II error occurs when we fail to reject H0 even though H1 is true. The probability β depends on the true value of the parameter, sample size, and significance level.

Critical region for Type II error
Type II error region under alternative hypothesis

Power of a Test

The power of a test is the probability of correctly rejecting H0 when H1 is true. It is calculated as 1 – β. Higher power means a greater chance of detecting a true effect.
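
As an illustration, β and power can be computed for a right-tailed proportion test under the normal approximation. The sketch reuses p0 = 0.46, n = 1010, and α = 0.05 from the earlier example; the "true" proportion of 0.50 is a hypothetical value chosen for illustration, and the function name is an assumption of the sketch.

```python
import math
from statistics import NormalDist

def beta_right_tailed(p0, p_true, n, alpha):
    """Probability of a Type II error for H0: p = p0 versus H1: p > p0
    when the true proportion is p_true (normal approximation)."""
    std_normal = NormalDist()
    # Largest sample proportion for which H0 is not rejected
    cutoff = p0 + std_normal.inv_cdf(1 - alpha) * math.sqrt(p0 * (1 - p0) / n)
    # Probability that p-hat stays at or below the cutoff when p = p_true
    sd_true = math.sqrt(p_true * (1 - p_true) / n)
    return std_normal.cdf((cutoff - p_true) / sd_true)

beta = beta_right_tailed(p0=0.46, p_true=0.50, n=1010, alpha=0.05)
power = 1 - beta
print(f"beta = {beta:.3f}, power = {power:.3f}")
```

Calling the function with a smaller α (for example 0.01) gives a larger β, and calling it with a larger n gives a smaller β, which matches the dependence on significance level and sample size described above.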