Statistics Unit 3: Confidence Intervals and Hypothesis Testing (Chapters 9-11) – Study Guide

Vocabulary and Notation

Key Terms

  • Point Estimate: A single value used to estimate a population parameter (e.g., sample mean \( \bar{x} \) estimates population mean \( \mu \)).

  • Confidence Interval: An interval estimate, calculated from the sample data, that is likely to contain the population parameter with a specified level of confidence.

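  • Level of Confidence: The long-run proportion of intervals, constructed by the same method from repeated samples, that contain the true parameter. It describes the reliability of the method, not the probability that any single computed interval contains the parameter.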

  • Margin of Error: The maximum expected difference between the point estimate and the true parameter value at the stated level of confidence. It equals the critical value times the standard error.

  • Critical Value: The value that defines the endpoints of the confidence interval, based on the desired confidence level (e.g., \( z_{\alpha/2} \) or \( t_{\alpha/2} \)).

  • Student’s t-Distribution: A probability distribution used when estimating the mean of a normally distributed population in situations where the sample size is small and population standard deviation is unknown.

  • Bootstrapping: A resampling method used to estimate the sampling distribution of a statistic by repeatedly sampling with replacement from the observed data.

  • Percentile Method Confidence Interval: A confidence interval constructed from the percentiles of the bootstrap distribution.

  • Hypothesis: A statement about a population parameter. Includes the null hypothesis (\( H_0 \)) and alternative hypothesis (\( H_1 \)).

  • Hypothesis Testing: A statistical method for testing a claim about a population parameter using sample data.

  • Type I Error: Rejecting the null hypothesis when it is true (false positive).

  • Type II Error: Failing to reject the null hypothesis when it is false (false negative).

  • Level of Significance (\( \alpha \)): The probability of making a Type I error.

  • P-value: The probability, under the null hypothesis, of obtaining a result equal to or more extreme than what was actually observed.

  • Statistical Significance: When the observed effect would be unlikely to occur by chance alone if the null hypothesis were true, that is, when the p-value is less than \( \alpha \).

  • Practical Significance: When the observed effect is large enough to be meaningful in real-world terms.

  • Independent Samples: Samples in which the selection of one sample does not influence the selection of the other.

  • Dependent Samples (Matched-Pairs): Samples in which each observation in one sample can be paired with an observation in the other sample.

  • Robust Test: A statistical test whose results remain approximately valid when certain assumptions are mildly violated.

  • Randomization Test: A nonparametric method for hypothesis testing using random resampling.

One Sample Confidence Intervals

Confidence Interval for One Sample Proportion

Used to estimate the true population proportion based on a sample.

  • Formula: \( \hat{p} \pm z_{\alpha/2} \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} \)

  • Assumptions:

    • Sample obtained by simple random sampling or randomized experiment.

    • \( n\hat{p}(1-\hat{p}) \geq 10 \)

    • Sampled values are independent (sample size < 5% of population).

  • Example: If \( \hat{p} = 0.6 \), \( n = 100 \), and 95% confidence (\( z_{0.025} = 1.96 \)), the interval is \( 0.6 \pm 1.96\sqrt{\dfrac{0.6(0.4)}{100}} \approx 0.6 \pm 0.096 \), or about (0.504, 0.696).
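The proportion interval can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `proportion_ci` is an illustrative choice, not something from the course materials.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(p_hat, n, confidence=0.95):
    """Return (lower, upper) for a one-sample z interval for a proportion."""
    alpha = 1 - confidence
    # Critical value z_{alpha/2} from the standard normal distribution.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


lower, upper = proportion_ci(0.6, 100)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")  # prints 95% CI: (0.504, 0.696)
```

The check \( n\hat{p}(1-\hat{p}) = 24 \geq 10 \) holds for these values, so the normal approximation is reasonable here.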

t Confidence Interval for Mean

Used when estimating the population mean and the population standard deviation is unknown.

  • Formula: \( \bar{x} \pm t_{\alpha/2} \dfrac{s}{\sqrt{n}} \), with \( df = n - 1 \)

  • Assumptions:

    • Sample obtained by simple random sampling or randomized experiment.

    • No outliers; population is normal or sample size \( n \geq 30 \).

    • Sampled values are independent.

  • Example: \( \bar{x} = 50 \), \( s = 10 \), \( n = 25 \), 95% confidence (\( t_{0.025,24} \approx 2.064 \)): \( 50 \pm 2.064 \cdot \dfrac{10}{\sqrt{25}} = 50 \pm 4.128 \), or (45.872, 54.128).
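The same arithmetic as a short sketch. The standard library has no t-distribution, so the critical value is passed in as an argument (here the 2.064 quoted in the example); `t_interval` is a hypothetical helper name.

```python
from math import sqrt


def t_interval(x_bar, s, n, t_crit):
    """Return (lower, upper) for a one-sample t interval for a mean.

    t_crit is t_{alpha/2} with n - 1 degrees of freedom, looked up
    from a table or statistical software.
    """
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin


lower, upper = t_interval(x_bar=50, s=10, n=25, t_crit=2.064)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")  # prints 95% CI: (45.872, 54.128)
```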

One Sample Hypothesis Tests

z Test for One Sample Proportion

Tests whether the population proportion equals a specified value.

  • Test Statistic: \( z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}} \)

  • Hypotheses:

    • Two-tailed: \( H_0: p = p_0 \), \( H_1: p \neq p_0 \)

    • Left-tailed: \( H_0: p = p_0 \), \( H_1: p < p_0 \)

    • Right-tailed: \( H_0: p = p_0 \), \( H_1: p > p_0 \)

  • Assumptions:

    • Simple random sample or randomized experiment.

    • \( n p_0 (1-p_0) \geq 10 \)

    • Sampled values are independent (sample size < 5% of population).
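A minimal sketch of the one-proportion z test, using the standard normal distribution from the Python standard library. The data (60 successes in 100 trials, \( p_0 = 0.5 \)) and the function name are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_test(x, n, p0, tail="two"):
    """Return (z, p_value) for H0: p = p0.

    tail is "two", "left", or "right", matching the alternative hypothesis.
    """
    p_hat = x / n
    # The standard error uses p0 because the test assumes H0 is true.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    cdf = NormalDist().cdf(z)
    if tail == "left":
        p_value = cdf
    elif tail == "right":
        p_value = 1 - cdf
    else:
        p_value = 2 * min(cdf, 1 - cdf)
    return z, p_value


z, p_value = one_prop_z_test(x=60, n=100, p0=0.5)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")  # prints z = 2.00, p-value = 0.0455
```

With \( \alpha = 0.05 \), the p-value of about 0.0455 leads to rejecting \( H_0 \) in the two-tailed test. Here \( n p_0 (1-p_0) = 25 \geq 10 \), so the normality condition is met.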

t Test for Mean

Tests whether the population mean equals a specified value.

  • Test Statistic: \( t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}} \)

  • Degrees of Freedom: \( df = n - 1 \)

  • Hypotheses:

    • Two-tailed: \( H_0: \mu = \mu_0 \), \( H_1: \mu \neq \mu_0 \)

    • Left-tailed: \( H_0: \mu = \mu_0 \), \( H_1: \mu < \mu_0 \)

    • Right-tailed: \( H_0: \mu = \mu_0 \), \( H_1: \mu > \mu_0 \)

  • Assumptions:

    • Simple random sample or randomized experiment.

    • No outliers; population is normal or \( n \geq 30 \).

    • Sampled values are independent.
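A sketch of the one-sample t statistic computed from raw data. The sample values are made up for illustration, and because the standard library has no t-distribution, the result is compared against a table critical value rather than converted to a p-value.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(data, mu0):
    """Return (t, df) for H0: mu = mu0, using the sample standard deviation."""
    n = len(data)
    t = (mean(data) - mu0) / (stdev(data) / sqrt(n))
    return t, n - 1


# Hypothetical sample, assumed to come from a roughly normal population
# because n = 10 is below 30.
sample = [52, 48, 55, 51, 49, 53, 50, 54, 47, 51]
t_stat, df = one_sample_t(sample, mu0=50)
print(f"t = {t_stat:.3f}, df = {df}")  # prints t = 1.225, df = 9
```

For a two-tailed test at \( \alpha = 0.05 \) with 9 degrees of freedom, the table critical value is about 2.262, so \( |t| \approx 1.225 \) does not lead to rejecting \( H_0 \).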

Two Sample Hypothesis Tests

Two Sample z Test for Proportions

Compares the proportions of two independent groups.

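  • Test Statistic: \( z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \)

where \( \hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2} \) is the pooled sample proportion, \( \hat{p}_1 = x_1/n_1 \), and \( \hat{p}_2 = x_2/n_2 \).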

  • Hypotheses:

    • Two-tailed: \( H_0: p_1 = p_2 \), \( H_1: p_1 \neq p_2 \)

    • Left-tailed: \( H_0: p_1 = p_2 \), \( H_1: p_1 < p_2 \)

    • Right-tailed: \( H_0: p_1 = p_2 \), \( H_1: p_1 > p_2 \)

  • Assumptions:

    • Independent samples from simple random sampling or randomized experiment.

    • \( n_1\hat{p}_1(1-\hat{p}_1) \geq 10 \) and \( n_2\hat{p}_2(1-\hat{p}_2) \geq 10 \)

    • Sample sizes < 5% of respective populations.
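A minimal sketch of the two-proportion z test with a pooled proportion, shown for the two-tailed case. The counts (45 of 100 versus 30 of 100) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2):
    """Return (z, two-tailed p_value) for H0: p1 = p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Pooled proportion: the best estimate of the common p under H0.
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = two_prop_z_test(x1=45, n1=100, x2=30, n2=100)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

For these counts z is about 2.19, which gives a two-tailed p-value below 0.05.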

Two Sample t Test for Dependent Means (Matched Pairs)

Compares means from paired or matched samples.

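  • Test Statistic: \( t = \dfrac{\bar{d}}{s_d / \sqrt{n}} \), where \( \bar{d} \) and \( s_d \) are the mean and standard deviation of the \( n \) paired differences.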

  • Degrees of Freedom: \( df = n - 1 \)

  • Hypotheses:

    • Two-tailed: \( H_0: \mu_d = 0 \), \( H_1: \mu_d \neq 0 \)

    • Left-tailed: \( H_0: \mu_d = 0 \), \( H_1: \mu_d < 0 \)

    • Right-tailed: \( H_0: \mu_d = 0 \), \( H_1: \mu_d > 0 \)

  • Assumptions: Same as one sample mean, but applied to the differences.
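The matched-pairs test reduces to a one-sample t test on the differences, which the sketch below makes explicit. The before/after values are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, df) for H0: mu_d = 0, where d = first - second per pair."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical paired measurements on the same six subjects.
before = [200, 195, 210, 188, 205, 199]
after = [192, 190, 205, 186, 198, 195]
t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f}, df = {df}")
```

The differences here are 8, 5, 5, 2, 7, 4, giving \( \bar{d} \approx 5.17 \) and a t statistic of roughly 5.92 on 5 degrees of freedom.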

Two Sample t Test for Independent Means (Unequal Variances)

Compares means from two independent samples, not assuming equal variances.

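  • Test Statistic: \( t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \)

  • Degrees of Freedom (Welch-Satterthwaite approximation): \( df = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}} \)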

  • Hypotheses:

    • Two-tailed: \( H_0: \mu_1 = \mu_2 \), \( H_1: \mu_1 \neq \mu_2 \)

    • Left-tailed: \( H_0: \mu_1 = \mu_2 \), \( H_1: \mu_1 < \mu_2 \)

    • Right-tailed: \( H_0: \mu_1 = \mu_2 \), \( H_1: \mu_1 > \mu_2 \)

  • Assumptions:

    • Independent samples from simple random sampling or randomized experiment.

    • Populations are normal or sample sizes \( n_1, n_2 \geq 30 \).

    • Sample sizes < 5% of respective populations.
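A sketch of the unequal-variances (Welch) t statistic and its approximate degrees of freedom, computed from summary statistics. The function name and the summary values are hypothetical.

```python
from math import sqrt


def welch_t(x1_bar, s1, n1, x2_bar, s2, n2):
    """Return (t, df) for H0: mu1 = mu2 without assuming equal variances."""
    v1 = s1 ** 2 / n1  # squared standard error contribution of sample 1
    v2 = s2 ** 2 / n2  # squared standard error contribution of sample 2
    t = (x1_bar - x2_bar) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation; usually not a whole number.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


t_stat, df = welch_t(x1_bar=50, s1=10, n1=30, x2_bar=45, s2=8, n2=35)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

For these inputs t is about 2.20 with roughly 55 degrees of freedom. When reading a t table, a common convention is to round the degrees of freedom down, which makes the test slightly more conservative.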

Two Sample Confidence Intervals

Confidence Interval for Difference Between Two Proportions

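  • Formula: \( (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \)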

  • Assumptions:

    • Independent samples from simple random sampling or randomized experiment.

    • \( n_1\hat{p}_1(1-\hat{p}_1) \geq 10 \) and \( n_2\hat{p}_2(1-\hat{p}_2) \geq 10 \)

    • Sample sizes < 5% of respective populations.
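A sketch of the interval for \( p_1 - p_2 \). Unlike the hypothesis test, the interval does not pool the proportions, because it does not assume \( p_1 = p_2 \). The counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def diff_prop_ci(x1, n1, x2, n2, confidence=0.95):
    """Return (lower, upper) for a z interval for p1 - p2 (unpooled SE)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z * se, diff + z * se


lower, upper = diff_prop_ci(x1=45, n1=100, x2=30, n2=100)
print(f"95% CI for p1 - p2: ({lower:.4f}, {upper:.4f})")
```

The interval is roughly (0.017, 0.283). Because it excludes 0, it is consistent with rejecting \( H_0: p_1 = p_2 \) in a two-tailed test at \( \alpha = 0.05 \).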

Confidence Interval for Mean of Differences (Paired Data)

  • Formula: \( \bar{d} \pm t_{\alpha/2} \dfrac{s_d}{\sqrt{n}} \), where \( n \) is the number of pairs.

  • Degrees of Freedom: \( df = n - 1 \)

  • Assumptions: Same as one sample mean, but applied to the differences.

Confidence Interval for Difference Between Two Independent Means (Unequal Variances)

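  • Formula: \( (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} \)

  • Degrees of Freedom: Welch-Satterthwaite approximation, the same as for the two sample t test for independent means (unequal variances).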

  • Assumptions:

    • Independent samples from simple random sampling or randomized experiment.

    • Populations are normal or sample sizes \( n_1, n_2 \geq 30 \).

    • Sample sizes < 5% of respective populations.

Sample Size Calculations

Sample Size Needed for Proportions

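  • With Prior Estimate \( \hat{p} \): \( n = \hat{p}(1-\hat{p})\left(\dfrac{z_{\alpha/2}}{E}\right)^2 \)

  • Without Prior Estimate: \( n = 0.25\left(\dfrac{z_{\alpha/2}}{E}\right)^2 \)

where E is the desired margin of error (as a decimal). Round n up to the next whole number.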
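A minimal sketch of the sample-size calculation for a proportion. Leaving out the prior estimate uses \( \hat{p}(1-\hat{p}) = 0.25 \), the largest possible value, so the result is conservative. The function name and the 3% margin are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def n_for_proportion(E, confidence=0.95, p_hat=None):
    """Return the minimum sample size for margin of error E (a decimal)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    pq = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    # Always round up: rounding down would give a margin larger than E.
    return ceil(pq * (z / E) ** 2)


print(n_for_proportion(E=0.03))             # prints 1068
print(n_for_proportion(E=0.03, p_hat=0.6))  # prints 1025
```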

Sample Size Needed for Means

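  • Formula: \( n = \left(\dfrac{z_{\alpha/2}\, s}{E}\right)^2 \)

where E is the desired margin of error and s is a prior estimate of the population standard deviation. Round n up to the next whole number.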
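The corresponding sketch for a mean. The prior standard deviation of 15 and margin of 2 are hypothetical inputs, for instance from a pilot study.

```python
from math import ceil
from statistics import NormalDist


def n_for_mean(s, E, confidence=0.95):
    """Return the minimum sample size to estimate a mean within E.

    s is a prior estimate of the population standard deviation.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * s / E) ** 2)


print(n_for_mean(s=15, E=2))  # prints 217
```

Halving the margin of error roughly quadruples the required sample size, since E enters the formula squared.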

Summary Table: Hypothesis Tests and Confidence Intervals

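| Test/Interval | Parameter | Formula (test statistic) | Assumptions |
|---|---|---|---|
| One-sample z for proportion | \( p \) | \( z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} \) | Random sample, independence, \( n p_0 (1-p_0) \geq 10 \) |
| One-sample t for mean | \( \mu \) | \( t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} \) | Random sample, normality or large n, independence |
| Two-sample z for proportions | \( p_1 - p_2 \) | \( z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})(1/n_1 + 1/n_2)}} \) | Random, independent samples, \( n_i\hat{p}_i(1-\hat{p}_i) \geq 10 \) for each sample |
| Two-sample t for means (independent, unequal variances) | \( \mu_1 - \mu_2 \) | \( t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} \) | Random, independent samples, normality or large n |
| Paired t for means | \( \mu_d \) | \( t = \dfrac{\bar{d}}{s_d/\sqrt{n}} \) | Random sample of pairs, normality or large n |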

Additional Info

  • Statistical software such as StatCrunch can be used to perform these calculations and simulations.

  • Bootstrapping and randomization tests provide nonparametric alternatives when assumptions are questionable.
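As an illustration of the bootstrap percentile method defined in the Key Terms, the sketch below resamples with replacement and reads the interval off the percentiles of the bootstrap distribution. The data, the number of resamples, the fixed seed, and the simple index rule for percentiles are all assumptions made for this example; statistical software may use a slightly different percentile rule.

```python
import random
from statistics import mean


def bootstrap_percentile_ci(data, stat=mean, n_resamples=5000,
                            confidence=0.95, seed=0):
    """Return (lower, upper) from the percentiles of the bootstrap distribution."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Each resample has the same size as the data and is drawn with replacement.
    boot_stats = sorted(
        stat(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    alpha = 1 - confidence
    lo_idx = round((alpha / 2) * (n_resamples - 1))
    hi_idx = round((1 - alpha / 2) * (n_resamples - 1))
    return boot_stats[lo_idx], boot_stats[hi_idx]


# Hypothetical sample of 10 measurements.
sample = [52, 48, 55, 51, 49, 53, 50, 54, 47, 51]
lower, upper = bootstrap_percentile_ci(sample)
print(f"95% bootstrap CI for the mean: ({lower:.2f}, {upper:.2f})")
```

No normality assumption is needed for this interval, which is why it is offered as an alternative when the conditions for the t interval are in doubt.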
