Chapters 8–12: Hypothesis Testing, Inferences from Two Samples, Correlation & Regression, Goodness-of-Fit, and ANOVA

Study Guide - Smart Notes

Chapter 8: Hypothesis Testing

Testing a Population Mean

Hypothesis testing for a population mean involves determining whether sample data provide sufficient evidence to support or refute a claim about the population mean.

Null Hypothesis (H0): A statement that the population mean equals a specific value.
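Alternative Hypothesis (H1): A statement that the population mean differs from the value in H0 (two-tailed test), or is less than or greater than it (one-tailed test).
Test Statistic: Use the z-test when the population standard deviation \( \sigma \) is known; use the t-test when it is unknown and estimated by the sample standard deviation \( s \).
Significance Level (\( \alpha \)): The probability of rejecting H0 when it is true, that is, of a Type I error (commonly 0.05).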

Formula (t-test):

\( t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \), with \( n - 1 \) degrees of freedom

\( \bar{x} \): sample mean
\( \mu_0 \): hypothesized population mean
\( s \): sample standard deviation
\( n \): sample size

Example: Testing if the average height of a plant species differs from 15 cm using a sample of 30 plants.
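As a rough illustration of the plant-height example, the sketch below computes the one-sample t statistic as the difference between the sample mean and the hypothesized mean, divided by s/√n. The function name `one_sample_t` and the five height values are hypothetical (the example in the text uses 30 plants). The standard library has no t distribution, so the resulting t would still be compared against a critical value from a t table.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, degrees of freedom) for H0: population mean = mu0."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    t = (x_bar - mu0) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical plant heights in cm (made-up data for illustration only)
heights = [14.0, 15.0, 16.0, 17.0, 18.0]
t_stat, df = one_sample_t(heights, 15.0)
print(round(t_stat, 3), df)  # 1.414 4
```

With 4 degrees of freedom, a t of about 1.41 is well below the usual two-tailed 0.05 critical value of 2.776, so this made-up sample would not lead to rejecting H0.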

Testing a Population Proportion

Used to determine if the proportion of a population with a certain characteristic matches a claimed value.

Test Statistic (z-test):

\( z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}} \)

\( \hat{p} \): sample proportion
\( p_0 \): hypothesized population proportion
\( n \): sample size

Example: Testing if the proportion of defective items in a shipment exceeds 5%.
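The defective-items example can be sketched as a right-tailed one-proportion z-test. The function name `one_proportion_z` and the counts (28 defective out of 400) are assumptions chosen for illustration. The P-value uses the standard normal distribution from Python's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def one_proportion_z(successes, n, p0):
    """Return the z statistic for H0: population proportion = p0."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)  # uses p0, not p_hat
    return (p_hat - p0) / standard_error

# Hypothetical shipment: 28 defective items in a sample of 400
z = one_proportion_z(28, 400, 0.05)
p_value = 1 - NormalDist().cdf(z)  # right-tailed, since H1 is p > 0.05
print(round(z, 3), round(p_value, 3))
```

Here z is about 1.84 and the P-value is about 0.033, so at \( \alpha = 0.05 \) this hypothetical sample would support the claim that the defect rate exceeds 5%.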

Testing a Population Standard Deviation or Variance

Used to test claims about the variability of a population.

Test Statistic (Chi-Square):

\( \chi^2 = \frac{(n - 1) s^2}{\sigma_0^2} \), with \( n - 1 \) degrees of freedom

\( s^2 \): sample variance
\( \sigma_0^2 \): hypothesized population variance
\( n \): sample size

Example: Testing if the variance in exam scores is greater than a specified value.
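A minimal sketch of the exam-score example follows. It computes the chi-square statistic as (n − 1) times the sample variance divided by the hypothesized variance. The name `chi_square_variance`, the five scores, and the hypothesized variance of 25 are all made up for illustration. The result would be compared with a chi-square critical value from a table.

```python
import statistics

def chi_square_variance(sample, sigma0_sq):
    """Return (chi-square, degrees of freedom) for H0: variance = sigma0_sq."""
    n = len(sample)
    s_sq = statistics.variance(sample)  # sample variance (n - 1 divisor)
    return (n - 1) * s_sq / sigma0_sq, n - 1

# Hypothetical exam scores; hypothesized population variance of 25
scores = [70.0, 75.0, 80.0, 85.0, 90.0]
chi2, df = chi_square_variance(scores, 25.0)
print(chi2, df)  # 10.0 4
```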

Chapter 9: Inferences from Two Samples

Difference Between Two Proportions

Used to compare the proportions of two independent populations.

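Confidence Interval:

\( (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}} \), where \( z_{\alpha/2} \) is the critical value for the chosen confidence level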

\( \hat{p}_1, \hat{p}_2 \): sample proportions
\( n_1, n_2 \): sample sizes

Example: Comparing the proportion of smokers in two different cities.
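The smokers example can be sketched as a confidence interval for the difference of two proportions: the observed difference plus or minus a critical z value times the unpooled standard error. The function name `two_proportion_ci` and the counts for the two cities are hypothetical.

```python
import math
from statistics import NormalDist

def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Return (low, high) for a confidence interval on p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z_crit * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - margin, diff + margin

# Hypothetical counts: 60 smokers of 200 in city 1, 45 of 250 in city 2
low, high = two_proportion_ci(60, 200, 45, 250)
print(round(low, 3), round(high, 3))
```

For these made-up counts the interval is roughly 0.041 to 0.199. Because it does not contain 0, the data would suggest the two proportions differ.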

Difference Between Two Means (Independent Samples)

Used to test if the means of two independent populations are equal.

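Test Statistic (Equal Variances):

\( t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \), with \( n_1 + n_2 - 2 \) degrees of freedom

Where pooled standard deviation \( s_p \) is:

\( s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} \)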

\( \bar{x}_1, \bar{x}_2 \): sample means
\( s_1, s_2 \): sample standard deviations
\( n_1, n_2 \): sample sizes

Example: Testing if average test scores differ between two schools.
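As an illustration of the two-schools example, the sketch below pools the two sample variances (weighted by their degrees of freedom) and then forms the t statistic for the difference in means. It assumes equal population variances, as in the section above. The function name `pooled_t` and the scores are hypothetical.

```python
import math
import statistics

def pooled_t(sample1, sample2):
    """Return (t, degrees of freedom) for H0: mean1 = mean2, equal variances assumed."""
    n1, n2 = len(sample1), len(sample2)
    var1 = statistics.variance(sample1)
    var2 = statistics.variance(sample2)
    # Pooled standard deviation: variances weighted by their degrees of freedom
    s_p = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    mean_diff = statistics.mean(sample1) - statistics.mean(sample2)
    t = mean_diff / (s_p * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# Hypothetical test scores from two schools
school_a = [78.0, 82.0, 85.0, 88.0, 92.0]
school_b = [72.0, 75.0, 80.0, 83.0, 85.0]
t_stat, df = pooled_t(school_a, school_b)
print(round(t_stat, 3), df)  # 1.754 8
```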

Matched Pairs (Dependent Samples)

Used when samples are paired or matched (e.g., before-and-after measurements).

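Test Statistic: Same as one-sample t-test, but applied to the differences. For H0: the mean difference is 0,

\( t = \frac{\bar{d}}{s_d / \sqrt{n}} \), with \( n - 1 \) degrees of freedom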

\( \bar{d} \): mean of the differences
\( s_d \): standard deviation of the differences
\( n \): number of pairs

Example: Comparing blood pressure before and after treatment in the same patients.
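The blood-pressure example can be sketched by reducing each patient to a single before-minus-after difference and running a one-sample t computation on those differences, with a hypothesized mean difference of 0. The function name `paired_t` and the readings are made up for illustration.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, degrees of freedom) for H0: mean of paired differences = 0."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # standard deviation of the differences
    return d_bar / (s_d / math.sqrt(n)), n - 1

# Hypothetical systolic blood pressure readings for the same five patients
before = [150.0, 142.0, 138.0, 160.0, 155.0]
after = [144.0, 140.0, 135.0, 150.0, 151.0]
t_stat, df = paired_t(before, after)
print(round(t_stat, 3), df)  # 3.536 4
```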

F Test for Two Variances

Used to compare the variances of two populations.

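Test Statistic:

\( F = \frac{s_1^2}{s_2^2} \), where by convention sample 1 is the sample with the larger variance, so that \( F \geq 1 \)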

\( s_1^2 \): variance of sample 1
\( s_2^2 \): variance of sample 2

Example: Testing if the variability in weights differs between two factories.
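A minimal sketch of the two-factories example: the F statistic is the ratio of the two sample variances, with the larger one placed in the numerator (a common textbook convention, assumed here). The function name `f_statistic` and the weights are hypothetical. The result would be compared with an F critical value using the numerator and denominator degrees of freedom.

```python
import statistics

def f_statistic(sample1, sample2):
    """Return (F, numerator df, denominator df), larger variance on top."""
    var1 = statistics.variance(sample1)
    var2 = statistics.variance(sample2)
    if var1 >= var2:
        return var1 / var2, len(sample1) - 1, len(sample2) - 1
    return var2 / var1, len(sample2) - 1, len(sample1) - 1

# Hypothetical product weights from two factories
factory_a = [10.0, 12.0, 14.0, 16.0, 18.0]
factory_b = [13.0, 14.0, 15.0, 16.0, 17.0]
f_stat, df_num, df_den = f_statistic(factory_a, factory_b)
print(f_stat, df_num, df_den)  # 4.0 4 4
```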

Chapter 10: Correlation and Regression

Linear Correlation

Measures the strength and direction of the linear relationship between two variables.

Correlation Coefficient (r): Ranges from -1 to 1.

Properties: r = 1 (perfect positive), r = -1 (perfect negative), r = 0 (no linear correlation).

Example: Analyzing the relationship between hours studied and exam scores.
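As a sketch of the hours-studied example, the code below computes r from the usual sums-based formula, in the same notation as the slope formula in the regression section. The function name `correlation_r` and the data are made up for illustration.

```python
import math

def correlation_r(x, y):
    """Return the linear correlation coefficient r for paired data."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

# Hypothetical hours studied and exam scores
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 58.0, 65.0, 70.0, 80.0]
r = correlation_r(hours, scores)
print(round(r, 3))  # 0.994
```

An r this close to 1 indicates a strong positive linear relationship in the made-up data. It says nothing about whether studying causes the higher scores.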

Regression Equations and Predictions

Regression analysis estimates the relationship between variables and allows predictions.

Regression Equation:

\( \hat{y} = b_0 + b_1 x \), where the slope \( b_1 \) and intercept \( b_0 \) are:
\( b_1 = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} \)
\( b_0 = \bar{y} - b_1 \bar{x} \)

Example: Predicting sales based on advertising expenditure.
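The slope and intercept formulas above can be sketched as follows for the advertising example. The function name `fit_line` and the data (in arbitrary units) are hypothetical.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line y_hat = b0 + b1 * x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b0 = sum_y / n - b1 * sum_x / n  # y_bar - b1 * x_bar
    return b0, b1

# Hypothetical advertising spend (x) and sales (y)
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.0, 4.0, 5.0, 4.0, 5.0]
b0, b1 = fit_line(ad_spend, sales)
predicted = b0 + b1 * 6.0  # predicted sales at a spend of 6
print(round(b0, 2), round(b1, 2), round(predicted, 2))  # 2.2 0.6 5.8
```

Note that a spend of 6 lies outside the range of the sample data, so a prediction there is an extrapolation and should be treated with caution.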

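Coefficient of Determination (\( R^2 \))

Indicates the proportion of the variance in the dependent variable explained by the regression model. For simple linear regression, \( R^2 \) equals the square of the correlation coefficient r.

Interpretation: \( R^2 = 0.85 \) means 85% of the variation in the dependent variable is explained by the model.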

Example: Comparing models to determine which best fits the data.
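One common way to compute the coefficient of determination is 1 minus the ratio of unexplained variation (sum of squared residuals) to total variation (sum of squared deviations from the mean of y). The sketch below fits a least-squares line and applies that definition. The function name `r_squared` and the data are assumptions for illustration.

```python
def r_squared(x, y):
    """Fit a least-squares line to (x, y) and return its R-squared."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b1 = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
          / sum((a - x_bar) ** 2 for a in x))
    b0 = y_bar - b1 * x_bar
    ss_residual = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))  # unexplained
    ss_total = sum((b - y_bar) ** 2 for b in y)                        # total
    return 1 - ss_residual / ss_total

# Hypothetical paired data
x_values = [1.0, 2.0, 3.0, 4.0, 5.0]
y_values = [2.0, 4.0, 5.0, 4.0, 5.0]
r2 = r_squared(x_values, y_values)
print(round(r2, 3))  # 0.6
```

For this made-up data set, the line explains about 60% of the variation in y.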

Chapter 11: Goodness-of-Fit Tests

Chi-Square Goodness-of-Fit Test

Used to determine whether the observed frequencies in a sample are consistent with a claimed population distribution.

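Test Statistic:

\( \chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} \), with \( k - 1 \) degrees of freedom for \( k \) categories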

\( O_i \): observed frequency
\( E_i \): expected frequency

Example: Testing if a die is fair based on observed roll frequencies.
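The fair-die example can be sketched by summing, over the six faces, the squared difference between observed and expected counts divided by the expected count. The function name `chi_square_gof` and the roll counts are hypothetical. A fair die gives each face an expected count of (total rolls)/6.

```python
def chi_square_gof(observed, expected):
    """Return (chi-square, degrees of freedom) for a goodness-of-fit test."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1

# Hypothetical counts for faces 1-6 over 60 rolls
observed = [8, 12, 9, 11, 10, 10]
expected = [sum(observed) / 6] * 6  # 10 per face if the die is fair
chi2, df = chi_square_gof(observed, expected)
print(chi2, df)  # 1.0 5
```

A chi-square of 1.0 with 5 degrees of freedom is far below the 0.05 critical value of 11.070, so these made-up rolls give no evidence against fairness.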

Chapter 12: Analysis of Variance (ANOVA)

One-Way ANOVA

Used to test if three or more population means are equal.

Hypotheses:
- H0: All population means are equal.
- H1: At least one mean is different.
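Test Statistic (F):

\( F = \frac{MS_{\text{between}}}{MS_{\text{within}}} \), the variance between the sample means divided by the variance within the samples

If F is large enough that the P-value is at most \( \alpha \), reject H0.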

Example: Comparing average yields of three different crop varieties.
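As a sketch of the crop-yield example, the code below computes F as the mean square between groups divided by the mean square within groups. The function name `one_way_anova_f` and the yields are made up for illustration. The result would be compared with an F critical value using the two degrees of freedom returned.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df between, df within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, k - 1, n_total - k

# Hypothetical yields for three crop varieties
yields = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]
f_stat, df_between, df_within = one_way_anova_f(yields)
print(f_stat, df_between, df_within)  # 12.0 2 6
```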

Summary Table: Key Tests and Their Purposes

Test	Purpose	Test Statistic
t-test	Test mean (one or two samples)	t
z-test	Test proportion (or mean when \( \sigma \) is known)	z
Chi-Square	Test variance or distribution fit	\( \chi^2 \)
F-test	Compare two variances or ANOVA	F
Correlation	Measure linear relationship	r
Regression	Predict values	\( \hat{y} = b_0 + b_1 x \)

Additional info: This guide covers the main objectives and statistical tests from Chapters 8–12, including hypothesis testing, inference from two samples, correlation and regression, goodness-of-fit, and ANOVA. For each test, ensure you understand the assumptions, how to calculate the test statistic, and how to interpret results.