Confidence Intervals and Hypothesis Testing: Study Notes for Statistics for Business

Chapter 8: Confidence Intervals

Statistical Inference

Statistical inference involves using data from a sample to make estimates or test hypotheses about a population parameter. This process is foundational in business statistics for making data-driven decisions.

Sample Proportions and Means: We use sample data to estimate population characteristics.
Random Sampling: Ensures that every subject in the population has an equal chance of being selected, which is crucial for valid inference.
Uncertainty: Even with random samples, there is always uncertainty in our estimates.

Point Estimate vs Interval Estimate

A point estimate is a single value used as an estimate of a population parameter, while an interval estimate provides a range of values that is likely to contain the parameter.

Point Estimate: Best guess for the parameter (e.g., sample mean or proportion).
Interval Estimate: Range of values, constructed to have a specified probability of containing the true parameter.
Example: If 220 out of 500 voters support a candidate, the sample proportion is $\hat{p} = 220/500 = 0.44$.

Properties of a Good Point Estimate

Unbiasedness: The sampling distribution of the estimate is centered at the true parameter.
Efficiency: The estimate has a small standard deviation compared to other estimators.
Example: The sample mean is more efficient than the sample median for normal distributions.
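
The efficiency claim above can be checked with a small simulation. This is a minimal sketch, not part of the original notes: the sample size (25), the number of repetitions (2000), the standard normal population, and the helper name `simulate_estimator_sd` are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

def simulate_estimator_sd(estimator, n=25, reps=2000):
    """Standard deviation of an estimator across repeated normal samples."""
    estimates = [
        estimator([random.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.stdev(estimates)

sd_mean = simulate_estimator_sd(statistics.mean)
sd_median = simulate_estimator_sd(statistics.median)

# The estimator with the smaller spread is the more efficient one.
print(f"SD of sample means:   {sd_mean:.3f}")
print(f"SD of sample medians: {sd_median:.3f}")
```

For normal data the sample means should cluster more tightly around the true center than the sample medians do, which is what "more efficient" means here.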

Confidence Interval

A confidence interval (CI) is an interval constructed from sample data so that, with a certain confidence level, it contains the true population parameter.

Structure: point estimate $\pm$ margin of error
Margin of Error: Accounts for sampling variability.
Interpretation: A 95% CI means that if we took 100 random samples and built a CI from each, about 95 of those intervals would contain the true parameter.
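
The repeated-sampling interpretation above can be illustrated by simulation. The sketch below assumes a hypothetical true proportion of 0.44, samples of size 500, and 1000 repetitions; none of these numbers come from the notes, and the function names are illustrative.

```python
import math
import random

random.seed(1)  # fixed seed so the run is reproducible

def wald_interval(successes, n, z_star=1.96):
    """95% interval: sample proportion plus or minus z* standard errors."""
    p_hat = successes / n
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def coverage(true_p=0.44, n=500, reps=1000):
    """Fraction of simulated intervals that contain the true proportion."""
    hits = 0
    for _ in range(reps):
        successes = sum(random.random() < true_p for _ in range(n))
        low, high = wald_interval(successes, n)
        hits += low <= true_p <= high
    return hits / reps

rate = coverage()
print(f"Share of intervals containing the true proportion: {rate:.3f}")
```

The printed share should land close to 0.95. The confidence level describes the long-run behavior of the method, not the probability that any single computed interval is correct.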

Constructing a Confidence Interval for a Proportion

Formula: $\hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n}$
Critical Value $z^*$: Chosen based on the desired confidence level (from the standard normal distribution).
Common values:
- 90% CI: $z^* = 1.645$
- 95% CI: $z^* = 1.96$
- 99% CI: $z^* = 2.576$
Example: For , , 95% CI:
- Standard error:
- CI:
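
As a worked illustration of the proportion interval, the sketch below reuses the voter example from earlier in these notes (220 supporters out of 500). The function name `proportion_ci` is an assumption; the critical value is taken from Python's standard-library `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * standard_error
    return p_hat - margin, p_hat + margin

low, high = proportion_ci(220, 500)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

With these inputs the standard error is about 0.022 and the interval runs from roughly 0.40 to 0.48, so the data are consistent with support being below 50%.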

Sample Size Requirements

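For the normal approximation to be valid, the sample must contain enough successes and failures: a common rule of thumb is $n\hat{p} \ge 15$ and $n(1-\hat{p}) \ge 15$ (some textbooks use 10 instead).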
With small sample sizes, CIs may not be reliable.
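
A quick check of the sample size requirement might look like the sketch below. The threshold of 15 successes and 15 failures is an assumption (textbooks differ, and some use 10), so it is exposed as a parameter; the function name is illustrative.

```python
def normal_approx_ok(successes, n, minimum=15):
    """True when both successes and failures reach the chosen threshold."""
    failures = n - successes
    return successes >= minimum and failures >= minimum

# Voter example from these notes: 220 of 500 support the candidate.
print(normal_approx_ok(220, 500))

# Ice cream example from these notes: 18 of 20 would buy (only 2 failures).
print(normal_approx_ok(18, 20))
```

The second case fails the check, which is the kind of situation where an adjusted interval is usually preferred.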

Effect of Confidence Level and Sample Size

Increasing confidence level widens the interval.
Increasing sample size narrows the interval (reduces standard error).
Choosing the confidence level should be done a priori (before data analysis).
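
The two effects above can be seen by computing the interval width directly. This sketch assumes a sample proportion of 0.44 and a few illustrative sample sizes; `ci_width` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def ci_width(p_hat, n, confidence):
    """Full width (upper minus lower bound) of a proportion interval."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z_star * math.sqrt(p_hat * (1 - p_hat) / n)

# Higher confidence with the same data gives a wider interval.
for confidence in (0.90, 0.95, 0.99):
    print(f"n=500, {confidence:.0%}: width = {ci_width(0.44, 500, confidence):.4f}")

# Larger samples at the same confidence give a narrower interval.
for n in (100, 500, 2000):
    print(f"n={n}, 95%: width = {ci_width(0.44, n, 0.95):.4f}")
```

Because the standard error shrinks with the square root of the sample size, quadrupling the sample size halves the width.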

Agresti-Coull Confidence Intervals for Proportions

For small samples, the Agresti-Coull method provides better coverage than the standard Wald interval.

Adjustment: Add 2 successes and 2 failures to the sample.
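Adjusted Proportion: $\tilde{p} = (x + 2)/(n + 4)$, where $x$ is the number of successes
Adjusted Sample Size: $\tilde{n} = n + 4$
CI Formula: $\tilde{p} \pm z^* \sqrt{\tilde{p}(1-\tilde{p})/\tilde{n}}$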
Example: 18 out of 20 customers would buy a new ice cream flavor:
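- Adjusted: 20 successes out of 24 trials, so the adjusted proportion is $20/24 \approx 0.833$
- CI (assuming a 95% level, $z^* = 1.96$): $0.833 \pm 1.96\sqrt{0.833(0.167)/24} \approx 0.833 \pm 0.149$, or about $(0.684, 0.982)$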
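
The add-2-successes, add-2-failures adjustment can be sketched in code using the ice cream example (18 of 20). The 95% level ($z^* = 1.96$) is an assumption, since the example does not state a confidence level, and the function names are illustrative.

```python
import math

def agresti_coull_ci(successes, n, z_star=1.96):
    """Interval after adding 2 successes and 2 failures to the sample."""
    n_adj = n + 4
    p_adj = (successes + 2) / n_adj
    margin = z_star * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # A proportion cannot leave [0, 1], so the bounds are clipped.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)

def wald_ci(successes, n, z_star=1.96):
    """Unadjusted interval, shown only for comparison."""
    p_hat = successes / n
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

ac_low, ac_high = agresti_coull_ci(18, 20)
wald_low, wald_high = wald_ci(18, 20)
print(f"Adjusted:   ({ac_low:.3f}, {ac_high:.3f})")
print(f"Unadjusted: ({wald_low:.3f}, {wald_high:.3f})")
```

For this sample the unadjusted upper bound exceeds 1, which is impossible for a proportion, while the adjusted interval stays within the valid range.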

Confidence Interval for a Population Mean

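Large Sample Size ($n \ge 30$): $\bar{x} \pm z^* \frac{s}{\sqrt{n}}$
Small Sample Size ($n < 30$): Use t-distribution: $\bar{x} \pm t^* \frac{s}{\sqrt{n}}$
Degrees of Freedom: $df = n - 1$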
Example (large n): , , , 95% CI:
Example (small n): , , , , (from t-table):
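
Both mean intervals share one structure: sample mean plus or minus a critical value times $s/\sqrt{n}$. The sketch below uses hypothetical numbers (mean 50, standard deviation 10), not the values from the examples above, and takes the critical value as an input so that it can come from a z-table or a t-table.

```python
import math

def mean_ci(sample_mean, sample_sd, n, critical_value):
    """Confidence interval for a mean given a z or t critical value."""
    margin = critical_value * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical large sample: n = 100, so z* = 1.96 for 95% confidence.
large_low, large_high = mean_ci(50.0, 10.0, 100, 1.96)
print(f"Large sample: ({large_low:.2f}, {large_high:.2f})")

# Hypothetical small sample: n = 10, df = 9, t* = 2.262 from a t-table.
small_low, small_high = mean_ci(50.0, 10.0, 10, 2.262)
print(f"Small sample: ({small_low:.2f}, {small_high:.2f})")
```

The small-sample interval is much wider, both because the standard error is larger and because the t critical value exceeds 1.96.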

t-Distribution

Used when estimating the mean with small samples and unknown population standard deviation.
Has thicker tails than the normal distribution, allowing for more variability.
As sample size increases, t-distribution approaches the normal distribution.
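
To see the t-distribution approach the normal, one can compare 95% critical values across degrees of freedom. Python's standard library has no t quantile function, so the sketch below integrates the t density numerically (Simpson's rule plus bisection). This is an illustrative approach with hypothetical helper names; in practice a t-table or a statistics package would be used.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, x], using symmetry about zero."""
    if x < 0:
        return 1 - t_cdf(-x, df, steps)
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence, df):
    """Two-sided critical value t* found by bisection on the CDF."""
    target = 1 - (1 - confidence) / 2
    low, high = 0.0, 100.0
    for _ in range(60):
        mid = (low + high) / 2
        if t_cdf(mid, df) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2

for df in (2, 5, 10, 30, 100):
    print(f"df = {df:3d}: t* = {t_critical(0.95, df):.3f}")
print(f"normal:   z* = {NormalDist().inv_cdf(0.975):.3f}")
```

The critical values fall toward 1.96 as the degrees of freedom grow, reflecting the thinner tails of the t-distribution in larger samples.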

Degrees of Freedom

For a sample of size $n$, degrees of freedom $df = n - 1$.
Represents the number of values in the calculation of a statistic that are free to vary.

Summary Table: Effects on Confidence Interval Width

Factor	Effect on CI Width
Increase confidence level	Wider interval
Increase sample size	Narrower interval
Decrease sample size	Wider interval
Decrease confidence level	Narrower interval

Chapter 9: Hypothesis Testing

Introduction to Hypothesis Testing

Hypothesis testing is a formal procedure for testing claims or ideas about a population using sample data. It is a cornerstone of inferential statistics in business decision-making.

Hypothesis: A statement about a population parameter (e.g., mean, proportion).
Null Hypothesis ($H_0$): The default claim to be tested (e.g., $H_0: p = p_0$).
Alternative Hypothesis ($H_a$): The claim we seek evidence for (e.g., $H_a: p \ne p_0$).

General Steps in Hypothesis Testing

Check Assumptions: The data must come from a random sample, the sample size must be large enough, and the observations must be independent.
State Hypotheses: Formulate $H_0$ and $H_a$ before analyzing data.
Calculate Test Statistic: Quantifies how far the sample statistic is from the hypothesized value.
Calculate p-value: Probability of observing a test statistic as extreme as, or more extreme than, the observed value, assuming $H_0$ is true.
State Conclusion: Compare p-value to significance level ($\alpha$) and decide whether to reject $H_0$.

Types of Hypotheses

One-sided test: $H_a: p > p_0$ or $H_a: p < p_0$
Two-sided test: $H_a: p \ne p_0$

Test Statistic for Proportions

Formula: $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$
Interpretation: Number of standard errors the sample proportion is from the hypothesized value.
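
A minimal sketch of the test statistic for a proportion follows. The sample (60 successes out of 100) and the null value 0.5 are hypothetical numbers chosen so the arithmetic is easy to follow; the function name is illustrative.

```python
import math

def z_statistic_proportion(successes, n, p0):
    """Standard errors between the sample proportion and the null value."""
    p_hat = successes / n
    # The standard error uses p0 because the test assumes H0 is true.
    standard_error_null = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error_null

z = z_statistic_proportion(60, 100, 0.5)
print(f"z = {z:.2f}")
```

Here the sample proportion 0.60 lies 0.10 above the null value, and the standard error under the null is 0.05, so the statistic is about 2 standard errors.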

p-value and Significance Level

p-value: Probability of obtaining a result at least as extreme as the one observed, assuming $H_0$ is true.
Significance Level ($\alpha$): Threshold for rejecting $H_0$ (commonly 0.05, 0.01, 0.10).
Decision Rule:
- If p-value $\le \alpha$, reject $H_0$.
- If p-value $> \alpha$, fail to reject $H_0$.
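
The decision rule can be written out as a short sketch. It assumes a one-sided "greater than" alternative and a hypothetical test statistic of 2.0; the function names are illustrative, and the normal tail area comes from the standard-library `statistics.NormalDist`.

```python
from statistics import NormalDist

def upper_tail_p_value(z):
    """P(Z >= z) under the standard normal, for a 'greater than' alternative."""
    return 1 - NormalDist().cdf(z)

def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when the p-value is at most alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

p_value = upper_tail_p_value(2.0)
print(f"p-value = {p_value:.4f} -> {decide(p_value)}")
```

A statistic of 2.0 leaves a tail area of about 0.023, which is below 0.05 but above 0.01, so the conclusion depends on the significance level chosen in advance.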

Example: Hypothesis Test for a Proportion

Question: Do more than 50% of USC students regularly study at the library?
Sample: ,
Hypotheses: $H_0: p = 0.5$, $H_a: p > 0.5$
Test Statistic:
p-value:
Conclusion: Since the p-value is less than $\alpha$, reject $H_0$. There is sufficient evidence that more than 50% of students regularly study at the library.

Two-Sided Tests

For two-sided tests, double the one-sided p-value.
Example: , p-value =
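
The doubling rule can be sketched as follows, using a hypothetical statistic of 2.0 rather than the values from the example above. The function name is an assumption.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Double the one-tail area beyond |z| under the standard normal."""
    one_sided = 1 - NormalDist().cdf(abs(z))
    return 2 * one_sided

print(f"z =  2.0: p-value = {two_sided_p_value(2.0):.4f}")
print(f"z = -2.0: p-value = {two_sided_p_value(-2.0):.4f}")
```

Taking the absolute value first means statistics of equal size in either direction give the same two-sided p-value.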

Relationship Between Confidence Intervals and Hypothesis Tests

If the hypothesized value is not in the CI, reject $H_0$ at the corresponding significance level (e.g., a 95% CI corresponds to a two-sided test with $\alpha = 0.05$).
If the hypothesized value is in the CI, fail to reject $H_0$.
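
The correspondence can be checked numerically for a mean. The sketch below uses hypothetical data (mean 52, standard deviation 10, n = 100) and compares a 95% interval with a two-sided z test at the 5% level. The function names are illustrative.

```python
import math

Z_STAR_95 = 1.96  # critical value for 95% confidence / two-sided alpha = 0.05

def mean_ci_95(sample_mean, sample_sd, n):
    """95% confidence interval for a mean using the normal critical value."""
    margin = Z_STAR_95 * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

def rejects_at_5_percent(sample_mean, sample_sd, n, mu0):
    """Two-sided z test: reject H0 when |z| exceeds the critical value."""
    z = (sample_mean - mu0) / (sample_sd / math.sqrt(n))
    return abs(z) > Z_STAR_95

low, high = mean_ci_95(52.0, 10.0, 100)
for mu0 in (50.0, 51.0, 54.0):
    outside = not (low <= mu0 <= high)
    rejected = rejects_at_5_percent(52.0, 10.0, 100, mu0)
    print(f"mu0 = {mu0}: outside CI = {outside}, test rejects = {rejected}")
```

For means the two decisions agree exactly because the interval and the test use the same standard error. For proportions the agreement is only approximate, since the test computes its standard error from the hypothesized value while the interval uses the sample proportion.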

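Test Statistic for Means (Large Sample or Known $\sigma$)

Formula: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ (with a large sample and unknown $\sigma$, substitute the sample standard deviation $s$)

Test Statistic for Means (Small Sample, Unknown $\sigma$)

Formula: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Use t-distribution with $df = n - 1$.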
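
A small-sample test for a mean might be sketched as below. The data summary (mean 52, standard deviation 10, n = 10, null value 50) is hypothetical, and the critical value 2.262 is the standard t-table entry for 9 degrees of freedom at a two-sided 5% level.

```python
import math

def t_statistic(sample_mean, sample_sd, n, mu0):
    """Standard errors between the sample mean and the hypothesized mean."""
    return (sample_mean - mu0) / (sample_sd / math.sqrt(n))

n = 10
t = t_statistic(52.0, 10.0, n, 50.0)
df = n - 1
t_star = 2.262  # t-table value for df = 9, two-sided alpha = 0.05

print(f"t = {t:.3f}, df = {df}, reject H0: {abs(t) > t_star}")
```

With only ten observations the same 2-unit difference that would be significant at n = 100 is well within sampling variability, so the test fails to reject.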

Summary Table: Hypothesis Test Steps

Step	Description
1. Assumptions	Check randomization, normality, independence
2. Hypotheses	State $H_0$ and $H_a$
3. Test Statistic	Calculate z or t value
4. p-value	Find probability of a statistic at least as extreme as the one observed, under $H_0$
5. Conclusion	Compare p-value to $\alpha$ and interpret

Key Points

Never "accept" the null hypothesis; only "fail to reject" it.
Always set significance level and hypotheses before analyzing data (a priori).
Low p-value: evidence against $H_0$; high p-value: insufficient evidence against $H_0$.