Confidence Intervals and Hypothesis Testing: Study Notes for Statistics for Business
Chapter 8: Confidence Intervals
Statistical Inference
Statistical inference involves using data from a sample to make estimates or test hypotheses about a population parameter. This process is foundational in business statistics for making data-driven decisions.
Sample Proportions and Means: We use sample data to estimate population characteristics.
Random Sampling: Ensures that every subject in the population has an equal chance of being selected, which is crucial for valid inference.
Uncertainty: Even with random samples, there is always uncertainty in our estimates.
Point Estimate vs Interval Estimate
A point estimate is a single value used as an estimate of a population parameter, while an interval estimate provides a range of values that is likely to contain the parameter.
Point Estimate: Best guess for the parameter (e.g., sample mean or proportion).
Interval Estimate: Range of values, constructed to have a specified probability of containing the true parameter.
Example: If 220 out of 500 voters support a candidate, the sample proportion is $\hat{p} = 220/500 = 0.44$.
Properties of a Good Point Estimate
Unbiasedness: The sampling distribution of the estimate is centered at the true parameter.
Efficiency: The estimate has a small standard deviation compared to other estimators.
Example: The sample mean is more efficient than the sample median for normal distributions.
Confidence Interval
A confidence interval (CI) is an interval constructed from sample data so that, with a certain confidence level, it contains the true population parameter.
Structure: point estimate $\pm$ margin of error
Margin of Error: Accounts for sampling variability.
Interpretation: A 95% CI means that if we took 100 random samples and built a CI from each, about 95 of those intervals would contain the true parameter.
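The repeated-sampling interpretation can be checked by simulation. The sketch below is illustrative only: it assumes a hypothetical true proportion of 0.44 and samples of size 500, builds a 95% interval of the form point estimate $\pm$ margin of error for a proportion from each sample, and counts how often the interval contains the true value. The function names and the seed are assumptions made for this example.

```python
import random
from statistics import NormalDist

def wald_interval(successes, n, confidence=0.95):
    """Interval for a proportion: p_hat +/- z* * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - margin, p_hat + margin

def coverage(true_p, n, n_samples, confidence=0.95, seed=0):
    """Share of simulated intervals that contain the true proportion."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(n_samples):
        # Draw one random sample of size n and count the successes.
        successes = sum(rng.random() < true_p for _ in range(n))
        low, high = wald_interval(successes, n, confidence)
        if low <= true_p <= high:
            hits += 1
    return hits / n_samples

# Hypothetical population value and sample size, chosen for illustration.
rate = coverage(true_p=0.44, n=500, n_samples=1000)
print(f"Share of intervals containing the true proportion: {rate:.3f}")
```

The printed share should land close to 0.95, though not exactly on it, because the simulation itself is subject to sampling variability.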
Constructing a Confidence Interval for a Proportion
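Formula: $\hat{p} \pm z^* \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
Critical Value $z^*$: Chosen based on the desired confidence level (from the standard normal distribution).
Common values:
90% CI: $z^* = 1.645$
95% CI: $z^* = 1.96$
99% CI: $z^* = 2.576$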
Example: For , , 95% CI:
Standard error:
CI:
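As a worked illustration of the proportion interval, the sketch below reuses the voter figures from earlier in these notes (220 supporters out of 500) at a 95% confidence level. The function name `proportion_ci` is an assumption for this example, and the critical value is taken from Python's standard-library normal distribution rather than a printed table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Return (p_hat, standard error, interval) for a sample proportion."""
    p_hat = successes / n
    # Two-sided critical value z* for the requested confidence level.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_error = sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * std_error
    return p_hat, std_error, (p_hat - margin, p_hat + margin)

p_hat, se, (low, high) = proportion_ci(220, 500)
print(f"p_hat = {p_hat:.3f}, SE = {se:.4f}, 95% CI = ({low:.3f}, {high:.3f})")
```

For these figures the standard error is about 0.022 and the interval is roughly 0.396 to 0.484.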
Sample Size Requirements
For normal approximation to be valid:
With small sample sizes, CIs may not be reliable.
Effect of Confidence Level and Sample Size
Increasing confidence level widens the interval.
Increasing sample size narrows the interval (reduces standard error).
Choosing the confidence level should be done a priori (before data analysis).
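Both effects can be seen numerically. The sketch below assumes a hypothetical sample proportion of 0.44 and prints the interval width (twice the margin of error) for several confidence levels and sample sizes; `ci_width` is a helper name made up for this example.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(p_hat, n, confidence):
    """Width of the proportion interval: 2 * z* * sqrt(p_hat * (1 - p_hat) / n)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z_star * sqrt(p_hat * (1 - p_hat) / n)

p_hat = 0.44  # hypothetical sample proportion

# Higher confidence level, same sample size: wider interval.
for confidence in (0.90, 0.95, 0.99):
    print(f"n = 500, {confidence:.0%} CI width: {ci_width(p_hat, 500, confidence):.4f}")

# Larger sample, same confidence level: narrower interval.
for n in (100, 500, 2000):
    print(f"n = {n:>4}, 95% CI width: {ci_width(p_hat, n, 0.95):.4f}")
```

Because the standard error shrinks with $\sqrt{n}$, quadrupling the sample size halves the width.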
Agresti-Coull Confidence Intervals for Proportions
For small samples, the Agresti-Coull method provides better coverage than the standard Wald interval.
Adjustment: Add 2 successes and 2 failures to the sample.
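Adjusted Proportion: $\tilde{p} = \dfrac{x + 2}{n + 4}$, where $x$ is the number of successes
Adjusted Sample Size: $\tilde{n} = n + 4$
CI Formula: $\tilde{p} \pm z^* \sqrt{\dfrac{\tilde{p}(1-\tilde{p})}{\tilde{n}}}$
Example: 18 out of 20 customers would buy a new ice cream flavor:
Adjusted: $\tilde{p} = \dfrac{18 + 2}{20 + 4} = \dfrac{20}{24} \approx 0.833$, $\tilde{n} = 24$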
CI:
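The ice cream example can be worked through in code. The notes do not state the confidence level for this example, so the sketch below assumes 95%; the function name `agresti_coull_ci` is also an assumption. It applies the adjustment described above: add 2 successes and 2 failures, then use the usual proportion formula on the adjusted values.

```python
from math import sqrt
from statistics import NormalDist

def agresti_coull_ci(successes, n, confidence=0.95):
    """Adjusted interval: add 2 successes and 2 failures before computing."""
    n_adj = n + 4                     # 2 extra successes + 2 extra failures
    p_adj = (successes + 2) / n_adj   # adjusted proportion
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj, n_adj, (p_adj - margin, p_adj + margin)

# 18 of 20 customers; the 95% level is an assumption for this illustration.
p_adj, n_adj, (low, high) = agresti_coull_ci(18, 20)
print(f"adjusted p = {p_adj:.3f}, adjusted n = {n_adj}, CI = ({low:.3f}, {high:.3f})")
```

Under the 95% assumption the interval is roughly 0.684 to 0.982. An unadjusted interval built from 18/20 = 0.90 would be centered higher, which is the point of the adjustment for small samples.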
Confidence Interval for a Population Mean
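Large Sample Size ($n \geq 30$): $\bar{x} \pm z^* \dfrac{s}{\sqrt{n}}$
Small Sample Size ($n < 30$): Use t-distribution: $\bar{x} \pm t^* \dfrac{s}{\sqrt{n}}$
Degrees of Freedom: $df = n - 1$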
Example (large n): , , , 95% CI:
Example (small n): , , , , (from t-table):
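A minimal sketch of both cases follows. All numbers are hypothetical (a sample mean of 50 and sample standard deviation of 10), and `mean_ci` is a name invented for this example. The t critical value is passed in by hand, as it would be read from a t-table: 2.064 is the 95% value for 24 degrees of freedom.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(x_bar, s, n, critical_value):
    """Interval for a mean: x_bar +/- critical value * s / sqrt(n)."""
    margin = critical_value * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# Large sample (hypothetical n = 100): z* from the standard normal distribution.
z_star = NormalDist().inv_cdf(0.975)
large = mean_ci(50.0, 10.0, 100, z_star)

# Small sample (hypothetical n = 25, df = 24): t* = 2.064 from a t-table.
small = mean_ci(50.0, 10.0, 25, 2.064)

print(f"large-sample 95% CI: ({large[0]:.2f}, {large[1]:.2f})")
print(f"small-sample 95% CI: ({small[0]:.3f}, {small[1]:.3f})")
```

The small-sample interval is wider for two reasons: the standard error is larger with fewer observations, and $t^*$ exceeds $z^*$.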
t-Distribution
Used when estimating the mean with small samples and unknown population standard deviation.
Has thicker tails than the normal distribution, allowing for more variability.
As sample size increases, t-distribution approaches the normal distribution.
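The convergence toward the normal distribution shows up in the critical values. The sketch below hard-codes a few 95% t critical values as they appear in standard t-tables (an assumption of this example, since Python's standard library has no t-distribution) and compares them with the normal critical value.

```python
from statistics import NormalDist

# 95% two-sided t critical values copied from a standard t-table.
T_STAR_95 = {5: 2.571, 10: 2.228, 30: 2.042, 100: 1.984}

z_star = NormalDist().inv_cdf(0.975)  # normal critical value, about 1.96

for df, t_star in T_STAR_95.items():
    print(f"df = {df:>3}: t* = {t_star:.3f}, excess over z* = {t_star - z_star:.3f}")
```

The excess over $z^*$ shrinks as the degrees of freedom grow, reflecting the thinner tails of the t-distribution at larger sample sizes.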
Degrees of Freedom
For a sample of size $n$, degrees of freedom $df = n - 1$.
Represents the number of values in the calculation of a statistic that are free to vary.
Summary Table: Effects on Confidence Interval Width
| Factor | Effect on CI Width |
|---|---|
| Increase confidence level | Wider interval |
| Increase sample size | Narrower interval |
| Decrease sample size | Wider interval |
| Decrease confidence level | Narrower interval |
Chapter 9: Hypothesis Testing
Introduction to Hypothesis Testing
Hypothesis testing is a formal procedure for testing claims or ideas about a population using sample data. It is a cornerstone of inferential statistics in business decision-making.
Hypothesis: A statement about a population parameter (e.g., mean, proportion).
Null Hypothesis ($H_0$): The default claim to be tested (e.g., $p = 0.5$).
Alternative Hypothesis ($H_a$): The claim we seek evidence for (e.g., $p > 0.5$).
General Steps in Hypothesis Testing
Check Assumptions: Data must be random, sample size large enough, independent observations.
State Hypotheses: Formulate $H_0$ and $H_a$ before analyzing data.
Calculate Test Statistic: Quantifies how far the sample statistic is from the hypothesized value.
Calculate p-value: Probability of observing a test statistic as extreme as, or more extreme than, the observed value under $H_0$.
State Conclusion: Compare p-value to significance level ($\alpha$) and decide whether to reject $H_0$.
Types of Hypotheses
One-sided test: $H_a: p > p_0$ or $H_a: p < p_0$
Two-sided test: $H_a: p \neq p_0$
Test Statistic for Proportions
Formula: $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$
Interpretation: Number of standard errors the sample proportion is from the hypothesized value.
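A minimal sketch of this calculation is below. It reuses the voter sample proportion from earlier in the notes (0.44 from 500 voters) and tests it against a hypothesized value of 0.50, which is an assumption made for this illustration. Note that the standard error uses the hypothesized proportion, not the sample proportion, because the test statistic is computed as if the null hypothesis were true.

```python
from math import sqrt

def z_statistic_proportion(p_hat, p0, n):
    """Number of null standard errors between p_hat and the hypothesized p0."""
    std_error_null = sqrt(p0 * (1 - p0) / n)  # standard error under the null
    return (p_hat - p0) / std_error_null

# Hypothesized value 0.50 is an assumption for this example.
z = z_statistic_proportion(p_hat=0.44, p0=0.50, n=500)
print(f"z = {z:.3f}")
```

Here the sample proportion sits about 2.68 standard errors below the hypothesized value.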
p-value and Significance Level
p-value: Probability of obtaining a result at least as extreme as the one observed, assuming $H_0$ is true.
Significance Level ($\alpha$): Threshold for rejecting $H_0$ (commonly 0.05, 0.01, 0.10).
Decision Rule:
If p-value $\leq \alpha$, reject $H_0$.
If p-value $> \alpha$, fail to reject $H_0$.
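The decision rule (reject when the p-value is at or below the significance level, otherwise fail to reject) can be sketched as follows. The function names, the `alternative` labels, and the z value of 1.8 are all assumptions chosen for illustration; the p-value is computed from a z statistic using the standard normal distribution.

```python
from statistics import NormalDist

def p_value_from_z(z, alternative):
    """p-value for a z statistic under a one- or two-sided alternative."""
    std_normal = NormalDist()
    if alternative == "greater":      # H_a: parameter > hypothesized value
        return 1 - std_normal.cdf(z)
    if alternative == "less":         # H_a: parameter < hypothesized value
        return std_normal.cdf(z)
    if alternative == "two-sided":    # H_a: parameter != hypothesized value
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

def decide(p_value, alpha=0.05):
    """Apply the decision rule at significance level alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

z = 1.8  # hypothetical test statistic
for alternative in ("greater", "two-sided"):
    p = p_value_from_z(z, alternative)
    print(f"{alternative}: p-value = {p:.4f} -> {decide(p)}")
```

With the same statistic, the one-sided test rejects at the 0.05 level while the two-sided test does not, which is why the form of the alternative has to be fixed before looking at the data.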
Example: Hypothesis Test for a Proportion
Question: Do more than 50% of USC students regularly study at the library?
Sample: ,
Hypotheses: $H_0: p = 0.5$, $H_a: p > 0.5$
Test Statistic:
p-value:
Conclusion: Since the p-value is below $\alpha$, reject $H_0$. There is sufficient evidence that more than 50% of students study at the library.
Two-Sided Tests
For two-sided tests, double the one-sided p-value.
Example: , p-value =
Relationship Between Confidence Intervals and Hypothesis Tests
If the hypothesized value is not in the CI, reject $H_0$ at the corresponding significance level.
If the hypothesized value is in the CI, fail to reject $H_0$.
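Test Statistic for Means (Large Sample or Known $\sigma$)
Formula: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ (with $s$ in place of $\sigma$ when $\sigma$ is unknown and the sample is large)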
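The sketch below computes a z statistic for a mean and, as a side check, illustrates the link between confidence intervals and two-sided tests described earlier in these notes. All numbers are hypothetical (sample mean 52, hypothesized mean 50, known standard deviation 8, sample size 64), and the function names are assumptions for this example.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(x_bar, mu0, sigma, n):
    """z statistic and two-sided p-value for a mean with known sigma."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided

def z_interval_mean(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for a mean with known sigma."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical inputs chosen for illustration.
x_bar, mu0, sigma, n = 52.0, 50.0, 8.0, 64

z, p = z_test_mean(x_bar, mu0, sigma, n)
low, high = z_interval_mean(x_bar, sigma, n)

rejects = p <= 0.05                   # two-sided test at alpha = 0.05
outside = not (low <= mu0 <= high)    # is mu0 outside the 95% CI?

print(f"z = {z:.2f}, two-sided p-value = {p:.4f}")
print(f"95% CI = ({low:.2f}, {high:.2f})")
print(f"test rejects: {rejects}, hypothesized mean outside CI: {outside}")
```

The two checks agree because the 95% interval and the two-sided test at the 0.05 level use the same standard error and the same critical value.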
Test Statistic for Means (Small Sample, Unknown $\sigma$)
Formula: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Use t-distribution with $df = n - 1$.
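A minimal sketch of a small-sample t test follows. The ten data values and the hypothesized mean of 12.0 are made up for illustration. Since Python's standard library has no t-distribution, the sketch does not compute a p-value; it instead compares the statistic with a critical value read from a t-table (2.262 for a two-sided test at the 0.05 level with 9 degrees of freedom), which leads to the same reject or fail-to-reject decision.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(sample, mu0):
    """Return (t, df) for a one-sample t test of the mean against mu0."""
    n = len(sample)
    std_error = stdev(sample) / sqrt(n)  # stdev() uses the n - 1 denominator
    return (mean(sample) - mu0) / std_error, n - 1

# Hypothetical measurements, invented for this example.
sample = [12.1, 11.8, 12.4, 12.0, 12.3, 11.9, 12.2, 12.5, 12.1, 12.2]

t, df = t_statistic(sample, mu0=12.0)

T_CRIT_95_DF9 = 2.262  # two-sided 0.05 critical value for df = 9, from a t-table
reject = abs(t) > T_CRIT_95_DF9

print(f"t = {t:.3f}, df = {df}")
print("reject H0" if reject else "fail to reject H0")
```

For these values the statistic is about 2.18, just short of the critical value, so the test fails to reject at the 0.05 level.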
Summary Table: Hypothesis Test Steps
| Step | Description |
|---|---|
| 1. Assumptions | Check randomization, normality, independence |
| 2. Hypotheses | State $H_0$ and $H_a$ |
| 3. Test Statistic | Calculate z or t value |
| 4. p-value | Find probability of a statistic at least as extreme as the one observed, under $H_0$ |
| 5. Conclusion | Compare p-value to $\alpha$ and interpret |
Key Points
Never "accept" the null hypothesis; only "fail to reject" it.
Always set significance level and hypotheses before analyzing data (a priori).
Low p-value: evidence against $H_0$; high p-value: insufficient evidence against $H_0$.