MTH161: Review of Statistics I – Core Concepts and Applications
Study Guide - Smart Notes
Statistics and Critical Thinking
Sampling Variability and Systematic Bias
Understanding the sources of error and bias in data collection is essential for drawing valid statistical conclusions. Sampling variability refers to the natural differences that arise when different samples are drawn from the same population. Systematic bias occurs when the sampling process consistently favors certain outcomes.
Sampling Variability: The tendency for sample statistics to differ from one sample to another.
Systematic Bias: Consistent, repeatable error associated with faulty sampling methods.
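Sampling variability is easy to see in a small simulation. The sketch below is only an illustration: the population (10,000 roughly normal values with mean 50 and standard deviation 10), the sample size of 30, and the seed are all made-up assumptions, not values from the course materials.

```python
import random
import statistics

random.seed(161)  # fixed seed so the illustration is repeatable

# Hypothetical population: 10,000 values, roughly normal with mean 50 and SD 10.
population = [random.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)  # a parameter

# Draw five simple random samples of size 30 and record each sample mean.
sample_means = [
    statistics.mean(random.sample(population, 30)) for _ in range(5)
]  # five statistics

print(f"population mean: {population_mean:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"sample {i} mean: {m:.2f}")
```

Each sample mean lands near the population mean but none of them match it or each other exactly. That scatter is sampling variability, and it is present even when the sampling method is sound.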
Examples of Bias
Incorrect Conclusions: Drawing conclusions not supported by data.
Natural Bias: Bias inherent in the population or process.
Nonresponse Bias: When certain groups are less likely to respond to surveys.
Interviewer Bias: Influence of the interviewer on responses.
Volunteer Bias: When participants self-select into the sample.
Example: High school students asked to report on how often they consume alcoholic drinks may underreport due to social desirability, introducing bias.
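To see how systematic bias differs from sampling variability, the following sketch compares a simple random sample with a self-selected (volunteer) sample. Everything here is hypothetical: the population of weekly exercise hours, the response rule in `volunteer_sample`, and the sample sizes are assumptions chosen only to make the effect visible.

```python
import random
import statistics

random.seed(7)  # fixed seed so the illustration is repeatable

# Hypothetical population: weekly exercise hours for 20,000 people.
population = [max(0.0, random.gauss(4, 2)) for _ in range(20_000)]
true_mean = statistics.mean(population)

def volunteer_sample(pop, n):
    """Self-selected sample: people with larger values are more likely to respond."""
    chosen = []
    while len(chosen) < n:
        person = random.choice(pop)
        # Assumed response rule: probability of responding grows with the value.
        if random.random() < min(1.0, person / 10):
            chosen.append(person)
    return chosen

random_mean = statistics.mean(random.sample(population, 2_000))
biased_mean = statistics.mean(volunteer_sample(population, 2_000))

print(f"true mean:             {true_mean:.2f}")
print(f"simple random sample:  {random_mean:.2f}")
print(f"volunteer sample:      {biased_mean:.2f}")
```

The random sample misses the true mean only by chance, while the volunteer sample overshoots it consistently. Taking a larger volunteer sample would not remove that error.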
Parameters and Statistics
Definitions and Distinctions
It is important to distinguish between parameters and statistics:
Parameter: A numerical summary of a population (e.g., population mean μ).
Statistic: A numerical summary of a sample (e.g., sample mean x̄).
Example: If 98% of students sampled think MCC should have a spring break, this is a statistic. If the mean salary of all teachers is $52,134, this is a parameter.
The Standard Normal Distribution
Key Concepts and Properties
The standard normal distribution (SND) is a special case of the normal distribution with mean 0 and standard deviation 1. It is used to standardize scores and calculate probabilities.
All normal distributions can be converted to the SND using z-scores.
The SND is symmetric about the mean (0).
The total area under the curve is 1.
There is a direct correspondence between area and probability.
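Formula for z-score: $z = \frac{x - \mu}{\sigma}$, where $x$ is the raw value, $\mu$ is the mean, and $\sigma$ is the standard deviation.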
Example: The area between -1 and 1 standard deviations from the mean is approximately 68%.
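A short sketch of standardizing a value and checking the 68% figure, using Python's built-in `statistics.NormalDist`. The exam score, mean, and standard deviation are hypothetical numbers chosen for illustration.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize x: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma

standard_normal = NormalDist(mu=0, sigma=1)

# Hypothetical exam score of 85 when the mean is 70 and the SD is 10.
z = z_score(85, 70, 10)

# Area under the standard normal curve between z = -1 and z = 1.
area_within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

print(f"z = {z}")                                   # z = 1.5
print(f"P(-1 < Z < 1) = {area_within_one_sd:.4f}")  # 0.6827
```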
Using the Standard Normal Table
To find probabilities for normal distributions, use the z-table (Table A-2):
Find the area to the left of a given z-score.
For the area to the right, subtract the left area from 1.
| z | 0.00 | 0.01 | 0.02 |
|---|---|---|---|
| -1.0 | 0.1587 | 0.1562 | 0.1539 |
| 0.0 | 0.5000 | 0.5040 | 0.5080 |
| 1.0 | 0.8413 | 0.8438 | 0.8461 |
| 2.0 | 0.9772 | 0.9778 | 0.9783 |
Additional info: Table values are cumulative probabilities from the left.
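The table lookups can also be done in software. This is a minimal sketch using `statistics.NormalDist`, whose `cdf` method returns the same left-tail area a z-table lists. The helper names `area_left`, `area_right`, and `area_between` are made up for this example.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, SD 1

def area_left(z):
    """Cumulative area to the left of z (what the z-table lists)."""
    return Z.cdf(z)

def area_right(z):
    """Area to the right of z: 1 minus the left area."""
    return 1 - Z.cdf(z)

def area_between(z_low, z_high):
    """Area between two z-scores: difference of the two left areas."""
    return Z.cdf(z_high) - Z.cdf(z_low)

print(round(area_left(1.00), 4))             # 0.8413
print(round(area_right(1.00), 4))            # 0.1587
print(round(area_between(-1.00, 2.00), 4))   # 0.8186
```

Subtracting rounded table values by hand (0.9772 - 0.1587) gives 0.8185 for the last result. The small difference is rounding in the table, not an error.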
Estimating a Population Proportion
Binomial Probability of Success
Estimating a population proportion involves using sample data to infer the proportion of successes in the population. The binomial distribution is often used when there are two possible outcomes (success/failure).
Sample observations must be a simple random sample.
Sample size should be large enough for normal approximation: $np \ge 5$ and $nq \ge 5$ (some textbooks require at least 10 instead of 5).
$p$ = population proportion, $q = 1 - p$.
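The sample-size requirement exists because the normal curve only approximates binomial probabilities well when both expected counts are reasonably large. The sketch below compares an exact binomial probability with its normal approximation. The values `n = 50`, `p = 0.4`, the cutoff of 22, and the use of a continuity correction (adding 0.5) are illustrative assumptions, not part of the original notes.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for a binomial random variable."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k), with a continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return NormalDist(mu, sigma).cdf(k + 0.5)

n, p = 50, 0.4            # hypothetical values: np = 20 and nq = 30
exact = binomial_cdf(22, n, p)
approx = normal_approx_cdf(22, n, p)

print(f"exact binomial:       {exact:.4f}")
print(f"normal approximation: {approx:.4f}")
```

With expected counts this large the two answers agree closely. Rerunning with a small `n` or a `p` near 0 or 1 shows the approximation degrade.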
Point Estimate and Confidence Interval
Point Estimate: The sample proportion $\hat{p} = x/n$ is the best point estimate of the population proportion $p$.
Confidence Interval: Range of values likely to contain the population proportion.
Confidence Level: The proportion of intervals, over many repeated samples, that would contain the true parameter (for example, 0.95). It equals $1 - \alpha$.
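Margin of Error Formula: $E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$, where $\hat{q} = 1 - \hat{p}$.
Procedure to Construct a Confidence Interval for $p$:
Verify requirements (random sample, $np \ge 5$, $nq \ge 5$).
Find the critical value $z_{\alpha/2}$.
Calculate the point estimate $\hat{p}$ and margin of error $E$.
Construct the confidence interval: $\hat{p} - E < p < \hat{p} + E$.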
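The procedure can be sketched in a few lines of Python. This is one possible implementation, assuming the usual normal-approximation interval and a requirement of at least 5 successes and 5 failures. The function name and the survey counts (520 "yes" out of 1,000) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Return (p_hat, margin_of_error, lower, upper) for a population proportion."""
    failures = n - successes
    # Assumed requirement: at least 5 successes and at least 5 failures.
    if successes < 5 or failures < 5:
        raise ValueError("normal approximation requirement not met")
    p_hat = successes / n
    q_hat = 1 - p_hat
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # critical value z_{alpha/2}
    margin = z_crit * sqrt(p_hat * q_hat / n)      # margin of error E
    return p_hat, margin, p_hat - margin, p_hat + margin

# Hypothetical survey: 520 of 1,000 respondents answer "yes".
p_hat, E, lower, upper = proportion_confidence_interval(520, 1000)
print(f"p_hat = {p_hat:.3f}, E = {E:.3f}, CI = ({lower:.3f}, {upper:.3f})")
# p_hat = 0.520, E = 0.031, CI = (0.489, 0.551)
```

Because the interval includes 0.5, this hypothetical sample would not be strong evidence that a majority answers "yes".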
Determining Sample Size
Sample Size for Estimating Proportions
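To achieve a desired margin of error $E$ at a given confidence level, the required sample size is: $n = \frac{(z_{\alpha/2})^2\,\hat{p}\hat{q}}{E^2}$, rounded up to the next whole number.
Additional info: If no prior estimate for $\hat{p}$ is available, use $\hat{p} = \hat{q} = 0.5$ for maximum variability, which gives $n = \frac{(z_{\alpha/2})^2 \cdot 0.25}{E^2}$.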
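A minimal sketch of the sample-size calculation, assuming the standard formula $n = (z_{\alpha/2})^2\,\hat{p}\hat{q}/E^2$ and rounding up. The function name and the example inputs (a 3-point margin of error, a prior estimate of 0.2) are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence=0.95, p_hat=None):
    """Smallest n giving the desired margin of error for a proportion."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    if p_hat is None:
        p_hat = 0.5  # no prior estimate: use the most conservative value
    # Always round up: rounding down would miss the target margin.
    return ceil(z_crit**2 * p_hat * (1 - p_hat) / margin**2)

print(sample_size_for_proportion(0.03))             # 1068
print(sample_size_for_proportion(0.03, p_hat=0.2))  # 683
```

A prior estimate away from 0.5 lowers the required sample size, which is why 0.5 is the safe default when nothing is known.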
Basics of Hypothesis Testing
Null and Alternative Hypotheses
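Null Hypothesis ($H_0$): The statement being tested, usually a statement of no effect or status quo.
Alternative Hypothesis ($H_1$): The statement we are trying to find evidence for.
Test Statistic for Proportions: $z = \frac{\hat{p} - p}{\sqrt{pq/n}}$, where $p$ is the proportion claimed in $H_0$ and $q = 1 - p$.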
Decisions are based on the p-value or comparison to critical values.
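The sketch below shows one way to compute the test statistic and p-value for a one-proportion z-test. The function name, the `tail` argument, and the data (540 successes in 1,000 trials, tested against a claimed proportion of 0.5) are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, tail="two-sided"):
    """Return (z, p_value) for testing H0: p = p0."""
    p_hat = successes / n
    # The standard error uses p0, the value assumed true under H0.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    Z = NormalDist()
    if tail == "left":
        p_value = Z.cdf(z)
    elif tail == "right":
        p_value = 1 - Z.cdf(z)
    else:
        p_value = 2 * (1 - Z.cdf(abs(z)))
    return z, p_value

# Hypothetical right-tailed test: H0: p = 0.5 versus H1: p > 0.5.
z, p_value = one_proportion_z_test(540, 1000, 0.5, tail="right")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")  # z = 2.53, p-value = 0.0057
```

With a significance level of 0.05, a p-value this small leads to rejecting the null hypothesis. The critical-value method reaches the same decision because 2.53 exceeds the right-tail critical value of about 1.645.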
Types of Errors
Type I Error: Rejecting $H_0$ when it is true. Its probability is the significance level α.
Type II Error: Failing to reject $H_0$ when it is false. Its probability is β.
Power of a Test: The probability of correctly rejecting a false null hypothesis (1 - β).
Calculating Beta and Sample Size for Hypothesis Tests
Beta (Type II Error Probability)
Beta is the probability of failing to reject the null hypothesis when the alternative is true. Calculating beta involves:
Finding the critical region for the test statistic under $H_0$.
Calculating the probability under the alternative hypothesis that the test statistic falls in the non-rejection region.
Sample Size for Hypothesis Tests: The sample size needed to achieve a desired power (1 - β) at a given significance level can be found using statistical software or formulas.
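The two steps for beta can be sketched for a right-tailed test of a proportion. This is an illustration under stated assumptions: the normal approximation is used throughout, the test is right-tailed, and the specific values (null proportion 0.5, alternative 0.55, sample size 400) are hypothetical. The sample-size helper simply searches for the smallest `n` that reaches the target power, rather than using a closed-form formula.

```python
from math import sqrt
from statistics import NormalDist

def beta_right_tailed(p0, p1, n, alpha=0.05):
    """P(fail to reject H0: p = p0) when the true proportion is p1 > p0."""
    Z = NormalDist()
    z_alpha = Z.inv_cdf(1 - alpha)
    # Step 1: boundary of the rejection region for p_hat, computed under H0.
    cutoff = p0 + z_alpha * sqrt(p0 * (1 - p0) / n)
    # Step 2: probability that p_hat falls below that boundary when p = p1.
    se_alt = sqrt(p1 * (1 - p1) / n)
    return Z.cdf((cutoff - p1) / se_alt)

def sample_size_for_power(p0, p1, target_power=0.80, alpha=0.05):
    """Smallest n whose power (1 - beta) reaches the target, found by search."""
    n = 1
    while 1 - beta_right_tailed(p0, p1, n, alpha) < target_power:
        n += 1
    return n

beta = beta_right_tailed(0.5, 0.55, 400)
print(f"beta = {beta:.3f}, power = {1 - beta:.3f}")
print("n for 80% power:", sample_size_for_power(0.5, 0.55))
```

Beta shrinks as the sample size grows or as the true proportion moves farther from the null value, so power rises in both cases.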
Summary Table: Key Formulas
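| Concept | Formula |
|---|---|
| z-score | $z = \frac{x - \mu}{\sigma}$ |
| Margin of Error | $E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$ |
| Confidence Interval | $\hat{p} - E < p < \hat{p} + E$ |
| Sample Size | $n = \frac{(z_{\alpha/2})^2\,\hat{p}\hat{q}}{E^2}$ |
| Test Statistic (Proportion) | $z = \frac{\hat{p} - p}{\sqrt{pq/n}}$ |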
Applications and Examples
Using StatCrunch or statistical tables to find probabilities, confidence intervals, and perform hypothesis tests.
Interpreting results in the context of real-world scenarios (e.g., survey data, product testing, public opinion).
Additional info: These notes cover core concepts from Ch. 1, 2, 6, 7, 8, and 17 of a typical introductory statistics course, focusing on data collection, normal distributions, estimation, and hypothesis testing for proportions.