MTH161: Review of Statistics I – Core Concepts and Applications

Study Guide - Smart Notes

Statistics and Critical Thinking

Sampling Variability and Systematic Bias

Understanding the sources of error and bias in data collection is essential for drawing valid statistical conclusions. Sampling variability refers to the natural differences that arise when different samples are drawn from the same population. Systematic bias occurs when the sampling process consistently favors certain outcomes.

Sampling Variability: The tendency for sample statistics to differ from one sample to another.
Systematic Bias: Consistent, repeatable error associated with faulty sampling methods.
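
Sampling variability can be seen in a small simulation. The sketch below is illustrative only: the population (10,000 values from a normal model with mean 50 and standard deviation 10), the sample size of 40, and the seed are all assumptions made up for this example, not data from the course.

```python
import random
from statistics import mean

random.seed(161)  # arbitrary fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 values from a normal model (assumed for illustration).
population = [random.gauss(50, 10) for _ in range(10_000)]
population_mean = mean(population)  # one fixed number describing the whole population

# Draw five simple random samples of size 40 and record each sample mean.
sample_means = [mean(random.sample(population, 40)) for _ in range(5)]

print(f"population mean: {population_mean:.2f}")
for i, m in enumerate(sample_means, start=1):
    # Each sample mean differs a little from the others: sampling variability.
    print(f"sample {i} mean: {m:.2f}")
```

The sample means scatter around the population mean without favoring one direction. A systematic bias, by contrast, would push every sample the same way, and drawing more samples would not remove it.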

Examples of Bias

Incorrect Conclusions: Drawing conclusions not supported by data.
Natural Bias: Bias inherent in the population or process.
Nonresponse Bias: When certain groups are less likely to respond to surveys.
Interviewer Bias: Influence of the interviewer on responses.
Volunteer Bias: When participants self-select into the sample.

Example: High school students asked to report on how often they consume alcoholic drinks may underreport due to social desirability, introducing bias.

Parameters and Statistics

Definitions and Distinctions

It is important to distinguish between parameters and statistics:

Parameter: A numerical summary of a population (e.g., population mean μ).
Statistic: A numerical summary of a sample (e.g., sample mean x̄).

Example: If 98% of students sampled think MCC should have a spring break, this is a statistic. If the mean salary of all teachers is $52,134, this is a parameter.

The Standard Normal Distribution

Key Concepts and Properties

The standard normal distribution (SND) is a special case of the normal distribution with mean 0 and standard deviation 1. It is used to standardize scores and calculate probabilities.

All normal distributions can be converted to the SND using z-scores.
The SND is symmetric about the mean (0).
The total area under the curve is 1.
There is a direct correspondence between area and probability.

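Formula for z-score: $z = \frac{x - \mu}{\sigma}$, where $x$ is the observed value, $\mu$ is the population mean, and $\sigma$ is the population standard deviation.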

Example: The area between -1 and 1 standard deviations from the mean is approximately 68%.
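
As a rough check of the 68% figure, the sketch below standardizes a value with $z = (x - \mu)/\sigma$ and computes the area between $z = -1$ and $z = 1$ using Python's built-in `statistics.NormalDist`. The exam-score numbers (mean 70, standard deviation 8) and the helper name `z_score` are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize x: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma

snd = NormalDist(mu=0, sigma=1)  # the standard normal distribution

# Hypothetical exam scores with mean 70 and standard deviation 8.
z = z_score(78, mu=70, sigma=8)  # 78 is one standard deviation above the mean

# Area between z = -1 and z = 1 is (area left of 1) minus (area left of -1).
area_within_one_sd = snd.cdf(1) - snd.cdf(-1)

print(z)
print(round(area_within_one_sd, 4))
```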

Using the Standard Normal Table

To find probabilities for normal distributions, use the z-table (Table A-2):

Find the area to the left of a given z-score.
For the area to the right, subtract the left area from 1.

z	0.00	0.01	0.02
-1.0	0.1587	0.1562	0.1539
0.0	0.5000	0.5040	0.5080
1.0	0.8413	0.8438	0.8461
2.0	0.9772	0.9778	0.9783

Additional info: Table values are cumulative probabilities from the left.
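
The table lookups can also be reproduced in software. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the helper names `area_left` and `area_right` are made up for this example.

```python
from statistics import NormalDist

snd = NormalDist()  # mean 0 and standard deviation 1 by default

def area_left(z):
    """Cumulative area to the left of z (what the z-table reports)."""
    return snd.cdf(z)

def area_right(z):
    """Area to the right of z: subtract the left area from 1."""
    return 1 - snd.cdf(z)

print(round(area_left(1.00), 4))   # compare with the row 1.0, column 0.00 entry
print(round(area_right(1.00), 4))  # right-tail area for the same z
print(round(area_left(2.02), 4))   # compare with the row 2.0, column 0.02 entry
```

Because the curve is symmetric about 0, the area to the right of $z = 1$ equals the area to the left of $z = -1$.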

Estimating a Population Proportion

Binomial Probability of Success

Estimating a population proportion involves using sample data to infer the proportion of successes in the population. The binomial distribution is often used when there are two possible outcomes (success/failure).

Sample observations must be a simple random sample.
Sample size should be large enough for normal approximation: $np \ge 5$ and $nq \ge 5$ (some textbooks require 10 instead of 5).
$p$ = population proportion, $q = 1 - p$.

Point Estimate and Confidence Interval

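Point Estimate: The sample proportion $\hat{p} = x/n$ (where $x$ is the number of successes in a sample of size $n$) is the best point estimate of the population proportion $p$.
Confidence Interval: A range of values, computed from sample data, used to estimate the population proportion.
Confidence Level: The proportion of intervals, over many repeated samples, that would contain the true parameter (e.g., 95%). It is written $1 - \alpha$. It is not the probability that one particular computed interval contains the parameter.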

Margin of Error Formula: $E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$, where $\hat{q} = 1 - \hat{p}$.

Procedure to Construct a Confidence Interval for $p$:

Verify requirements (random sample, $np \ge 5$, $nq \ge 5$).
Find the critical value $z_{\alpha/2}$.
Calculate the point estimate $\hat{p}$ and margin of error $E$.
Construct the confidence interval: $\hat{p} - E < p < \hat{p} + E$.
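
The procedure above can be sketched in Python as follows. This is an illustrative sketch, not a prescribed method: the function name `proportion_confidence_interval`, the survey counts (540 "yes" answers out of 1000), and the use of 5 as the success/failure threshold are assumptions for this example. It uses the margin of error $E = z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(x, n, confidence=0.95):
    """Return (p_hat, E, lower, upper) for x successes in n trials."""
    p_hat = x / n          # point estimate of p
    q_hat = 1 - p_hat
    # Step 1: requirement check (threshold of 5 is an assumed convention).
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("sample too small for the normal approximation")
    # Step 2: critical value z_{alpha/2} leaves alpha/2 in each tail.
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 3: margin of error.
    margin = z_crit * sqrt(p_hat * q_hat / n)
    # Step 4: interval endpoints.
    return p_hat, margin, p_hat - margin, p_hat + margin

# Hypothetical survey: 540 of 1000 respondents answer "yes".
p_hat, margin, lower, upper = proportion_confidence_interval(540, 1000)
print(f"p_hat = {p_hat:.3f}, E = {margin:.4f}, CI = ({lower:.4f}, {upper:.4f})")
```

Software computes the critical value to more decimal places than a printed table (about 1.95996 rather than 1.96 for 95% confidence), so results may differ from hand calculations in the last digit.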

Determining Sample Size

Sample Size for Estimating Proportions

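To achieve a desired margin of error $E$ at a given confidence level, the required sample size is: $n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2}$

Additional info: If no prior estimate for $\hat{p}$ is available, use $\hat{p} = \hat{q} = 0.5$ for maximum variability, which gives $n = \frac{0.25\,z_{\alpha/2}^2}{E^2}$. Always round $n$ up to the next whole number.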
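
A short sketch of the sample size calculation, using $n = z_{\alpha/2}^2\,\hat{p}\hat{q}/E^2$ rounded up. The function name `sample_size_for_proportion` and the example inputs (a 3-percentage-point margin, a prior estimate of 0.2) are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence=0.95, p_hat=0.5):
    """Smallest n giving the desired margin of error for a proportion.

    p_hat defaults to 0.5, the most conservative choice when no
    prior estimate is available.
    """
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z_crit**2 * p_hat * (1 - p_hat) / margin**2
    return ceil(n)  # always round up so the margin is not exceeded

# No prior estimate: use p_hat = 0.5.
print(sample_size_for_proportion(0.03))

# With a (hypothetical) prior estimate of 0.2, fewer observations are needed.
print(sample_size_for_proportion(0.03, p_hat=0.2))
```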

Basics of Hypothesis Testing

Null and Alternative Hypotheses

Null Hypothesis ($H_0$): The statement being tested, usually a statement of no effect or status quo.
Alternative Hypothesis ($H_1$): The statement we are trying to find evidence for.

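Test Statistic for Proportions: $z = \frac{\hat{p} - p}{\sqrt{pq/n}}$, where $p$ is the proportion claimed in $H_0$ and $q = 1 - p$.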

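Decisions are based on the p-value or comparison to critical values: reject $H_0$ if the p-value is less than or equal to the significance level α, or equivalently if the test statistic falls in the critical region. Otherwise, fail to reject $H_0$.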
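
A minimal sketch of a one-proportion z-test with a p-value. The function name `one_proportion_z_test`, the parameter name `p0` (the proportion claimed in the null hypothesis), and the data (540 successes in 1000 trials, testing against 0.5) are assumptions for this example.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(x, n, p0, alternative="two-sided"):
    """Return (z, p_value) for testing the null hypothesis p = p0."""
    p_hat = x / n
    # The standard error uses p0, the value assumed true under the null.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    snd = NormalDist()
    if alternative == "greater":      # right-tailed test
        p_value = 1 - snd.cdf(z)
    elif alternative == "less":       # left-tailed test
        p_value = snd.cdf(z)
    elif alternative == "two-sided":  # both tails
        p_value = 2 * (1 - snd.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value

# Hypothetical data: 540 successes in 1000 trials; null p = 0.5, alternative p > 0.5.
z, p_value = one_proportion_z_test(540, 1000, 0.5, alternative="greater")
alpha = 0.05
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```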

Types of Errors

Type I Error: Rejecting $H_0$ when it is true. Its probability is α, the significance level.
Type II Error: Failing to reject $H_0$ when it is false. Its probability is β.

Power of a Test: The probability of correctly rejecting a false null hypothesis (1 - β).

Calculating Beta and Sample Size for Hypothesis Tests

Beta (Type II Error Probability)

Beta is the probability of failing to reject the null hypothesis when the alternative is true. Calculating beta involves:

Finding the critical region for the test statistic under $H_0$.
Calculating the probability under the alternative hypothesis that the test statistic falls in the non-rejection region.

Sample Size for Hypothesis Tests: The sample size needed to achieve a desired power (1 - β) at a given significance level can be found using statistical software or formulas.
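
The two steps for computing beta can be sketched for one specific case: a right-tailed test of a proportion using the normal approximation. Everything specific here is an assumption for illustration: the function name `beta_right_tailed`, the null value `p0 = 0.5`, the alternative value `p1 = 0.6`, and the sample sizes. Beta depends on which alternative value is assumed to be true.

```python
from math import sqrt
from statistics import NormalDist

def beta_right_tailed(p0, p1, n, alpha=0.05):
    """Beta for testing p = p0 against p > p0 when the true proportion is p1."""
    snd = NormalDist()
    z_alpha = snd.inv_cdf(1 - alpha)  # critical z for a right-tailed test
    # Step 1: boundary of the critical region, expressed as a sample proportion,
    # computed under the null hypothesis (standard error uses p0).
    p_hat_crit = p0 + z_alpha * sqrt(p0 * (1 - p0) / n)
    # Step 2: probability that the sample proportion falls below that boundary
    # (the non-rejection region) when the true proportion is p1.
    z_under_alt = (p_hat_crit - p1) / sqrt(p1 * (1 - p1) / n)
    return snd.cdf(z_under_alt)

beta = beta_right_tailed(p0=0.5, p1=0.6, n=100)
power = 1 - beta
print(f"n = 100: beta = {beta:.4f}, power = {power:.4f}")

# A larger sample lowers beta (raises power) for the same alternative.
beta_large = beta_right_tailed(p0=0.5, p1=0.6, n=400)
print(f"n = 400: beta = {beta_large:.4f}, power = {1 - beta_large:.4f}")
```

Trying successive values of `n` until the power reaches a target is one simple way to approximate the sample size a test needs.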

Summary Table: Key Formulas

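Concept	Formula
z-score	$z = \frac{x - \mu}{\sigma}$
Margin of Error	$E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$
Confidence Interval	$\hat{p} - E < p < \hat{p} + E$
Sample Size	$n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2}$
Test Statistic (Proportion)	$z = \frac{\hat{p} - p}{\sqrt{pq/n}}$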

Applications and Examples

Using StatCrunch or statistical tables to find probabilities, confidence intervals, and perform hypothesis tests.
Interpreting results in the context of real-world scenarios (e.g., survey data, product testing, public opinion).

Additional info: These notes cover core concepts from Ch. 1, 2, 6, 7, 8, and 17 of a typical introductory statistics course, focusing on data collection, normal distributions, estimation, and hypothesis testing for proportions.