Comprehensive Study Notes for Introductory Statistics (Chapters 1–7)
Chapter 1: Introduction to Statistics
Definition and Scope of Statistics
Statistics is the science of collecting, organizing, analyzing, and interpreting data to gain information and make decisions.
Key processes include: producing data, organizing data, and analyzing data.
Types of Data
Qualitative Data: Non-numeric, categorical data (e.g., marital status, eye color).
Quantitative Data: Numeric data (e.g., age, income, test scores).
Population vs. Sample
Population: The entire group of individuals or items of interest.
Sample: A subset of the population, selected for analysis.
Parameter vs. Statistic
Parameter: A numerical descriptive measure of a population (e.g., population mean μ).
Statistic: A numerical descriptive measure of a sample (e.g., sample mean x̄).
Chapter 2: Descriptive Statistics
Measures of Central Tendency
Mean: The arithmetic average.
Formula: $\bar{x} = \frac{\sum x}{n}$ (sample), $\mu = \frac{\sum x}{N}$ (population)
Median: The middle value when data are ordered.
Mode: The value that appears most frequently.
Weighted Average: Each value is multiplied by its weight, summed, and divided by the total weight.
Formula: $\bar{x} = \frac{\sum (x \cdot w)}{\sum w}$
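The measures of central tendency above can be checked with a short sketch. It uses Python's standard `statistics` module; the data values and weights are hypothetical and chosen only for illustration.

```python
import statistics

scores = [70, 85, 85, 90, 100]  # hypothetical data set

mean = statistics.mean(scores)      # sum of the values divided by their count
median = statistics.median(scores)  # middle value of the ordered data
mode = statistics.mode(scores)      # value that appears most frequently

# Weighted average: sum of (value * weight) divided by the total weight.
grades = [80, 90, 70]   # hypothetical category scores
weights = [5, 3, 2]     # hypothetical weights
weighted_avg = sum(x * w for x, w in zip(grades, weights)) / sum(weights)

print(mean, median, mode, weighted_avg)
```

The weights do not need to sum to 1, because the formula divides by the total weight.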
Measures of Variation
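Sum of Squares (SS): $SS = \sum (x - \bar{x})^2$ (use $\mu$ in place of $\bar{x}$ for a population)
Variance:
Population: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
Sample: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Standard Deviation:
Population: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
Sample: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$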
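As a worked check of the variation measures, the sketch below computes the sum of squares by hand and then divides by N (population) or n - 1 (sample). The data set is hypothetical.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set
n = len(data)
mean = sum(data) / n

# Sum of squares: squared deviations from the mean, added up.
ss = sum((x - mean) ** 2 for x in data)

pop_var = ss / n          # population variance divides by N
samp_var = ss / (n - 1)   # sample variance divides by n - 1
pop_sd = math.sqrt(pop_var)
samp_sd = math.sqrt(samp_var)

# The standard library gives the same results.
assert math.isclose(pop_var, statistics.pvariance(data))
assert math.isclose(samp_var, statistics.variance(data))

print(ss, pop_var, samp_var, pop_sd, samp_sd)
```

Dividing by n - 1 makes the sample variance slightly larger than the population formula would give for the same data.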
Empirical Rule and Chebyshev’s Theorem
Empirical Rule: For bell-shaped distributions, approximately:
68% of the data lie within 1 standard deviation of the mean
95% of the data lie within 2 standard deviations of the mean
99.7% of the data lie within 3 standard deviations of the mean
Chebyshev’s Theorem: For any distribution, at least $1 - \frac{1}{k^2}$ of the data falls within k standard deviations of the mean (k > 1).
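To see how Chebyshev's bound compares with the Empirical Rule, the following sketch evaluates the lower bound $1 - 1/k^2$ for a few values of k. The function name `chebyshev_bound` is a hypothetical helper, not something from the notes.

```python
def chebyshev_bound(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's Theorem requires k > 1")
    return 1 - 1 / k**2

for k in (2, 3):
    print(k, chebyshev_bound(k))
```

For k = 2 the bound is 75%, well below the Empirical Rule's 95%, because Chebyshev's Theorem makes no assumption about the shape of the distribution.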
Z-Score
Formula: $z = \frac{x - \mu}{\sigma}$
Indicates how many standard deviations a value is from the mean.
Chapter 3: Probability
Basic Probability Concepts
Probability (P): A measure of the likelihood that an event will occur, expressed as a number between 0 and 1, inclusive.
Complement of an Event: The set of all outcomes in which the event does not occur, denoted $E'$.
Formula: $P(E') = 1 - P(E)$
Rules of Probability
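Multiplication Rule (Independent Events): $P(A \text{ and } B) = P(A) \cdot P(B)$
Multiplication Rule (Dependent Events): $P(A \text{ and } B) = P(A) \cdot P(B \mid A)$
Addition Rule (Mutually Exclusive): $P(A \text{ or } B) = P(A) + P(B)$
Addition Rule (Not Mutually Exclusive): $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$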
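The complement, multiplication, and addition rules can be illustrated with a standard 52-card deck. This is a minimal sketch: the card example is an assumption added for illustration, and `Fraction` is used so that the probabilities stay exact.

```python
from fractions import Fraction

p_king = Fraction(4, 52)
p_queen = Fraction(4, 52)
p_heart = Fraction(13, 52)
p_king_and_heart = Fraction(1, 52)  # the king of hearts

# Complement: P(not king) = 1 - P(king)
p_not_king = 1 - p_king

# Addition rule, mutually exclusive: a card cannot be both a king and a queen.
p_king_or_queen = p_king + p_queen

# Addition rule, not mutually exclusive: subtract the overlap.
p_king_or_heart = p_king + p_heart - p_king_and_heart

# Multiplication rule, independent: two draws with replacement.
p_two_kings_with_replacement = p_king * p_king

# Multiplication rule, dependent: two draws without replacement,
# so P(second king | first king) = 3/51.
p_two_kings_without_replacement = p_king * Fraction(3, 51)

print(p_not_king, p_king_or_queen, p_king_or_heart)
print(p_two_kings_with_replacement, p_two_kings_without_replacement)
```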
Chapter 4: Discrete Probability Distributions
Random Variables
Discrete Random Variable: Takes on countable values (e.g., number of babies).
Continuous Random Variable: Takes on any value in an interval (e.g., height, weight).
Discrete Probability Distribution
Each probability must be between 0 and 1.
The sum of all probabilities must equal 1.
Binomial Distribution
Characteristics:
Fixed number of trials (n)
Each trial has two outcomes: Success (S) or Failure (F)
Trials are independent
Probability of success (p) and failure (q = 1 - p) are constant
Mean: $\mu = np$
Standard Deviation: $\sigma = \sqrt{npq}$
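A small sketch of a binomial distribution follows. The notes do not state the binomial probability formula, so the standard form $P(x) = \binom{n}{x} p^x q^{n-x}$ is assumed here; the values n = 10 and p = 0.5 are hypothetical.

```python
import math

def binomial_pmf(x, n, p):
    """P(exactly x successes in n independent trials), standard binomial formula."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.5   # hypothetical: 10 trials, success probability 0.5
q = 1 - p

mean = n * p                 # mu = np
sd = math.sqrt(n * p * q)    # sigma = sqrt(npq)

# Full distribution: one probability for each x from 0 to n.
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

print(mean, sd, sum(probs))
```

Each entry of `probs` lies between 0 and 1 and the entries sum to 1 (up to floating-point rounding), matching the two requirements for a discrete probability distribution.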
Using Binomial Tables
Probabilities can be found for phrases like "at most," "at least," "no more than," etc.
Examples:
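At most c: $P(x \le c)$
At least c: $P(x \ge c) = 1 - P(x \le c - 1)$
Fewer than c: $P(x < c) = P(x \le c - 1)$
More than c: $P(x > c) = 1 - P(x \le c)$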
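When a table is not at hand, the same phrases can be translated into cumulative sums. The sketch below assumes the standard binomial probability formula and uses hypothetical values n = 10, p = 0.5, c = 3; `binomial_cdf` is an illustrative helper name.

```python
import math

def binomial_cdf(c, n, p):
    """P(x <= c) for a binomial random variable."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))

n, p, c = 10, 0.5, 3  # hypothetical values

at_most = binomial_cdf(c, n, p)            # P(x <= c)
at_least = 1 - binomial_cdf(c - 1, n, p)   # P(x >= c)
fewer_than = binomial_cdf(c - 1, n, p)     # P(x < c)
more_than = 1 - binomial_cdf(c, n, p)      # P(x > c)

print(at_most, at_least, fewer_than, more_than)
```

"At least c" and "fewer than c" are complements, as are "at most c" and "more than c", so each pair sums to 1.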
Chapter 5: Normal Probability Distributions
Normal Distribution
Symmetrical, bell-shaped curve
Area under the curve = 1
Mean, median, and mode are all at the center
Standard Normal Distribution
Mean ($\mu$) = 0, Standard deviation ($\sigma$) = 1
Standardized values are called z-scores
Converting Between X and Z
To find a raw score given a z-score: $x = \mu + z\sigma$
To find a z-score from x: $z = \frac{x - \mu}{\sigma}$
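The two conversions, $z = (x - \mu)/\sigma$ and $x = \mu + z\sigma$, can be sketched as follows. The mean of 100 and standard deviation of 15 are hypothetical, and `NormalDist` from the standard library stands in for a z-table.

```python
from statistics import NormalDist

mu, sigma = 100, 15  # hypothetical population mean and standard deviation

def to_z(x, mu, sigma):
    """Standardize a raw score."""
    return (x - mu) / sigma

def to_x(z, mu, sigma):
    """Convert a z-score back to a raw score."""
    return mu + z * sigma

z = to_z(130, mu, sigma)
x = to_x(-1.0, mu, sigma)

# Area to the left of z under the standard normal curve (mean 0, sd 1).
area_left = NormalDist().cdf(z)

print(z, x, area_left)
```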
Central Limit Theorem (CLT)
If the population is normal, the sampling distribution of the sample mean is normal for any sample size n.
If the population is not normal, the sampling distribution is approximately normal if $n \ge 30$.
Mean of sampling distribution: $\mu_{\bar{x}} = \mu$
Standard error: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
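As an illustration of the sampling distribution of the mean, this sketch computes the standard error $\sigma / \sqrt{n}$ and then the probability that a sample mean falls below a given value. All numbers are hypothetical.

```python
import math
from statistics import NormalDist

mu, sigma, n = 50, 12, 36  # hypothetical population values and sample size

standard_error = sigma / math.sqrt(n)  # sigma divided by the square root of n

# Sampling distribution of the sample mean: centered at mu,
# with spread equal to the standard error.
sampling_dist = NormalDist(mu, standard_error)

# Probability that a sample mean is below 47 (z = (47 - 50) / 2 = -1.5).
p_below_47 = sampling_dist.cdf(47)

print(standard_error, p_below_47)
```

Because n appears under the square root in the denominator, larger samples give a smaller standard error, so sample means cluster more tightly around the population mean.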
Normal Approximation to the Binomial
Can be used if $np \ge 5$ and $nq \ge 5$ (some textbooks require both to be at least 10)
Continuity correction: Add or subtract 0.5 to the discrete x value when approximating with the normal distribution.
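The effect of the continuity correction can be seen by comparing an exact binomial probability with its normal approximation. This is a sketch under stated assumptions: n = 20 and p = 0.5 are hypothetical, and the threshold of 5 in the condition check is one common convention (some textbooks use 10).

```python
import math
from statistics import NormalDist

n, p = 20, 0.5  # hypothetical binomial experiment
q = 1 - p

# Assumed rule of thumb for using the approximation.
conditions_met = n * p >= 5 and n * q >= 5

# Exact binomial probability P(x <= 8).
exact = sum(math.comb(n, k) * p**k * q ** (n - k) for k in range(9))

# Normal approximation with mu = np and sigma = sqrt(npq).
# Continuity correction: P(x <= 8) becomes P(X < 8.5).
approx_dist = NormalDist(mu=n * p, sigma=math.sqrt(n * p * q))
approx = approx_dist.cdf(8.5)

print(conditions_met, exact, approx)
```

Adding 0.5 includes the whole bar for x = 8 in the area under the normal curve; for "at least 8" the correction would instead subtract 0.5 and use the area to the right of 7.5.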
Summary Table: Statistics vs. Parameters
| Sample (Statistic) | Population (Parameter) |
|---|---|
| Mean ($\bar{x}$) | Mean ($\mu$) |
| Variance ($s^2$) | Variance ($\sigma^2$) |
| Standard Deviation ($s$) | Standard Deviation ($\sigma$) |
| Proportion ($\hat{p}$) | Proportion ($p$) |
Chapter 6: Confidence Intervals
Confidence Interval for the Mean (σ Known)
Formula: $\bar{x} \pm z_c \frac{\sigma}{\sqrt{n}}$
Assumptions:
Population is normal or $n \ge 30$
σ is known
Sample is random
Confidence level (c): Area under the standard normal curve between $-z_c$ and $z_c$ (e.g., for 99%, $z_c = 2.576$; for 95%, $z_c = 1.96$; for 90%, $z_c = 1.645$)
Margin of error: $E = z_c \frac{\sigma}{\sqrt{n}}$
Interpretation: "We are c% confident that the interval contains the parameter μ."
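A minimal sketch of a z-interval for the mean follows, using the margin of error $E = z_c \sigma / \sqrt{n}$. The function name `z_interval` and the sample values are hypothetical; `NormalDist().inv_cdf` is used in place of a table to find the critical value.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence):
    """Return (lower, upper, margin) for a confidence interval with sigma known."""
    # Critical value: leaves (1 - confidence) / 2 in each tail.
    z_c = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_c * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin, margin

# Hypothetical sample: mean 50, known sigma 10, n = 100, 95% confidence.
low, high, margin = z_interval(x_bar=50, sigma=10, n=100, confidence=0.95)

print(low, high, margin)
```

Here the margin is about 1.96, so the interval runs from roughly 48.04 to 51.96.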
Sample Size for a Given Margin of Error
Formula: $n = \left( \frac{z_c \sigma}{E} \right)^2$
If n is not a whole number, round up to the next integer.
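The sample size calculation, $n = (z_c \sigma / E)^2$ rounded up, can be sketched like this. The helper name `required_sample_size` and the inputs are hypothetical.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, margin, confidence):
    """Smallest n for which the margin of error is at most `margin`."""
    z_c = NormalDist().inv_cdf((1 + confidence) / 2)
    # Round up: rounding down would give a margin larger than requested.
    return math.ceil((z_c * sigma / margin) ** 2)

# Hypothetical: sigma = 10, desired margin of error 2, 95% confidence.
n = required_sample_size(sigma=10, margin=2, confidence=0.95)

print(n)
```

The unrounded value here is about 96.04, so the required sample size is 97.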
Confidence Interval for the Mean (σ Unknown)
Use Student’s t-distribution:
Formula: $\bar{x} \pm t_c \frac{s}{\sqrt{n}}$
Degrees of freedom: $\text{d.f.} = n - 1$
t-distribution is bell-shaped, symmetrical, but has thicker tails than the normal distribution.
For df not in the table, use the closest smaller df.
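A sketch of a t-interval, $\bar{x} \pm t_c \, s / \sqrt{n}$, follows. Python's standard library has no t-distribution, so the critical value is entered by hand from a t-table, as the notes suggest. The sample data are hypothetical; 2.262 is the usual table entry for 9 degrees of freedom at 95% confidence.

```python
import math
import statistics

sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]  # hypothetical data
n = len(sample)

x_bar = statistics.mean(sample)
s = statistics.stdev(sample)   # sample standard deviation (divides by n - 1)
df = n - 1                     # degrees of freedom

t_c = 2.262  # from a t-table: df = 9, 95% confidence (two-tailed)

margin = t_c * s / math.sqrt(n)
interval = (x_bar - margin, x_bar + margin)

print(df, x_bar, s, margin, interval)
```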
Comparing z and t Distributions
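t-distribution is used when σ is unknown (and the population is normal or n ≥ 30); its extra spread matters most when n is small.
z-distribution is used when σ is known (and the population is normal or n ≥ 30); for large n the t critical values approach the z values.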
For a given confidence level and the same standard error, the t-interval is wider than the z-interval because $t_c > z_c$.
Chapter 7: Hypothesis Testing with One Sample
Formulating Hypotheses
Null Hypothesis (H0): Statement about the population parameter (e.g., μ = μ0), represents the status quo.
Alternative Hypothesis (H1): What we want to test (e.g., μ ≠ μ0, μ > μ0, μ < μ0).
H0 and H1 are complements.
Types of Tests
Right-tailed test: H1: μ > μ0
Left-tailed test: H1: μ < μ0
Two-tailed test: H1: μ ≠ μ0
Test Statistics
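If σ is known and population is normal or n ≥ 30: $z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
If σ is unknown and population is normal or n ≥ 30: $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$, with $n - 1$ degrees of freedom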
P-Value Method
P-value: Probability of obtaining a test statistic at least as extreme as the one observed, assuming H0 is true.
For z-tests:
Left-tailed: P(z < calculated z)
Right-tailed: P(z > calculated z)
Two-tailed: 2 × P(z > |calculated z|)
For t-tests: Use t-tables and degrees of freedom to find the interval containing the P-value.
Errors in Hypothesis Testing
Type I Error (α): Rejecting H0 when it is true.
Type II Error (β): Failing to reject H0 when it is false.
Level of significance (α): The maximum allowable probability of making a Type I error.
Never say "accept H0"; instead, "fail to reject H0".
Steps in Hypothesis Testing (P-value Method)
State H0, H1, and α.
Identify the type of test (right, left, two-tailed).
Choose the appropriate test statistic (z or t) and calculate it.
Find the P-value using the appropriate table.
Compare P-value to α:
If P-value < α, reject H0.
If P-value ≥ α, do not reject H0.
Draw a conclusion in context.
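The steps above can be sketched for a one-sample z-test (σ known). This is an illustration only: the function name `z_test`, the tail labels, and the sample values are hypothetical, and `NormalDist` replaces the z-table lookup.

```python
import math
from statistics import NormalDist

def z_test(x_bar, mu_0, sigma, n, alpha, tail):
    """One-sample z-test by the P-value method; tail is 'left', 'right', or 'two'."""
    # Test statistic: how many standard errors x_bar lies from mu_0.
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))

    std_normal = NormalDist()
    if tail == "left":
        p_value = std_normal.cdf(z)
    elif tail == "right":
        p_value = 1 - std_normal.cdf(z)
    elif tail == "two":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")

    # Decision rule: reject H0 only when the P-value is below alpha.
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    return z, p_value, decision

# Hypothetical example: H0: mu = 50, H1: mu != 50, alpha = 0.05.
z, p_value, decision = z_test(x_bar=52, mu_0=50, sigma=6, n=36,
                              alpha=0.05, tail="two")

print(z, p_value, decision)
```

In this example z = 2.0 and the P-value is about 0.0455, which is below α = 0.05, so H0 is rejected. The conclusion in context would be that there is enough evidence at the 5% level that the population mean differs from 50.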