Elementary Statistics Exam 1 Study Guide: Chapters 1–4 (Triola)
Chapter 1 – Introduction to Statistics
Big Ideas and Key Concepts
Statistics is the science of collecting, organizing, summarizing, and interpreting data. Understanding the distinction between populations and samples, as well as parameters and statistics, is fundamental. Study design, including awareness of bias and confounding, is crucial for valid conclusions.
Population: The entire group being studied.
Sample: A subset of the population used to draw conclusions.
Parameter: A numerical measurement describing a population (often denoted by Greek letters such as μ, σ, p).
Statistic: A numerical measurement describing a sample (often denoted by x̄, s, p̂).
Variables: Can be quantitative (numerical) or qualitative (categorical).
Discrete data: Countable values (e.g., number of students).
Continuous data: Any value within an interval (e.g., time, weight).
Binary data: Exactly two outcomes (e.g., yes/no).
Levels of Measurement
Nominal: Names/labels; no order. Example: Eye color.
Ordinal: Categories with order, but differences aren’t measurable. Example: Class rank.
Interval: Numerical; equal spacing; no true zero. Example: Temperature in °C.
Ratio: Numerical; equal spacing; true zero; ratios make sense. Example: Height, weight.
Ethics, Bias, and Stakeholder Influence
Stakeholders: May influence study design or interpretation.
Bias: Systematic error that makes results inaccurate (e.g., bad sampling, leading questions).
Sampling bias: Sample is not representative due to undercoverage or nonresponse.
Confounding variable: Related to both explanatory and response variables, distorting relationships.
Statistical vs Practical Significance
Statistical significance: Indicates that a result is unlikely to have occurred by chance alone (commonly judged at α = 0.05).
Practical significance: Considers whether the effect size is meaningful in real life.
Example: Large samples can yield statistically significant but practically unimportant results.
Chapter 2 – Exploring Data: Tables and Graphs
Frequency Distributions
Frequency distributions organize data into classes, showing how many values fall into each class. Key terms include class limits, class width, class midpoint, and class boundaries.
Frequency: Number of values in a class.
Class limits: Smallest and largest values allowed in a class.
Class width: Difference between consecutive lower class limits.
Class midpoint: $\dfrac{\text{lower class limit} + \text{upper class limit}}{2}$
Class boundaries: Numbers separating classes with no gaps (e.g., 0.5 below/above class limits).
Relative frequency: $\dfrac{\text{class frequency}}{\text{sum of all frequencies}}$
Cumulative frequency: Running total of frequencies up to a class.
Step-by-Step: Constructing a Grouped Frequency Distribution
Find minimum and maximum values.
Decide number of classes (usually 5–10).
Compute class width: $\text{class width} \approx \dfrac{\text{max} - \text{min}}{\text{number of classes}}$, round up.
Choose first lower class limit.
Create classes by adding class width each time.
Count frequencies for each class.
Optional: Add relative and cumulative frequencies.
Example: Grouped Frequency Distribution
Data (minutes): 4, 10, 12, 15, 15, 20, 25, 30, 35, 45, 60, 80. Class width = 15, starting at 0.
| Class | Frequency |
|---|---|
| 0–14 | 3 |
| 15–29 | 4 |
| 30–44 | 2 |
| 45–59 | 1 |
| 60–74 | 1 |
| 75–89 | 1 |
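The tally above can be reproduced with a short Python sketch. The function name `grouped_frequencies` and its parameters are illustrative choices, not something from the course materials, and the sketch assumes integer class limits as in the example.

```python
def grouped_frequencies(data, start, width, num_classes):
    """Count how many values fall in each class [lower, lower + width - 1]."""
    table = []
    for i in range(num_classes):
        lower = start + i * width
        upper = lower + width - 1  # integer class limits, as in the example
        freq = sum(1 for x in data if lower <= x <= upper)
        table.append((lower, upper, freq))
    return table


minutes = [4, 10, 12, 15, 15, 20, 25, 30, 35, 45, 60, 80]
table = grouped_frequencies(minutes, start=0, width=15, num_classes=6)

n = len(minutes)
cumulative = 0
for lower, upper, freq in table:
    cumulative += freq  # running total gives the cumulative frequency
    print(f"{lower}-{upper}: f={freq}, rel={freq / n:.3f}, cum={cumulative}")
```

The last cumulative frequency should equal the number of data values, which is a quick check that no value was missed.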
Graphs and Their Uses
Histogram: Quantitative data; bars touch; shows shape and outliers.
Frequency polygon: Line graph connecting class midpoints.
Ogive: Graph of cumulative frequency.
Dotplot: Shows each value; good for small datasets.
Stem-and-leaf plot: Displays distribution and exact values.
Pareto chart: Categorical data; bars ordered high to low.
Pie chart: Shows proportions; best with few categories.
Time series graph: Data over time; shows trends.
Scatterplot: Two quantitative variables; shows association.
Scatterplots and Correlation
Positive association: As x increases, y increases.
Negative association: As x increases, y decreases.
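Linear correlation coefficient (r): Measures strength and direction of a linear relationship; $-1 \le r \le 1$.
Statistical significance: Use critical value table or p-value; if $|r|$ is large enough (at least the table's critical value), the correlation is significant.
Line of best fit: $\hat{y} = b_0 + b_1 x$, where $b_0$ is the y-intercept and $b_1$ is the slope.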
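As a rough illustration, r and the least-squares line $\hat{y} = b_0 + b_1 x$ can be computed from deviations about the means. The data below are made up for this sketch, and the helper name `correlation_and_line` is hypothetical.

```python
import math


def correlation_and_line(xs, ys):
    """Return (r, b0, b1) for paired data using sums of deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)  # always between -1 and 1
    b1 = sxy / sxx                  # slope
    b0 = y_bar - b1 * x_bar         # y-intercept
    return r, b0, b1


# Hypothetical paired data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r, b0, b1 = correlation_and_line(xs, ys)
print(f"r = {r:.3f}, y-hat = {b0:.1f} + {b1:.1f}x")
```

Whether this r counts as significant still depends on the sample size, so it has to be compared against a critical value or p-value.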
Describing Distribution Shape
Normal (bell-shaped): Symmetric, mound-shaped.
Uniform: Frequencies equal across classes.
Skewed right: Long tail to the right (typically mean > median).
Skewed left: Long tail to the left (typically mean < median).
Outliers: Values far from the rest; check with z-scores or boxplots.
Misleading Graphs
Non-zero axis exaggerates differences.
Changing scale or intervals can hide/exaggerate trends.
3-D effects distort perception.
Cherry-picking time windows changes the story.
Chapter 3 – Descriptive Statistics
Measures of Center
Measures of center summarize the typical value in a dataset. The mean is sensitive to outliers, while the median is resistant.
Mean (sample): $\bar{x} = \dfrac{\sum x}{n}$
Mean (population): $\mu = \dfrac{\sum x}{N}$
Median: Middle value when data are sorted.
Mode: Most frequent value; can be unimodal, bimodal, or none.
Midrange: $\dfrac{\text{min} + \text{max}}{2}$
Example: Mean, Median, Mode, Midrange
Data: 5, 7, 7, 10, 12
Mean: $\bar{x} = \dfrac{5 + 7 + 7 + 10 + 12}{5} = \dfrac{41}{5} = 8.2$
Median: 7 (middle value)
Mode: 7
Midrange: $\dfrac{5 + 12}{2} = 8.5$
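The same four measures can be checked with Python's standard `statistics` module. This is a minimal sketch; the midrange is computed by hand because the module has no function for it.

```python
import statistics

data = [5, 7, 7, 10, 12]

mean = statistics.mean(data)
median = statistics.median(data)
modes = statistics.multimode(data)      # returns a list, so bimodal data also works
midrange = (min(data) + max(data)) / 2  # no built-in; computed directly

print(mean, median, modes, midrange)
```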
Measures of Variation
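Range: $\text{max} - \text{min}$
Sample variance: $s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$
Sample standard deviation: $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
Population variance: $\sigma^2 = \dfrac{\sum (x - \mu)^2}{N}$
Population standard deviation: $\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$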
Example: Sample Standard Deviation
Data: 2, 4, 4, 4, 5
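Mean: $\bar{x} = \dfrac{2 + 4 + 4 + 4 + 5}{5} = \dfrac{19}{5} = 3.8$
Deviations: −1.8, 0.2, 0.2, 0.2, 1.2
Squares: 3.24, 0.04, 0.04, 0.04, 1.44; sum = 4.80
Sample variance: $s^2 = \dfrac{4.80}{5 - 1} = 1.20$
Sample standard deviation: $s = \sqrt{1.20} \approx 1.10$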
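The routine (mean, deviations, squares, sum, divide by n − 1, square root) can be traced in Python and compared against `statistics.stdev`, which also uses the n − 1 divisor. This is a sketch for checking hand work, not a required method.

```python
import math
import statistics

data = [2, 4, 4, 4, 5]
n = len(data)

mean = sum(data) / n
deviations = [x - mean for x in data]
squares = [d ** 2 for d in deviations]
variance = sum(squares) / (n - 1)  # sample variance divides by n - 1
std_dev = math.sqrt(variance)

# The step-by-step result should agree with the library's sample std dev
agrees = math.isclose(std_dev, statistics.stdev(data))
print(round(mean, 2), round(sum(squares), 2), round(variance, 2), round(std_dev, 3), agrees)
```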
Coefficient of Variation (CV)
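CV: $CV = \dfrac{s}{\bar{x}} \cdot 100\%$ (sample) or $CV = \dfrac{\sigma}{\mu} \cdot 100\%$ (population)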
Used to compare variation between datasets with different units or means.
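As a quick sketch, the sample CV can be computed for the two small datasets used in this chapter's examples. The helper name `coefficient_of_variation` is an illustrative choice.

```python
import statistics


def coefficient_of_variation(sample):
    """Sample CV as a percentage: (s / x-bar) * 100."""
    return statistics.stdev(sample) / statistics.mean(sample) * 100


cv_a = coefficient_of_variation([5, 7, 7, 10, 12])
cv_b = coefficient_of_variation([2, 4, 4, 4, 5])
print(f"CV A = {cv_a:.1f}%, CV B = {cv_b:.1f}%")
```

Here the first dataset has the larger CV, so it varies more relative to its own mean.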
z-Scores and Unusual Values
z-score: $z = \dfrac{x - \bar{x}}{s}$ or $z = \dfrac{x - \mu}{\sigma}$
Indicates how many standard deviations x is from the mean.
Rule of thumb: $|z| > 2$ is unusual; $|z| > 3$ is very unusual.
Empirical Rule: For normal distributions: ~68% within 1σ, ~95% within 2σ, ~99.7% within 3σ.
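A small sketch of the z-score calculation and the rule of thumb, assuming the cutoffs |z| > 2 (unusual) and |z| > 3 (very unusual). The mean of 100 and standard deviation of 15 are hypothetical values chosen only for illustration.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations x lies from the mean."""
    return (x - mean) / std_dev


def classify(z):
    """Apply the rule of thumb: |z| > 2 unusual, |z| > 3 very unusual."""
    if abs(z) > 3:
        return "very unusual"
    if abs(z) > 2:
        return "unusual"
    return "usual"


# Hypothetical mean and standard deviation, for illustration only
for x in (110, 135, 150):
    z = z_score(x, mean=100, std_dev=15)
    print(x, round(z, 2), classify(z))
```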
Percentiles and Quartiles
Percentile: Separates lowest k% of data from the rest.
Quartiles: Q1 = 25th, Q2 = 50th (median), Q3 = 75th percentile.
Step-by-Step: Percentile Rank
Sort data.
Count values less than x (L).
Count values equal to x (E).
Percentile rank: $\dfrac{L + 0.5E}{n} \cdot 100$
Example: Percentile Rank
Data: 10, 12, 12, 15, 18, 20, 20, 22, 30, 35. x = 20.
L = 5, E = 2, n = 10
Percentile rank: $\dfrac{5 + 0.5(2)}{10} \cdot 100 = 60$, so x = 20 is at about the 60th percentile.
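The counting steps translate directly into code. This sketch assumes the mid-rank convention (L + 0.5E)/n × 100, which matches the L and E counts used in the steps; some textbooks count only the values below x.

```python
def percentile_rank(data, x):
    """Percentile rank of x using (L + 0.5 * E) / n * 100."""
    less = sum(1 for v in data if v < x)    # L: values below x
    equal = sum(1 for v in data if v == x)  # E: values equal to x
    return 100 * (less + 0.5 * equal) / len(data)


data = [10, 12, 12, 15, 18, 20, 20, 22, 30, 35]
rank = percentile_rank(data, 20)
print(rank)
```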
Step-by-Step: k-th Percentile Value
Sort data.
Compute locator: $L = \dfrac{k}{100} \cdot n$
If L is whole, average L-th and (L+1)-th values; else, round up to next integer and use that position.
Example: 75th Percentile
Data: 10, 12, 12, 15, 18, 20, 20, 22, 30, 35. 75th percentile.
$L = \dfrac{75}{100} \cdot 10 = 7.5$ → round up to 8th position.
8th value is 22; P75 = 22.
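The locator procedure can be sketched as follows. The function name `percentile_value` is hypothetical, and the sketch assumes an integer k strictly between 0 and 100 so that the whole-number check is exact.

```python
def percentile_value(data, k):
    """k-th percentile via the locator L = (k / 100) * n, for integer 0 < k < 100."""
    if not 0 < k < 100:
        raise ValueError("k must be between 0 and 100 (exclusive)")
    values = sorted(data)
    n = len(values)
    whole, remainder = divmod(k * n, 100)  # locator = whole + remainder / 100
    if remainder == 0:
        # Locator is a whole number: average the L-th and (L+1)-th values
        return (values[whole - 1] + values[whole]) / 2
    # Otherwise round the locator up; position whole + 1 is zero-based index whole
    return values[whole]


data = [10, 12, 12, 15, 18, 20, 20, 22, 30, 35]
p75 = percentile_value(data, 75)
p50 = percentile_value(data, 50)
print(p75, p50)
```

Software packages often use other interpolation rules, so their percentiles can differ slightly from this hand method.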
Five-Number Summary and Boxplots
Five-number summary: min, Q1, median (Q2), Q3, max.
IQR: $IQR = Q_3 - Q_1$
Outlier fences: Lower fence = $Q_1 - 1.5 \cdot IQR$; Upper fence = $Q_3 + 1.5 \cdot IQR$
Boxplot: Box from Q1 to Q3, line at median, whiskers to most extreme non-outlier values.
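A sketch of the five-number summary and outlier fences, assuming quartiles are found with the same locator method used for percentiles in this guide. The helper names are illustrative, and other quartile conventions can give slightly different values.

```python
def percentile_value(values, k):
    """k-th percentile of already-sorted values via the locator method."""
    n = len(values)
    whole, remainder = divmod(k * n, 100)
    if remainder == 0:
        return (values[whole - 1] + values[whole]) / 2
    return values[whole]


def five_number_summary(data):
    values = sorted(data)
    q1, q2, q3 = (percentile_value(values, k) for k in (25, 50, 75))
    return values[0], q1, q2, q3, values[-1]


data = [10, 12, 12, 15, 18, 20, 20, 22, 30, 35]
minimum, q1, q2, q3, maximum = five_number_summary(data)

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print((minimum, q1, q2, q3, maximum), iqr, (lower_fence, upper_fence), outliers)
```

For this dataset no value falls outside the fences, so the boxplot whiskers would extend to the minimum and maximum.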
Chapter 4 – Probability
Probability Basics
Probability quantifies the likelihood of an event. The sample space is the set of all possible outcomes.
Probability of event A: $P(A) = \dfrac{\text{number of outcomes in } A}{\text{total number of outcomes}}$ (for equally likely outcomes)
Sample space: All possible outcomes.
Complement Rule
Complement: $P(A^c) = 1 - P(A)$
"At least one" often uses complement: $P(\text{at least one}) = 1 - P(\text{none})$
Addition Rule
General addition rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
Mutually exclusive: If A and B cannot happen together: $P(A \cap B) = 0$
$P(A \cup B) = P(A) + P(B)$ for mutually exclusive events.
Conditional Probability and Independence
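Conditional probability: $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$ (if $P(B) > 0$)
Multiplication rule: $P(A \cap B) = P(A) \cdot P(B \mid A)$
Independent events: $P(A \mid B) = P(A)$
Independent multiplication: $P(A \cap B) = P(A) \cdot P(B)$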
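The addition rule, conditional probability, and the independence check can be verified by listing outcomes. This sketch uses one roll of a fair six-sided die with A = "even" and B = "greater than 3", events picked purely for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die
A = {2, 4, 6}                      # even
B = {4, 5, 6}                      # greater than 3


def prob(event):
    """P(event) for equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))


p_union = prob(A) + prob(B) - prob(A & B)       # general addition rule
p_a_given_b = prob(A & B) / prob(B)             # conditional probability
independent = prob(A & B) == prob(A) * prob(B)  # independence check

print(p_union, p_a_given_b, independent)
```

Because P(A | B) differs from P(A) here, the two events are not independent, and multiplying P(A) by P(B) would give the wrong intersection probability.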
Step-by-Step: "At Least One" with Independent Trials
Identify single-trial success probability p.
Compute failure probability q = 1 − p.
For n trials: $P(\text{at least one success}) = 1 - q^n$
Example: At Least One Success
Free throw success probability p = 0.30, n = 5 shots.
q = 0.70
$P(\text{at least one success}) = 1 - 0.70^5 = 1 - 0.16807 \approx 0.832$
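The calculation for the free-throw example can be sketched in a few lines. The function name `at_least_one` is illustrative, and the sketch assumes the shots are independent with the same success probability each time.

```python
def at_least_one(p, n):
    """P(at least one success in n independent trials) = 1 - q**n."""
    q = 1 - p  # single-trial failure probability
    return 1 - q ** n


result = at_least_one(0.30, 5)
print(round(result, 3))
```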
Counting Methods
Multiplication rule: If one action can occur in m ways and another in n ways, total ways = m·n
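Factorial: $n! = n(n-1)(n-2)\cdots 2 \cdot 1$, $0! = 1$
Permutations: $P(n,r) = \dfrac{n!}{(n-r)!}$ (order matters)
Combinations: $C(n,r) = \dfrac{n!}{r!\,(n-r)!}$ (order does not matter)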
Check that $0 \le r \le n$.
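Python's `math` module (3.8 or later) provides these counts directly. The values n = 5 and r = 2 are arbitrary and chosen only to keep the numbers small.

```python
import math

n, r = 5, 2

n_factorial = math.factorial(n)  # 5! = 120
permutations = math.perm(n, r)   # ordered selections: 5!/3! = 20
combinations = math.comb(n, r)   # unordered selections: 5!/(2!3!) = 10

# Each combination can be arranged in r! orders, linking the two counts
relation_holds = combinations == permutations // math.factorial(r)

print(n_factorial, permutations, combinations, relation_holds)
```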
Diagnostic/Contingency Tables
Read row, column, and grand totals.
Convert counts to probabilities by dividing by grand total.
Conditional probabilities: restrict to the 'given' group.
Example:
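One way to practice reading such a table is with a small script. All counts below are invented for this sketch (they do not come from the course materials), and the row and column labels are hypothetical.

```python
# Hypothetical diagnostic-test counts, keyed by (row, column); illustration only
counts = {
    ("disease", "positive"): 45,
    ("disease", "negative"): 5,
    ("no disease", "positive"): 10,
    ("no disease", "negative"): 140,
}

grand_total = sum(counts.values())


def row_total(row):
    return sum(v for (r, _), v in counts.items() if r == row)


def col_total(col):
    return sum(v for (_, c), v in counts.items() if c == col)


# Unconditional probability: divide by the grand total
p_positive = col_total("positive") / grand_total

# Conditional probabilities: restrict the denominator to the 'given' group
p_disease_given_positive = counts[("disease", "positive")] / col_total("positive")
p_positive_given_disease = counts[("disease", "positive")] / row_total("disease")

print(grand_total, p_positive,
      round(p_disease_given_positive, 3), p_positive_given_disease)
```

The two conditional probabilities share a numerator but use different denominators, which is why P(disease | positive) and P(positive | disease) are generally not equal.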
Quick Problem Routines
Mean: Add values ($\sum x$), divide by n.
Median: Sort, pick middle (or average two middles).
Std dev (sample): Mean → deviations → square → sum → divide by (n−1) → sqrt.
CV: Compute s and x̄, then $CV = \dfrac{s}{\bar{x}} \cdot 100\%$.
z-score: $z = \dfrac{x - \bar{x}}{s}$.
Grouped frequency table: Choose class width, build classes, tally frequencies, (optional) RF/CF.
At least one (independent): $1 - q^n$.
Union probability: Use addition rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
Conditional probability: $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$.
Perm vs comb: Order matters? yes→$P(n,r)$; no→$C(n,r)$.
Symbol Cheat Sheet
| Symbol | Meaning |
|---|---|
| n | Sample size |
| N | Population size |
| x | A single data value |
| Σx | Sum of all data values |
| x̄ | Sample mean |
| μ | Population mean |
| s | Sample standard deviation |
| s² | Sample variance |
| σ | Population standard deviation |
| σ² | Population variance |
| min, max | Smallest and largest data values |
| R | Range = max − min |
| MR | Midrange = (min + max)/2 |
| z | z-score = (x − mean)/std dev |
| Q1, Q2, Q3 | Quartiles (25th, 50th, 75th percentiles) |
| IQR | Interquartile range = Q3 − Q1 |
| p | Probability of success |
| p̂ | Sample proportion |
| P(A) | Probability of event A |
| Aᶜ | Complement of A (not A) |
| A ∪ B | Union: A or B (or both) |
| A ∩ B | Intersection: A and B |
| P(A\|B) | Conditional probability of A given B |
| ! | Factorial |
| P(n,r) | Permutations |
| C(n,r) | Combinations |
Common Wording and Their Statistical Meaning
"at least one": $1 - P(\text{none})$
"exactly one": Count specific cases (often in binomial)
"A or B": Union; use addition rule
"A and B": Intersection; use multiplication rule
"given": Conditional probability; restrict to the given group
"random": Every member has equal chance of selection
"unbiased": No systematic error; method doesn’t favor outcomes