Statistics Exam 2 Study Guide: Variability, Relative Standing, Probability, and Probability Distributions
Measures of Variability
Range Rule of Thumb
The Range Rule of Thumb is a simple method for identifying whether a data value is unusual. It uses the range (difference between the maximum and minimum values) to estimate the spread of data.
Rule: Most values lie within two standard deviations of the mean.
Unusual values: Values more than two standard deviations from the mean are considered unusual.
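Formula: $s \approx \frac{\text{range}}{4}$ (rough estimate of the standard deviation); usual values lie between $\mu - 2\sigma$ and $\mu + 2\sigma$.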
Example: If the mean test score is 70 and the standard deviation is 5, then scores below 60 or above 80 are considered unusual.
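The cutoffs in this example can be checked with a few lines of Python. This is a minimal sketch; the function names `usual_limits` and `is_unusual` are made up for illustration, and the scores 58 and 75 are hypothetical test values.

```python
def usual_limits(mean, std_dev):
    """Return the (low, high) cutoffs two standard deviations from the mean."""
    return mean - 2 * std_dev, mean + 2 * std_dev


def is_unusual(value, mean, std_dev):
    """True when the value lies more than two standard deviations from the mean."""
    low, high = usual_limits(mean, std_dev)
    return value < low or value > high


# Study guide example: mean 70, standard deviation 5
low, high = usual_limits(70, 5)
print(low, high)               # 60 80
print(is_unusual(58, 70, 5))   # True  (hypothetical score below 60)
print(is_unusual(75, 70, 5))   # False (hypothetical score inside 60..80)
```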
Coefficient of Variation (CV)
The Coefficient of Variation is a standardized measure of dispersion of a probability distribution or frequency distribution. It is useful for comparing the degree of variation from one data series to another, even if the means are drastically different.
Definition: The ratio of the standard deviation to the mean, expressed as a percentage.
Formula: $CV = \frac{s}{\bar{x}} \cdot 100\%$ (sample), $CV = \frac{\sigma}{\mu} \cdot 100\%$ (population)
Application: Compare variability between two or more samples with different units or means.
Example: If Sample A has a mean of 50 and standard deviation of 5 (CV = 10%), and Sample B has a mean of 100 and standard deviation of 20 (CV = 20%), Sample B is more variable relative to its mean.
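As a quick check of the comparison above, the CV can be computed directly. This is a small sketch; `coefficient_of_variation` is an illustrative name, not a library function.

```python
def coefficient_of_variation(mean, std_dev):
    """Standard deviation as a percentage of the mean."""
    return 100 * std_dev / mean


cv_a = coefficient_of_variation(50, 5)     # Sample A
cv_b = coefficient_of_variation(100, 20)   # Sample B
print(cv_a, cv_b)      # 10.0 20.0
print(cv_b > cv_a)     # True: Sample B varies more relative to its mean
```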
Measures of Relative Standing
Z-Score
The z-score indicates how many standard deviations a data value is from the mean. It is used to compare values from different distributions.
Formula: $z = \frac{x - \mu}{\sigma}$ (for population), $z = \frac{x - \bar{x}}{s}$ (for sample)
Interpretation: A z-score above 2 or below -2 is often considered unusual.
Application: Compare relative positions of data points from different populations.
Example: A test score of 85 with a mean of 75 and standard deviation of 5 has a z-score of $z = \frac{85 - 75}{5} = 2$, which sits exactly at the cutoff for an unusually high value.
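The z-score calculation is easy to reproduce in code. The sketch below uses the test-score numbers from the example; the second exam (score 90, mean 80, standard deviation 10) is a hypothetical addition to show how z-scores compare values from different distributions.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations the value lies from the mean."""
    return (value - mean) / std_dev


z_first = z_score(85, 75, 5)     # study guide example
z_second = z_score(90, 80, 10)   # hypothetical second exam
print(z_first)    # 2.0
print(z_second)   # 1.0
# The 85 is relatively higher within its distribution, even though 90 is the larger raw score.
print(z_first > z_second)   # True
```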
Percentiles and Five-Number Summary
Percentile: The value below which a given percentage of observations fall.
Finding Percentiles: To find the percentile of a value, count the number of values below it, divide by the total number, and multiply by 100.
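Finding a Value for a Given Percentile (Pk): Arrange data in order, compute $L = \frac{k}{100} \cdot n$, where L is the location in the ordered list, k is the percentile, and n is the number of values. If L is a whole number, average the Lth value and the next one; otherwise round L up and take the value at that position.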
Five-Number Summary: Minimum, Q1 (25th percentile), Median (Q2, 50th percentile), Q3 (75th percentile), Maximum.
Example: For the data set [2, 4, 7, 10, 12], the five-number summary is: Min = 2, Q1 = 4, Median = 7, Q3 = 10, Max = 12.
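The percentile steps above can be sketched in Python. This assumes the locator $L = \frac{k}{100} \cdot n$ with the convention "average the Lth and next value when L is whole, otherwise round up"; calculators and statistical packages often use other quartile conventions and may give slightly different answers. All function names here are illustrative.

```python
import math


def percentile_of_value(data, value):
    """Percentile of a value: share of data values strictly below it, times 100."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)


def value_at_percentile(data, k):
    """Value at the k-th percentile using the locator L = (k / 100) * n."""
    if not 0 < k < 100:
        raise ValueError("k must be strictly between 0 and 100")
    ordered = sorted(data)
    n = len(ordered)
    location = k * n / 100
    if location.is_integer():
        # Whole-number location: average the L-th value and the next one.
        i = int(location)
        return (ordered[i - 1] + ordered[i]) / 2
    # Otherwise round up and take the value at that (1-based) position.
    return ordered[math.ceil(location) - 1]


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max)."""
    return (
        min(data),
        value_at_percentile(data, 25),
        value_at_percentile(data, 50),
        value_at_percentile(data, 75),
        max(data),
    )


scores = [2, 4, 7, 10, 12]
print(five_number_summary(scores))       # (2, 4, 7, 10, 12)
print(percentile_of_value(scores, 10))   # 60.0
```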
Interquartile Range (IQR) and Outliers
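Interquartile Range (IQR): $IQR = Q_3 - Q_1$
Outlier Detection: A value is an outlier if it is below $Q_1 - 1.5 \cdot IQR$ or above $Q_3 + 1.5 \cdot IQR$.
Example: If Q1 = 20, Q3 = 40, then IQR = 20. Outliers are below $20 - 1.5(20) = -10$ or above $40 + 1.5(20) = 70$.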
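The outlier fences from this example can be computed as follows. This is a minimal sketch; `outlier_fences` and `find_outliers` are illustrative names, and the data list is hypothetical.

```python
def outlier_fences(q1, q3):
    """Return (lower, upper) fences located 1.5 * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(data, q1, q3):
    """List the values that fall outside the fences."""
    lower, upper = outlier_fences(q1, q3)
    return [x for x in data if x < lower or x > upper]


print(outlier_fences(20, 40))   # (-10.0, 70.0)

# Hypothetical data checked against Q1 = 20 and Q3 = 40
print(find_outliers([-15, 5, 25, 38, 72], 20, 40))   # [-15, 72]
```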
Boxplots and Modified Boxplots
Boxplot: A graphical summary of the five-number summary.
Modified Boxplot: Shows outliers as individual points.
Side-by-Side Boxplots: Used to compare distributions between two samples.
Example: Comparing test scores of two classes using side-by-side boxplots can reveal differences in medians, spreads, and outliers.
Basics of Probability
Probability Concepts
Rare vs. Common Events: An event with a very low probability (typically less than 5%) is considered rare.
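Relative Frequency Probability: $P(A) = \frac{\text{number of times } A \text{ occurred}}{\text{number of times the procedure was repeated}}$
Classical Probability: $P(A) = \frac{\text{number of ways } A \text{ can occur}}{\text{number of equally likely outcomes}}$
Nearly Certain/Impossible Events: A probability of 1 means the event is certain and a probability of 0 means it is impossible; values close to 1 or close to 0 indicate nearly certain or nearly impossible events. Every probability satisfies $0 \le P(A) \le 1$.
Complement: The probability that event A does not occur: $P(\bar{A}) = 1 - P(A)$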
Example: If the probability of rain is 0.2, the probability of no rain is 0.8.
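The classical and complement formulas can be tried with exact fractions. This is a small sketch using Python's standard `fractions` module; the ace-from-a-deck case is an added illustration, and the function names are hypothetical.

```python
from fractions import Fraction


def classical_probability(favorable, total):
    """Ways the event can occur divided by the number of equally likely outcomes."""
    return Fraction(favorable, total)


def complement(p):
    """Probability that the event does not occur."""
    return 1 - p


# Illustration: drawing an ace from a standard 52-card deck
p_ace = classical_probability(4, 52)
print(p_ace)               # 1/13
print(complement(p_ace))   # 12/13

# Rain example from the text, written as an exact fraction (0.2 = 1/5)
print(complement(Fraction(1, 5)))   # 4/5, i.e. 0.8
```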
Compound Events: "Or" and the Addition Property
Addition Rule for Probability
Event "A or B": The event that either A occurs, B occurs, or both occur.
Disjoint (Mutually Exclusive) Events: Events that cannot occur together.
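Addition Rule: $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$
For Disjoint Events: $P(A \text{ or } B) = P(A) + P(B)$
Example: Probability of drawing a heart or a king from a deck: $P(\text{heart}) = \frac{13}{52}$, $P(\text{king}) = \frac{4}{52}$, $P(\text{heart and king}) = \frac{1}{52}$; so $P(\text{heart or king}) = \frac{13}{52} + \frac{4}{52} - \frac{1}{52} = \frac{16}{52} = \frac{4}{13} \approx 0.308$.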
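One way to convince yourself of the addition rule is to compare the formula with a direct count over all 52 cards. The sketch below does both; the deck representation and the name `p_a_or_b` are illustrative choices.

```python
from fractions import Fraction


def p_a_or_b(p_a, p_b, p_a_and_b=0):
    """Addition rule; leave p_a_and_b at 0 for disjoint events."""
    return p_a + p_b - p_a_and_b


# Formula approach for "heart or king"
by_formula = p_a_or_b(Fraction(13, 52), Fraction(4, 52), Fraction(1, 52))

# Counting approach: build the deck and count the cards that qualify
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]
favorable = sum(1 for rank, suit in deck if suit == "hearts" or rank == "K")
by_counting = Fraction(favorable, len(deck))

print(by_formula)                  # 4/13
print(by_counting == by_formula)   # True
```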
Compound Events: "And" and the Multiplication Property
Multiplication Rule for Probability
Event "A and B": Both events A and B occur.
Independent Events: The occurrence of one does not affect the probability of the other.
Dependent Events: The occurrence of one affects the probability of the other.
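Conditional Probability: $P(B \mid A)$ is the probability that B occurs given A has occurred.
Multiplication Rule: $P(A \text{ and } B) = P(A) \cdot P(B \mid A)$; for independent events this reduces to $P(A) \cdot P(B)$.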
Sampling with Replacement: Each selection is independent.
Sampling without Replacement: Selections are dependent, but when the sample size is no more than 5% of the population they can be treated as independent (5% guideline).
Redundancy: Using multiple components to reduce the probability of system failure.
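Example: Probability of drawing two aces in a row without replacement from a deck: $P(\text{first ace}) = \frac{4}{52}$, $P(\text{second ace} \mid \text{first ace}) = \frac{3}{51}$, so $P(\text{both aces}) = \frac{4}{52} \cdot \frac{3}{51} = \frac{1}{221} \approx 0.0045$.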
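The two-aces calculation can be reproduced with exact fractions, and contrasted with the with-replacement case where the draws are independent. This is a sketch only; the function name `p_both` is illustrative.

```python
from fractions import Fraction


def p_both(p_first, p_second_given_first):
    """Multiplication rule: P(A and B) = P(A) * P(B | A)."""
    return p_first * p_second_given_first


# Without replacement: the second draw depends on the first
without_replacement = p_both(Fraction(4, 52), Fraction(3, 51))

# With replacement: the deck is restored, so the draws are independent
with_replacement = p_both(Fraction(4, 52), Fraction(4, 52))

print(without_replacement)   # 1/221
print(with_replacement)      # 1/169
```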
Complements and Conditional Probabilities
Probability of "At Least Once"
Formula: $P(\text{at least one}) = 1 - P(\text{none})$
Conditional Probability: $P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}$
Interpretation: The probability of B occurring given that A has occurred.
Distinguishing P(B|A) and P(A|B): The order matters; they answer different questions.
Example: If the probability of no defective items in a sample of 3 is 0.729, then the probability of at least one defective is $1 - 0.729 = 0.271$.
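Both formulas in this section are short enough to check in code. In the sketch below, the assumption that each item is defective with probability 0.1 independently of the others is an illustration only (it happens to give $0.9^3 = 0.729$); the source states just the 0.729 figure. The conditional-probability check reuses the two-aces numbers, and the function names are hypothetical.

```python
from fractions import Fraction


def p_at_least_one(p_none):
    """Complement rule: P(at least one) = 1 - P(none)."""
    return 1 - p_none


def p_none_in_n_trials(p_event, n):
    """P(event never occurs in n independent trials)."""
    return (1 - p_event) ** n


def conditional_probability(p_a_and_b, p_a):
    """P(B | A) = P(A and B) / P(A)."""
    return p_a_and_b / p_a


# Study guide example: P(no defectives) = 0.729
print(round(p_at_least_one(0.729), 3))   # 0.271

# Assumed model: 3 independent items, each defective with probability 0.1
print(round(p_none_in_n_trials(0.1, 3), 3))   # 0.729

# P(second ace | first ace) recovered from P(both aces) and P(first ace)
print(conditional_probability(Fraction(1, 221), Fraction(4, 52)))   # 1/17, equal to 3/51
```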
Probability Distributions
Probability Distribution Table
Valid Probability Distribution: All probabilities are between 0 and 1, and the sum is 1.
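Mean (Expected Value): $\mu = \sum [x \cdot P(x)]$
Variance: $\sigma^2 = \sum [(x - \mu)^2 \cdot P(x)] = \sum [x^2 \cdot P(x)] - \mu^2$
Standard Deviation: $\sigma = \sqrt{\sum [x^2 \cdot P(x)] - \mu^2}$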
Significant Values: Use the Range Rule of Thumb to determine if a value is significantly high or low.
Example Table:
| x | P(x) |
|---|---|
| 0 | 0.2 |
| 1 | 0.5 |
| 2 | 0.3 |
Check: $0.2 + 0.5 + 0.3 = 1$ (valid distribution). Mean: $\mu = 0(0.2) + 1(0.5) + 2(0.3) = 1.1$
Significantly High/Low Values: A value is significantly high if $x \ge \mu + 2\sigma$; significantly low if $x \le \mu - 2\sigma$.
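The whole workflow for a probability distribution table (validity check, mean, variance, standard deviation, significance cutoffs) can be sketched in Python. The distribution below is the example table with x = 0, 1, 2; the function names and the dictionary representation are illustrative choices.

```python
import math


def is_valid_distribution(dist, tol=1e-9):
    """Every probability lies in [0, 1] and the probabilities sum to 1."""
    probs = dist.values()
    return all(0 <= p <= 1 for p in probs) and abs(sum(probs) - 1) < tol


def mean(dist):
    """Expected value: sum of x * P(x)."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Sum of (x - mu)^2 * P(x)."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


def std_dev(dist):
    return math.sqrt(variance(dist))


def significance_limits(dist):
    """Return (mu - 2*sigma, mu + 2*sigma)."""
    mu, sigma = mean(dist), std_dev(dist)
    return mu - 2 * sigma, mu + 2 * sigma


dist = {0: 0.2, 1: 0.5, 2: 0.3}
print(is_valid_distribution(dist))   # True
print(round(mean(dist), 2))          # 1.1
print(round(variance(dist), 2))      # 0.49
print(round(std_dev(dist), 2))       # 0.7

low, high = significance_limits(dist)
print(round(low, 2), round(high, 2))   # -0.3 2.5
```

With cutoffs of about -0.3 and 2.5, none of the values 0, 1, or 2 in this table is significantly low or high.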
Additional info: For all calculations, technology (such as calculators or statistical software) may be used to compute quartiles, five-number summaries, and probabilities as appropriate.