Probability, Distributions, and Confidence Intervals: Study Notes for Statistics
Probability Concepts and Applications
Random Number Tables and Empirical Probability
Random number tables are used to simulate random events, such as guessing answers on a multiple-choice test. Probability can be determined theoretically or empirically.
Theoretical Probability: Calculated from the known set of equally likely outcomes. For a question with 4 choices, the probability of guessing correctly is $\frac{1}{4} = 0.25$.
Empirical Probability: Determined by conducting an experiment or simulation and observing the frequency of outcomes.
Example: Using a random number table to simulate guessing on a test, count the number of correct guesses and divide by the total number of questions to estimate empirical probability.
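The simulation described above can be sketched in Python. This is a minimal illustration, not the method from the course materials: it swaps the random number table for Python's pseudo-random generator, and the function name `simulate_guessing`, the seed, and the number of questions are all assumptions chosen for the example.

```python
import random

def simulate_guessing(num_questions, num_choices=4, seed=0):
    """Estimate the probability of a correct guess by simulation."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    correct = 0
    for _ in range(num_questions):
        answer = rng.randrange(num_choices)  # hypothetical correct choice
        guess = rng.randrange(num_choices)   # uniformly random guess
        if guess == answer:
            correct += 1
    # Empirical probability = correct guesses / total questions
    return correct / num_questions

theoretical = 1 / 4
empirical = simulate_guessing(num_questions=10_000)
print(f"theoretical = {theoretical:.3f}, empirical = {empirical:.3f}")
```

With only a handful of questions the estimate can land far from 0.25; it tends to settle near the theoretical value as the number of simulated questions grows.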
Probability with Two-Way Tables
Two-way tables summarize categorical data and allow calculation of probabilities for combined events.
Key Terms: Marginal totals, joint probability, conditional probability.
Example Table:
| | Male | Female | Total |
|---|---|---|---|
| Undergraduate | 1532 | 1485 | 3017 |
| Graduate | 458 | 512 | 970 |
| Total | 1990 | 1997 | 3987 |
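P(Male): $\frac{1990}{3987} \approx 0.499$
P(Graduate): $\frac{970}{3987} \approx 0.243$
P(Male and Graduate): $\frac{458}{3987} \approx 0.115$
P(Male or Graduate): $\frac{1990 + 970 - 458}{3987} = \frac{2502}{3987} \approx 0.628$ (subtract the 458 male graduate students so they are not counted twice)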
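The four probabilities listed above can be computed directly from the table counts. The sketch below is one way to do it; the dictionary layout and the helper name `prob` are illustrative choices, not part of the original notes.

```python
from fractions import Fraction

# Cell counts copied from the two-way table (row label, column label).
counts = {
    ("Undergraduate", "Male"): 1532,
    ("Undergraduate", "Female"): 1485,
    ("Graduate", "Male"): 458,
    ("Graduate", "Female"): 512,
}
total = sum(counts.values())  # grand total of the table

def prob(event):
    """P(event), where event is a true/false test on (level, sex)."""
    favorable = sum(n for key, n in counts.items() if event(*key))
    return Fraction(favorable, total)

p_male = prob(lambda level, sex: sex == "Male")
p_grad = prob(lambda level, sex: level == "Graduate")
p_male_and_grad = prob(lambda level, sex: sex == "Male" and level == "Graduate")
p_male_or_grad = prob(lambda level, sex: sex == "Male" or level == "Graduate")

for name, p in [
    ("P(Male)", p_male),
    ("P(Graduate)", p_grad),
    ("P(Male and Graduate)", p_male_and_grad),
    ("P(Male or Graduate)", p_male_or_grad),
]:
    print(f"{name} = {float(p):.3f}")
```

The "or" probability agrees with the addition rule, P(Male) + P(Graduate) - P(Male and Graduate), because the male graduate students would otherwise be counted twice.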
Probability with Dice
Calculating Probabilities for Sums and Events
When rolling two dice, each die has 6 faces, resulting in 36 possible outcomes. Probabilities can be calculated for specific sums or combinations.
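P(sum = 5): The outcomes (1,4), (2,3), (3,2), and (4,1) give a sum of 5, so the probability is $\frac{4}{36} = \frac{1}{9}$.
P(sum is 5 or sum is 6): The two events cannot happen on the same roll, so add their probabilities: $\frac{4}{36} + \frac{5}{36} = \frac{9}{36} = \frac{1}{4}$.
P(sum is 7 and one die is 3): Count outcomes where one die is 3 and the sum is 7. Only (3,4) and (4,3) qualify, so the probability is $\frac{2}{36} = \frac{1}{18}$.
P(sum is 5 and one die is 6): Count outcomes where one die is 6 and the sum is 5. None exist, because the other die would have to show a negative number, so the probability is 0.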
Mutually Exclusive Events: Events that cannot occur simultaneously (e.g., sum is 5 and sum is 6).
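Each of the dice probabilities above comes from counting outcomes among the 36 equally likely ordered pairs. A short enumeration, sketched here with an illustrative helper named `prob`, makes the counting explicit.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes (first die, second die).
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Fraction of the 36 outcomes for which event(a, b) is true."""
    favorable = sum(1 for a, b in outcomes if event(a, b))
    return Fraction(favorable, len(outcomes))

p_sum5 = prob(lambda a, b: a + b == 5)
p_sum5_or_6 = prob(lambda a, b: a + b in (5, 6))
p_sum7_and_3 = prob(lambda a, b: a + b == 7 and 3 in (a, b))
p_sum5_and_6 = prob(lambda a, b: a + b == 5 and 6 in (a, b))

# A sum of 5 and a sum of 6 are mutually exclusive,
# so the "or" probability is the plain sum of the two.
p_sum6 = prob(lambda a, b: a + b == 6)
print(p_sum5, p_sum5_or_6, p_sum7_and_3, p_sum5_and_6)
print(p_sum5_or_6 == p_sum5 + p_sum6)
```

`Fraction` prints values in lowest terms, so 4/36 appears as 1/9 and 9/36 as 1/4.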
Empirical Probability with Dice Simulations
Empirical probability is estimated by simulating dice rolls and observing frequencies.
Example: Simulate 10 or 36 rolls and calculate the proportion of times a sum of 12 occurs.
Theoretical Probability: For a sum of 12, only (6,6) is possible, so the probability is $\frac{1}{36} \approx 0.028$.
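The comparison between empirical and theoretical probability can be tried with a small simulation. This is a sketch under stated assumptions: it uses Python's pseudo-random generator in place of physical dice, and the function name `estimate_sum12`, the seed, and the largest run size are illustrative choices.

```python
import random

def estimate_sum12(num_rolls, seed=0):
    """Proportion of simulated two-dice rolls whose sum is 12."""
    rng = random.Random(seed)  # seeded so results are repeatable
    hits = sum(
        1
        for _ in range(num_rolls)
        if rng.randint(1, 6) + rng.randint(1, 6) == 12
    )
    return hits / num_rolls

theoretical = 1 / 36
for n in (10, 36, 100_000):
    print(f"{n:>7} rolls: empirical = {estimate_sum12(n):.4f}, "
          f"theoretical = {theoretical:.4f}")
```

With 10 or 36 rolls a sum of 12 may never appear, which gives an estimate of 0; the estimate only becomes reliable with many more rolls.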
Probability Distribution for a Game
A probability distribution lists all possible outcomes of a random experiment and their probabilities.
Example Game: Roll two dice. If the sum is less than 4, win 10 points. If the sum is 4, 5, or 6, win 3 points. If the sum is 7, lose 2 points. If the sum is greater than 7, lose 1 point.
Probability Distribution Table:
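| Sum | Points | Probability |
|---|---|---|
| 2, 3 | 10 | $\frac{3}{36} = \frac{1}{12}$ |
| 4, 5, 6 | 3 | $\frac{12}{36} = \frac{1}{3}$ |
| 7 | -2 | $\frac{6}{36} = \frac{1}{6}$ |
| 8, 9, 10, 11, 12 | -1 | $\frac{15}{36} = \frac{5}{12}$ |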
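The probability column for this game can be derived by enumerating the 36 outcomes and grouping them by points awarded. The sketch below assumes the game rules exactly as stated in the example; the function name `points` is an illustrative choice.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def points(total):
    """Points awarded for a given sum, following the game rules above."""
    if total < 4:
        return 10
    if total <= 6:
        return 3
    if total == 7:
        return -2
    return -1  # sums greater than 7

# Add 1/36 to the matching points value for every ordered outcome.
distribution = defaultdict(Fraction)
for a, b in product(range(1, 7), repeat=2):
    distribution[points(a + b)] += Fraction(1, 36)

for value in sorted(distribution, reverse=True):
    print(f"{value:>3} points: {distribution[value]}")

# A valid probability distribution must sum to 1.
print("total probability:", sum(distribution.values()))
```

Checking that the probabilities sum to 1 is a quick way to confirm that every outcome was assigned to exactly one row.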
Normal Distribution and Z-Scores
Understanding the Normal Distribution
The normal distribution is a continuous probability distribution that is symmetric about the mean. Z-scores measure how many standard deviations an observation is from the mean.
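Key Formula: $z = \frac{x - \mu}{\sigma}$, where $x$ is the observation, $\mu$ is the mean, and $\sigma$ is the standard deviation.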
Probability Calculations: Use standard normal tables or software to find probabilities for ranges of z-scores.
Example: Probability that z < -2.27, or that z is between -0.52 and 0.77.
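The two example probabilities can be found with software instead of a printed table. The sketch below uses `statistics.NormalDist` from the Python standard library, whose `cdf` method gives the area to the left of a value; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area to the left of z = -2.27.
p_below = std_normal.cdf(-2.27)

# Area between two z-scores = left area at the upper bound
# minus left area at the lower bound.
p_between = std_normal.cdf(0.77) - std_normal.cdf(-0.52)

print(f"P(z < -2.27)        = {p_below:.4f}")
print(f"P(-0.52 < z < 0.77) = {p_between:.4f}")
```

The results should match a standard normal table to about four decimal places, roughly 0.0116 and 0.4778.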
Applications to Real Data
The normal distribution can be used to model real-world data, such as the weights of animals.
Example: The mean weight of a baby hippo is 88 pounds, and the standard deviation is 10 pounds.
Probability Calculations:
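Probability weight is between 90 and 110 pounds: Find the z-scores for 90 and 110 ($z = 0.2$ and $z = 2.2$), then use the normal table.
Percentile for 80 pounds: Find the z-score ($z = -0.8$), then read the percentile from the normal table.
Probability weight is less than 40 pounds: Find the z-score for 40 ($z = -4.8$), then find the probability from the normal table. It is essentially 0, because 40 pounds is far below the mean.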
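The three hippo calculations can be sketched in Python, assuming weights are normally distributed with the stated mean of 88 pounds and standard deviation of 10 pounds. The helper `z_score` and the variable names are illustrative.

```python
from statistics import NormalDist

weights = NormalDist(mu=88, sigma=10)  # model from the example

def z_score(x):
    """Number of standard deviations x lies from the mean."""
    return (x - weights.mean) / weights.stdev

# Probability the weight is between 90 and 110 pounds.
p_90_110 = weights.cdf(110) - weights.cdf(90)

# Percentile for 80 pounds: percent of weights below 80.
percentile_80 = weights.cdf(80) * 100

# Probability the weight is less than 40 pounds.
p_below_40 = weights.cdf(40)

print(f"z(90) = {z_score(90):.1f}, z(110) = {z_score(110):.1f}")
print(f"P(90 < weight < 110) = {p_90_110:.4f}")
print(f"80 pounds is at about the {percentile_80:.0f}th percentile")
print(f"P(weight < 40) = {p_below_40:.2e}")
```

The last probability is so small that a printed normal table would show it as 0, which is why software is convenient for extreme z-scores.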
Confidence Intervals
Constructing and Interpreting Confidence Intervals
A confidence interval estimates a population parameter (such as mean or proportion) with a specified level of confidence.
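Key Formula for Proportion: $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$, where $\hat{p}$ is the sample proportion, $n$ is the sample size, and $z^*$ is the critical value for the chosen confidence level (about 1.96 for 95%).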
Conditions: Random sample, independent observations, and a large enough sample ($n\hat{p} \ge 10$ and $n(1 - \hat{p}) \ge 10$).
Interpretation: "We are 95% confident that the true proportion lies within the interval."
Increasing Precision: Increase sample size to decrease margin of error.
Changing Confidence Level: Lower confidence level results in a narrower interval; higher confidence level results in a wider interval.
Example: If you take 100 samples and calculate 95% confidence intervals, about 95 intervals are expected to contain the true parameter.
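The points above about sample size, confidence level, and repeated sampling can be explored with a short sketch. It assumes the usual normal-approximation interval for a proportion; the function names, the sample proportion of 0.60, the true proportion of 0.5, and the seed are hypothetical values chosen only for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence=0.95):
    """Return (lower, upper) for p_hat +/- z* * sqrt(p_hat(1 - p_hat)/n)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def width(interval):
    return interval[1] - interval[0]

# Hypothetical sample proportion of 0.60, for illustration only.
ci_100 = proportion_ci(0.60, 100)           # baseline
ci_400 = proportion_ci(0.60, 400)           # larger sample: narrower
ci_100_99 = proportion_ci(0.60, 100, 0.99)  # higher confidence: wider
print(f"n=100, 95%: width {width(ci_100):.3f}")
print(f"n=400, 95%: width {width(ci_400):.3f}")
print(f"n=100, 99%: width {width(ci_100_99):.3f}")

def coverage(true_p=0.5, n=200, num_samples=100, seed=0):
    """Count how many simulated 95% intervals contain true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        successes = sum(rng.random() < true_p for _ in range(n))
        lower, upper = proportion_ci(successes / n, n)
        hits += lower <= true_p <= upper
    return hits

print(coverage(), "of 100 intervals contain the true proportion")
```

Quadrupling the sample size halves the width, since the margin of error shrinks in proportion to $1/\sqrt{n}$. The coverage count varies from run to run and is usually close to, but not exactly, 95.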
Comparing Proportions Between Groups
Confidence intervals can be used to compare proportions between two groups, such as satisfaction rates in different states.
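Key Formula for Difference in Proportions: $(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$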
Interpretation: If the confidence interval for the difference does not include 0, there is evidence of a difference between groups.
Example Table:
| Group | Sample Size | Proportion Satisfied | Confidence Interval |
|---|---|---|---|
| Texas | 324 | 0.62 | (0.548, 0.692) |
| Kansas | 300 | 0.75 | (0.697, 0.803) |
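Conclusion: The two intervals do not overlap (the Texas interval ends at 0.692 and the Kansas interval starts at 0.697), so there is a significant difference in satisfaction rates. Overlapping intervals would not by themselves rule out a difference; the confidence interval for the difference in proportions is the more direct check.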
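The difference between the two groups can also be checked directly with an interval for the difference in proportions. The sketch below uses the sample sizes and proportions from the table and assumes a 95% confidence level, which the table itself does not state; the function name `diff_proportion_ci` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportion_ci(p1, n1, p2, n2, confidence=0.95):
    """Normal-approximation interval for p1 - p2."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Standard error combines the sampling variability of both groups.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# Kansas minus Texas, using the values from the table.
lower, upper = diff_proportion_ci(0.75, 300, 0.62, 324)
print(f"95% CI for the difference: ({lower:.3f}, {upper:.3f})")
print("interval excludes 0:", not (lower <= 0 <= upper))
```

Under these assumptions the whole interval lies above 0, which is consistent with a higher satisfaction rate in Kansas.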
Central Limit Theorem (CLT)
Sampling Distributions and CLT
The Central Limit Theorem states that the sampling distribution of the sample mean (or proportion) approaches a normal distribution as the sample size increases, regardless of the shape of the population's distribution, provided the population has a finite variance.
Key Points:
Sample size should be large (n > 30 is a common rule).
Allows use of normal probability methods for inference.
Example: Comparing full-time student proportions between years using confidence intervals.
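The theorem can be illustrated by drawing many samples from a clearly non-normal population and looking at how the sample means behave. This is a sketch under stated assumptions: the population is an exponential distribution with mean 1 and standard deviation 1, and the sample size, number of samples, and seed are illustrative choices.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)  # seeded so the run is repeatable
n = 40                  # size of each sample
num_samples = 5000      # number of samples drawn

# Each entry is the mean of one sample from a right-skewed population.
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

center = mean(sample_means)
spread = stdev(sample_means)
expected_spread = 1 / sqrt(n)  # population sd divided by sqrt(n)

# For a normal distribution, about 95% of values fall within
# 1.96 standard deviations of the center.
share_within = sum(
    abs(m - 1.0) <= 1.96 * expected_spread for m in sample_means
) / num_samples

print(f"mean of sample means = {center:.3f} (population mean 1)")
print(f"sd of sample means   = {spread:.3f} (expected {expected_spread:.3f})")
print(f"share within 1.96 sd = {share_within:.3f}")
```

Even though individual values from this population are strongly skewed, the sample means cluster around the population mean with a spread close to $\sigma / \sqrt{n}$, which is what justifies normal-based inference.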
Summary Table: Probability Types
| Type | Definition | Example |
|---|---|---|
| Theoretical | Based on known possible outcomes | Probability of rolling a sum of 7 with two dice: $\frac{6}{36} = \frac{1}{6}$ |
| Empirical | Based on observed data from experiments | Simulate 36 dice rolls and count how many times the sum is 7 |
Key Formulas
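Probability: $P(A) = \frac{\text{number of outcomes in } A}{\text{total number of equally likely outcomes}}$
Z-score: $z = \frac{x - \mu}{\sigma}$
Confidence Interval for Proportion: $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
Difference in Proportions: $(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$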
Additional info: Some explanations and tables have been expanded for clarity and completeness based on standard statistics curriculum.