Probability, Distributions, and Confidence Intervals: Study Notes for Statistics

Probability Concepts and Applications

Random Number Tables and Empirical Probability

Random number tables are used to simulate random events, such as guessing answers on a multiple-choice test. Probability can be determined theoretically or empirically.

Theoretical Probability: Calculated from the known possible outcomes. For a question with 4 choices, the probability of guessing correctly is $1/4 = 0.25$.
Empirical Probability: Determined by conducting an experiment or simulation and observing the frequency of outcomes.
Example: Using a random number table to simulate guessing on a test, count the number of correct guesses and divide by the total number of questions to estimate empirical probability.
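
The simulation described above can be sketched in Python. This is a minimal sketch, not the method from the course materials: Python's pseudo-random generator stands in for a printed random number table, and the function name `simulate_guessing` and the seed value are illustrative assumptions.

```python
import random

def simulate_guessing(num_questions, num_choices=4, seed=None):
    """Estimate the probability of a correct guess by simulation."""
    rng = random.Random(seed)  # seeded so the run can be repeated
    correct = 0
    for _ in range(num_questions):
        answer = rng.randrange(num_choices)  # position of the correct choice
        guess = rng.randrange(num_choices)   # a blind guess
        if guess == answer:
            correct += 1
    # Empirical probability: correct guesses divided by total questions
    return correct / num_questions

theoretical = 1 / 4
empirical = simulate_guessing(10_000, seed=1)
print(f"theoretical = {theoretical:.3f}, empirical = {empirical:.3f}")
```

With only a handful of questions the empirical value can sit far from 0.25; it tends to settle near the theoretical value as the number of simulated questions grows.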

Probability with Two-Way Tables

Two-way tables summarize categorical data and allow calculation of probabilities for combined events.

Key Terms: Marginal totals, joint probability, conditional probability.
Example Table:

	Male	Female	Total
Undergraduate	1532	1485	3017
Graduate	458	512	970
Total	1990	1997	3987

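P(Male): $1990/3987 \approx 0.499$
P(Graduate): $970/3987 \approx 0.243$
P(Male and Graduate): $458/3987 \approx 0.115$
P(Male or Graduate): $(1990 + 970 - 458)/3987 = 2502/3987 \approx 0.628$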
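
The four probabilities listed above, plus one conditional probability, can be computed directly from the table counts. This is a small sketch; the dictionary layout and variable names are illustrative choices, while the counts come from the table.

```python
# Cell counts from the two-way table: (level, sex) -> count
counts = {
    ("Undergraduate", "Male"): 1532,
    ("Undergraduate", "Female"): 1485,
    ("Graduate", "Male"): 458,
    ("Graduate", "Female"): 512,
}

total = sum(counts.values())

# Marginal totals: add the cells in one column or one row
male = sum(n for (level, sex), n in counts.items() if sex == "Male")
graduate = sum(n for (level, sex), n in counts.items() if level == "Graduate")
male_and_grad = counts[("Graduate", "Male")]

p_male = male / total
p_grad = graduate / total
p_both = male_and_grad / total            # joint probability
p_either = p_male + p_grad - p_both       # addition rule: subtract the overlap
p_male_given_grad = male_and_grad / graduate  # conditional probability

print(f"P(Male) = {p_male:.3f}")
print(f"P(Graduate) = {p_grad:.3f}")
print(f"P(Male and Graduate) = {p_both:.3f}")
print(f"P(Male or Graduate) = {p_either:.3f}")
print(f"P(Male | Graduate) = {p_male_given_grad:.3f}")
```

The "or" probability subtracts the joint probability because male graduate students are counted in both marginal totals.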

Probability with Dice

Calculating Probabilities for Sums and Events

When rolling two dice, each die has 6 faces, resulting in 36 possible outcomes. Probabilities can be calculated for specific sums or combinations.

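P(sum = 5): Number of outcomes where the sum is 5 divided by 36. The outcomes are (1,4), (2,3), (3,2), and (4,1), so $P = 4/36 = 1/9$.
P(sum is 5 or sum is 6): The two events cannot happen together, so add their probabilities: $4/36 + 5/36 = 9/36 = 1/4$.
P(sum is 7 and one die is 3): Count outcomes where one die is 3 and the sum is 7. These are (3,4) and (4,3), so $P = 2/36 = 1/18$.
P(sum is 5 and one die is 6): Count outcomes where one die is 6 and the sum is 5. There are none, because a 6 on one die makes the sum at least 7, so $P = 0$.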
Mutually Exclusive Events: Events that cannot occur simultaneously (e.g., sum is 5 and sum is 6).
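
The counting in the list above can be checked by enumerating all 36 outcomes. This is a minimal sketch; the helper `prob` is an illustrative name, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (die1, die2) outcomes
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a true/false test on one roll."""
    favorable = sum(1 for roll in outcomes if event(roll))
    return Fraction(favorable, len(outcomes))

p_sum5 = prob(lambda r: sum(r) == 5)
p_sum5_or_6 = prob(lambda r: sum(r) in (5, 6))
p_sum7_and_3 = prob(lambda r: sum(r) == 7 and 3 in r)
p_sum5_and_6 = prob(lambda r: sum(r) == 5 and 6 in r)

print(p_sum5, p_sum5_or_6, p_sum7_and_3, p_sum5_and_6)
```

Because "sum is 5" and "sum is 6" are mutually exclusive, the value for the "or" event equals the sum of the two separate probabilities.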

Empirical Probability with Dice Simulations

Empirical probability is estimated by simulating dice rolls and observing frequencies.

Example: Simulate 10 or 36 rolls and calculate the proportion of times a sum of 12 occurs.
Theoretical Probability: For a sum of 12, only (6,6) is possible: $1/36 \approx 0.028$.
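
A short simulation shows how the empirical estimate behaves for small and large numbers of rolls. This is a sketch under assumptions: the function name `proportion_sum_12`, the seed, and the large run of 100,000 rolls are illustrative additions, not part of the original exercise.

```python
import random

def proportion_sum_12(num_rolls, seed=None):
    """Roll two dice num_rolls times; return the share of rolls summing to 12."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_rolls):
        if rng.randint(1, 6) + rng.randint(1, 6) == 12:
            hits += 1
    return hits / num_rolls

theoretical = 1 / 36
for n in (10, 36, 100_000):
    estimate = proportion_sum_12(n, seed=0)
    print(f"{n:>7} rolls: empirical = {estimate:.4f}, theoretical = {theoretical:.4f}")
```

With 10 or 36 rolls the estimate is often exactly 0, since a sum of 12 is expected only about once in 36 rolls; many more rolls are needed before the proportion settles near 1/36.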

Probability Distribution for a Game

A probability distribution lists all possible outcomes of a random experiment and their probabilities.

Example Game: Roll two dice. If the sum is less than 4, win 10 points. If the sum is 4, 5, or 6, win 3 points. If the sum is 7, lose 2 points. If the sum is greater than 7, lose 1 point.
Probability Distribution Table:

Sum	Points	Probability
2, 3	10	3/36 = 1/12
4, 5, 6	3	12/36 = 1/3
7	-2	6/36 = 1/6
8, 9, 10, 11, 12	-1	15/36 = 5/12
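
The probability column for this game can be built by enumerating the 36 outcomes and grouping them by points. This is a minimal sketch; the function name `points` is an illustrative choice, and the scoring rules are taken from the game description.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def points(total):
    """Points awarded for a given sum of two dice, per the game rules."""
    if total < 4:
        return 10
    if total <= 6:
        return 3
    if total == 7:
        return -2
    return -1  # any sum greater than 7

# Accumulate 1/36 for every outcome into its points category
distribution = defaultdict(Fraction)
for die1, die2 in product(range(1, 7), repeat=2):
    distribution[points(die1 + die2)] += Fraction(1, 36)

for pts, p in sorted(distribution.items(), reverse=True):
    print(f"{pts:>3} points: {p}")

# A valid probability distribution sums to 1
print("total:", sum(distribution.values()))
```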

Normal Distribution and Z-Scores

Understanding the Normal Distribution

The normal distribution is a continuous probability distribution that is symmetric about the mean. Z-scores measure how many standard deviations an observation is from the mean.

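Key Formula: $z = \dfrac{x - \mu}{\sigma}$, where $x$ is the observation, $\mu$ is the mean, and $\sigma$ is the standard deviation.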
Probability Calculations: Use standard normal tables or software to find probabilities for ranges of z-scores.
Example: Probability that z < -2.27, or that z is between -0.52 and 0.77.
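
As a software alternative to the normal table, the two example probabilities can be looked up with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch; the helper `z_score` and the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# P(z < -2.27): area to the left of -2.27
p_below = std_normal.cdf(-2.27)

# P(-0.52 < z < 0.77): area to the left of 0.77 minus area to the left of -0.52
p_between = std_normal.cdf(0.77) - std_normal.cdf(-0.52)

print(f"P(z < -2.27) = {p_below:.4f}")
print(f"P(-0.52 < z < 0.77) = {p_between:.4f}")
```

The `cdf` method returns the area to the left of a value, which is the same quantity a standard normal table reports.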

Applications to Real Data

The normal distribution can be used to model real-world data, such as the weights of animals.

Example: The mean weight of a baby hippo is 88 pounds, with a standard deviation of 10 pounds.
Probability Calculations:
- Probability weight is between 90 and 110 pounds: Find z-scores for 90 and 110, then use normal table.
- Percentile for 80 pounds: Find z-score, then percentile from normal table.
- Probability weight is less than 40 pounds: Find z-score for 40, then probability from normal table.
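
The three hippo calculations can be sketched with the standard library, following the same steps as the table method: convert to a z-score, then take the area to the left. The names `MEAN`, `SD`, and `z` are illustrative, and the weights are assumed to be normally distributed as stated.

```python
from statistics import NormalDist

MEAN, SD = 88, 10          # baby hippo weight in pounds
std_normal = NormalDist()  # standard normal distribution

def z(x):
    """z-score of a weight x."""
    return (x - MEAN) / SD

# Between 90 and 110 pounds: z-scores 0.2 and 2.2
p_90_110 = std_normal.cdf(z(110)) - std_normal.cdf(z(90))

# Percentile for 80 pounds: z = -0.8, percentile = area to the left, as a percent
percentile_80 = 100 * std_normal.cdf(z(80))

# Less than 40 pounds: z = -4.8, far in the left tail
p_below_40 = std_normal.cdf(z(40))

print(f"P(90 < weight < 110) = {p_90_110:.4f}")
print(f"80 pounds is at about the {percentile_80:.0f}th percentile")
print(f"P(weight < 40) = {p_below_40:.2e}")
```

A z-score of -4.8 falls below the range of most printed normal tables, so the table answer for the last case is simply "approximately 0".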

Confidence Intervals

Constructing and Interpreting Confidence Intervals

A confidence interval estimates a population parameter (such as mean or proportion) with a specified level of confidence.

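Key Formula for Proportion: $\hat{p} \pm z^{*}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$, where $\hat{p}$ is the sample proportion, $n$ is the sample size, and $z^{*}$ is the critical value for the chosen confidence level (about 1.96 for 95%).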
Conditions: Random sample, independence, large sample size (np > 10, n(1-p) > 10).
Interpretation: "We are 95% confident that the true proportion lies within the interval."
Increasing Precision: Increase sample size to decrease margin of error.
Changing Confidence Level: Lower confidence level results in a narrower interval; higher confidence level results in a wider interval.
Example: If you take 100 samples and calculate 95% confidence intervals, about 95 intervals are expected to contain the true parameter.
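
The points above about sample size and confidence level can be illustrated with a small function for the one-proportion interval. This is a sketch of the standard large-sample interval; the function name `proportion_ci` and the sample of 120 successes out of 200 are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    # Critical value z*: leaves (1 - confidence) / 2 in each tail
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical sample: 120 successes out of 200
low, high = proportion_ci(120, 200)
print(f"95% CI, n = 200:  ({low:.3f}, {high:.3f})")

# Same proportion with ten times the sample size: narrower interval
low_big, high_big = proportion_ci(1200, 2000)
print(f"95% CI, n = 2000: ({low_big:.3f}, {high_big:.3f})")

# Same sample at different confidence levels: higher confidence, wider interval
for level in (0.90, 0.95, 0.99):
    lo, hi = proportion_ci(120, 200, confidence=level)
    print(f"{level:.0%} CI: width = {hi - lo:.3f}")
```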

Comparing Proportions Between Groups

Confidence intervals can be used to compare proportions between two groups, such as satisfaction rates in different states.

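Key Formula for Difference in Proportions: $(\hat{p}_1 - \hat{p}_2) \pm z^{*}\sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$, where $\hat{p}_1, \hat{p}_2$ are the sample proportions and $n_1, n_2$ are the sample sizes of the two groups.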
Interpretation: If the confidence interval for the difference does not include 0, there is evidence of a difference between groups.
Example Table:

Group	Sample Size	Proportion Satisfied	Confidence Interval
Texas	324	0.62	(0.548, 0.692)
Kansas	300	0.75	(0.697, 0.803)

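Conclusion: The two intervals do not overlap (the Texas interval ends at 0.692 and the Kansas interval starts at 0.697), which indicates a significant difference in satisfaction rates. Overlapping intervals would not by themselves rule out a difference; the interval for the difference in proportions is the more direct check.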
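
A confidence interval for the difference itself can be sketched from the sample sizes and proportions in the table. The 95% level is an assumption (the table does not state the level behind its intervals), and the function name `diff_proportion_ci` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportion_ci(p1, n1, p2, n2, confidence=0.95):
    """Large-sample confidence interval for p1 - p2."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Standard error of the difference: add the two variances, then take the root
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# Kansas minus Texas, using the table's sample sizes and proportions
low, high = diff_proportion_ci(0.75, 300, 0.62, 324)
print(f"95% CI for the difference: ({low:.3f}, {high:.3f})")
print("Interval excludes 0:", not (low <= 0 <= high))
```

Under these assumptions the whole interval lies above 0, which points to a higher satisfaction rate in Kansas than in Texas.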

Central Limit Theorem (CLT)

Sampling Distributions and CLT

The Central Limit Theorem states that the sampling distribution of the sample mean (or proportion) approaches a normal distribution as the sample size increases, regardless of the population's distribution.

Key Points:
- Sample size should be large (n > 30 is a common rule).
- Allows use of normal probability methods for inference.
Example: Comparing full-time student proportions between years using confidence intervals.
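
The Central Limit Theorem can be illustrated by simulation. This is a sketch under assumptions not in the original notes: the population is taken to be exponential with mean 1 (strongly right-skewed), the sample size is 40, and the function name `sample_means` and the seed are illustrative.

```python
import random
from math import sqrt
from statistics import mean, stdev

def sample_means(sample_size, num_samples, seed=None):
    """Draw repeated samples from a skewed population; return their means."""
    rng = random.Random(seed)
    return [
        mean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]

n = 40
means = sample_means(n, 5000, seed=0)

center = mean(means)
spread = stdev(means)
# For a normal distribution, about 68% of values lie within 1 SD of the mean
share_within_1sd = sum(abs(m - center) <= spread for m in means) / len(means)

print(f"mean of sample means = {center:.3f} (population mean is 1)")
print(f"SD of sample means   = {spread:.3f} (theory: 1/sqrt(n) = {1 / sqrt(n):.3f})")
print(f"share within 1 SD    = {share_within_1sd:.3f}")
```

Even though individual values from this population are far from normal, the sample means cluster symmetrically around the population mean with a spread close to the population standard deviation divided by the square root of n.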

Summary Table: Probability Types

Type	Definition	Example
Theoretical	Based on known possible outcomes	Probability of rolling a sum of 7 with two dice: 6/36 = 1/6
Empirical	Based on observed data from experiments	Simulate 36 dice rolls, count how many times sum is 7

Key Formulas

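Probability: $P(A) = \dfrac{\text{number of outcomes in } A}{\text{total number of equally likely outcomes}}$
Z-score: $z = \dfrac{x - \mu}{\sigma}$
Confidence Interval for Proportion: $\hat{p} \pm z^{*}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
Difference in Proportions: $(\hat{p}_1 - \hat{p}_2) \pm z^{*}\sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$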
