(Lecture 13) Probability Distributions: Discrete and Continuous Random Variables
Study Guide - Smart Notes
Probability Distributions
Randomness and Random Variables
Probability theory is foundational to statistics, allowing us to model and analyze random phenomena. A random variable is a numerical measurement of the outcome of a random phenomenon. Randomness in data typically arises from random sampling or randomized experiments.
Random Variable: Denoted by capital letters (e.g., X), it represents the outcome of a random process.
Example: Flipping a coin three times; X could be the number of heads observed.
Possible Values: Lowercase letters (e.g., x) denote specific values the random variable can take.
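The coin-flipping example can be made concrete by listing every outcome and applying X to each one. This is a minimal sketch that assumes a fair coin, so that all eight sequences are equally likely; the names `outcomes` and `number_of_heads` are illustrative only.

```python
from collections import Counter
from itertools import product

# Enumerate all 2^3 = 8 sequences of three coin flips.
# Assumption: the coin is fair, so each sequence is equally likely.
outcomes = list(product("HT", repeat=3))


def number_of_heads(outcome):
    """The random variable X: maps one outcome to the number of heads in it."""
    return outcome.count("H")


# Count how many outcomes lead to each possible value x of X.
counts = Counter(number_of_heads(o) for o in outcomes)

for x in sorted(counts):
    print(f"x = {x}: {counts[x]} of {len(outcomes)} outcomes")
```

Here X is the rule that turns an outcome such as ('H', 'T', 'H') into a number, while each x is one specific value that rule can produce.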
Probability Distribution
A probability distribution of a random variable specifies its possible values and the probability associated with each value. This allows us to predict the likelihood of outcomes over the long run.
Discrete Random Variable: Takes a set of separate values (e.g., 0, 1, 2, ...).
Probability Distribution: For each value x, the probability P(x) satisfies 0 ≤ P(x) ≤ 1, and the sum of all probabilities equals 1.
Example: Best of Seven Series
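Consider a best-of-seven series, in which the first team to win four games takes the series. What is the probability that the series lasts at least six games? The probabilities listed here are the ones obtained for two evenly matched teams whose games are independent.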
| Number of Games x | Probability P(x) |
|---|---|
| 4 | 1/8 = 0.125 |
| 5 | 1/4 = 0.25 |
| 6 | 5/16 = 0.3125 |
| 7 | 5/16 = 0.3125 |
Probability of at least six games: P(6) + P(7) = 5/16 + 5/16 = 10/16 = 0.625.
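The tabulated probabilities can be reproduced with a short calculation. The sketch below assumes each game is independent and that one team wins any given game with probability p (p = 1/2 for evenly matched teams); the function name `series_length_distribution` is hypothetical and not part of the lecture material.

```python
from fractions import Fraction
from math import comb


def series_length_distribution(wins_needed=4, p=Fraction(1, 2)):
    """P(series lasts exactly n games), assuming independent games
    in which one team wins each game with probability p."""
    q = 1 - p
    dist = {}
    for n in range(wins_needed, 2 * wins_needed):
        # The series winner takes game n and exactly wins_needed - 1
        # of the first n - 1 games; either team can be the winner.
        ways = comb(n - 1, wins_needed - 1)
        losses = n - wins_needed
        dist[n] = ways * (p**wins_needed * q**losses + q**wins_needed * p**losses)
    return dist


dist = series_length_distribution()
p_at_least_six = dist[6] + dist[7]

for n, prob in dist.items():
    print(f"P({n}) = {prob} = {float(prob)}")
print("P(at least six games) =", p_at_least_six, "=", float(p_at_least_six))
```

Exact fractions are used so the results can be compared directly with the entries 1/8, 1/4, 5/16, and 5/16.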
Summarizing Probability Distributions
Mean and Standard Deviation
To describe the center and variability of a probability distribution, we use the mean and standard deviation. These are called parameters and are typically denoted by Greek letters.
Mean (μ): Represents the expected value of the distribution.
Standard Deviation (σ): Measures the variability from the mean.
Mean of a Discrete Probability Distribution
The mean of a discrete random variable is calculated as a weighted average:
Formula: μ = Σ x · P(x)
The sum is taken over all possible values of x.
Values of x that are more likely receive greater weight, P(x).
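As a worked illustration of the weighted average, the sketch below applies it to the best-of-seven distribution from the earlier example. The dictionary `series_dist` simply restates that table; the helper name `mean` is an illustrative choice.

```python
from fractions import Fraction

# Number of games x -> probability P(x), copied from the best-of-seven table.
series_dist = {
    4: Fraction(1, 8),
    5: Fraction(1, 4),
    6: Fraction(5, 16),
    7: Fraction(5, 16),
}


def mean(dist):
    """Weighted average: each value x is weighted by its probability P(x)."""
    return sum(x * p for x, p in dist.items())


mu = mean(series_dist)
print("mu =", mu, "=", float(mu))
```

The result, 93/16 = 5.8125 games, is not a possible series length, which shows that the mean is a long-run average and need not be a value the variable can actually take.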
Expected Value
The mean μ is also called the expected value of the random variable. It reflects the long-run average outcome, not necessarily a value that will be observed in a single trial.
Example: Responding to Risk
Suppose you are given $1000 to invest. You must choose between:
A sure gain of $500.
A 0.50 chance of gaining $1000 and a 0.50 chance of gaining nothing.
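Expected gain for strategy 1: μ = 500(1.00) = $500.
Expected gain for strategy 2: μ = 0(0.50) + 1000(0.50) = $500.
Both strategies have the same expected gain; they differ only in how variable the outcome is.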
Risk-Taking Probability Distribution
| Gain ($) | Probability |
|---|---|
| 0 | 0.50 |
| 1000 | 0.50 |
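The long-run interpretation of the expected value can be explored with a small simulation of the risky strategy. This is only a sketch: the function name `simulate_average_gain`, the seed, and the trial counts are arbitrary choices made for illustration.

```python
import random


def simulate_average_gain(n_trials, seed=0):
    """Average gain over n_trials independent plays of the risky strategy:
    gain $1000 with probability 0.50, otherwise gain $0."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    total = 0
    for _ in range(n_trials):
        total += 1000 if rng.random() < 0.5 else 0
    return total / n_trials


for n in (10, 1_000, 100_000):
    print(f"{n:>7} trials: average gain = {simulate_average_gain(n):.2f}")
```

Any single play yields either $0 or $1000, never $500, but the average over many plays should settle near the expected value of $500.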
Standard Deviation of a Probability Distribution
The standard deviation (σ) measures how much the values of the random variable vary around the mean. Larger values of σ indicate greater variability.
Interpretation: σ describes how far values of the random variable fall, on average, from the expected value.
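The notes do not give a formula for σ. The sketch below uses the usual definition for a discrete distribution, σ = sqrt(Σ (x − μ)² · P(x)), which is an addition here and not taken from the notes, and applies it to the two investment strategies. The helper names `mean` and `std_dev` are illustrative.

```python
from math import sqrt


def mean(dist):
    """mu = sum of x * P(x) over all possible values x."""
    return sum(x * p for x, p in dist.items())


def std_dev(dist):
    """sigma = square root of the probability-weighted squared distance from mu.
    Assumption: this is the standard definition; the notes do not state it."""
    mu = mean(dist)
    return sqrt(sum((x - mu) ** 2 * p for x, p in dist.items()))


sure_gain = {500: 1.0}          # strategy 1: gain $500 for certain
gamble = {0: 0.50, 1000: 0.50}  # strategy 2: $0 or $1000, equally likely

for name, dist in (("sure gain", sure_gain), ("gamble", gamble)):
    print(f"{name}: mean = {mean(dist)}, sd = {std_dev(dist)}")
```

Both strategies have a mean of $500, but the sure gain has σ = 0 while the gamble has σ = 500, so σ captures the difference in risk that the mean alone does not show.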
Probability Distributions of Categorical Variables
Categorical Random Variables
While most random variables are quantitative, categorical variables with two categories can be represented numerically (e.g., 0 and 1). For such variables, the mean is the probability of the outcome coded as 1.
Example: Success/failure, yes/no outcomes.
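The claim that the mean of a 0/1-coded variable equals the probability of the outcome coded as 1 follows directly from the weighted-average formula. A minimal sketch, using a hypothetical success probability of 3/10 chosen only for illustration:

```python
from fractions import Fraction

# Hypothetical value for illustration; not taken from the notes.
p_success = Fraction(3, 10)

# Code success as 1 and failure as 0.
binary_dist = {1: p_success, 0: 1 - p_success}

# mu = 1 * P(1) + 0 * P(0); the failure term contributes nothing.
mu = sum(x * p for x, p in binary_dist.items())
print("mean =", mu, "| P(success) =", p_success)
```

Because the failure category is multiplied by 0, only the term 1 · P(1) remains, so the mean equals P(1) for any choice of success probability.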
Continuous Random Variables
Definition and Examples
A continuous random variable has possible values that form an interval, such as time, age, height, or weight. In practice these quantities are recorded in rounded (discrete) form, but conceptually they can take any value within an interval.
Probability Distribution of a Continuous Random Variable
The probability distribution of a continuous random variable is specified by a curve (probability density function) that determines the probability the variable falls within any interval.
Each interval has probability between 0 and 1.
The total area under the curve equals 1.
Probability for an interval: Given by the area under the curve above that interval.
Example: Commuting Time
If the area under the curve for commuting times less than 15 minutes is 0.29, then the probability that a randomly selected commuting time is less than 15 minutes is 0.29.
If the area for times greater than 45 minutes is 0.15, then the probability for that interval is 0.15.
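Two ideas from this section can be sketched in code. First, since the total area is 1, the two stated areas determine the probability of a commute between 15 and 45 minutes. Second, a probability is an area, which can be approximated numerically. The notes do not give the actual commuting-time curve, so `example_density` below is a purely hypothetical exponential density, chosen only because its exact areas are known and can be checked; it does not reproduce the 0.29 and 0.15 figures.

```python
from math import exp

# Part 1: use the areas stated in the example and the fact that total area is 1.
p_under_15 = 0.29
p_over_45 = 0.15
p_between_15_and_45 = 1 - p_under_15 - p_over_45
print(f"P(15 to 45 minutes) = {p_between_15_and_45:.2f}")


# Part 2: probability as area under a density curve.
def area_under(density, a, b, steps=10_000):
    """Midpoint-rule approximation of the area under density between a and b."""
    width = (b - a) / steps
    return sum(density(a + (i + 0.5) * width) for i in range(steps)) * width


def example_density(t, mean_minutes=30.0):
    """Hypothetical exponential density; NOT the commuting-time curve from the notes."""
    return exp(-t / mean_minutes) / mean_minutes if t >= 0 else 0.0


print(f"area for 0 to 15 min:  {area_under(example_density, 0, 15):.4f}")
print(f"area for 0 to 600 min: {area_under(example_density, 0, 600):.4f}")
```

The second printed area is close to 1 because nearly all of the hypothetical curve lies between 0 and 600 minutes, matching the requirement that the total area under a density curve equals 1.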