Random Variables, Probability Models, Sampling Distributions, and Hypothesis Testing

Study Guide - Smart Notes


Random Variables and Probability Models

Definition and Types of Random Variables

A random variable is a numerical value determined by the outcome of a random event. Random variables are fundamental in statistics for modeling uncertainty and variability.

  • Discrete Random Variable: Can take on a countable number of distinct values. Example: Number of heads in 10 coin tosses.

  • Continuous Random Variable: Can take any value within a given range. Example: Height of students in a class.

Notation: Random variables are typically denoted by capital letters such as X, Y, or Z.

Probability Model

A probability model describes all possible values of a random variable and their associated probabilities.

  • For discrete random variables, the model lists each possible value and its probability.

  • For continuous random variables, the model specifies a probability density function.
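As a minimal sketch of a discrete probability model, the snippet below builds the model for the number of heads in 10 coin tosses (the example used earlier). It assumes a fair coin; the function name heads_model is hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from math import comb

def heads_model(n_tosses):
    """Probability model for the number of heads in n_tosses fair coin tosses.

    Assumption: the coin is fair, so all 2**n_tosses head/tail sequences
    are equally likely and comb(n_tosses, k) of them contain k heads.
    """
    total = 2 ** n_tosses
    return {k: Fraction(comb(n_tosses, k), total) for k in range(n_tosses + 1)}

model = heads_model(10)

# A valid model lists every possible value and its probabilities sum to 1.
assert sum(model.values()) == 1
print(model[5])  # probability of exactly 5 heads
```

Exact fractions are used so the "probabilities sum to 1" check is not affected by floating-point rounding.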

Expected Value and Variance

The expected value (mean) and variance are key properties of random variables.

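  • Expected Value (Mean): $E(X) = \mu = \sum x \, P(X = x)$ (for a discrete random variable)

  • Variance: $\mathrm{Var}(X) = \sigma^2 = \sum (x - \mu)^2 \, P(X = x)$; the standard deviation is $\sigma = \sqrt{\mathrm{Var}(X)}$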
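The following sketch computes the expected value and variance of a discrete random variable as probability-weighted sums, E(X) = Σ x·P(X = x) and Var(X) = Σ (x − μ)²·P(X = x). The payout model is made up for illustration and is not taken from the notes.

```python
from fractions import Fraction
from math import sqrt

def expected_value(model):
    """Probability-weighted average of the values in a discrete model."""
    return sum(x * p for x, p in model.items())

def variance(model):
    """Probability-weighted average squared deviation from the mean."""
    mu = expected_value(model)
    return sum((x - mu) ** 2 * p for x, p in model.items())

# Hypothetical payout model (value -> probability), invented for this example.
payout = {0: Fraction(1, 2), 10: Fraction(3, 10), 50: Fraction(1, 5)}

mu = expected_value(payout)   # 0*(1/2) + 10*(3/10) + 50*(1/5) = 13
var = variance(payout)        # 169*(1/2) + 9*(3/10) + 1369*(1/5) = 361
sd = sqrt(var)                # 19.0
print(mu, var, sd)
```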

Bernoulli Trials

A Bernoulli trial is an experiment with only two possible outcomes: success or failure.

  • Probability of success: p

  • Probability of failure: q = 1 - p

  • Each trial is independent.

  • Examples: Tossing a coin, yes/no survey responses, basketball free throws.
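A short simulation can make the definition concrete. This is a sketch only: the success probability 0.7 (think of a free-throw shooter) and the seed are arbitrary assumptions, and bernoulli_trials is a hypothetical helper name.

```python
import random

def bernoulli_trials(p, n, seed=0):
    """Simulate n independent Bernoulli trials: 1 = success (prob. p), 0 = failure."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    return [1 if rng.random() < p else 0 for _ in range(n)]

# Assumed values for illustration only.
trials = bernoulli_trials(p=0.7, n=10_000)
success_rate = sum(trials) / len(trials)
print(success_rate)  # should land close to p when n is large
```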

Uniform Model

If a random variable X can take values 1, 2, ..., n, and each outcome is equally likely, X has a discrete uniform distribution U[1,...,n].
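As an illustration, the sketch below builds the discrete uniform model for a fair six-sided die (n = 6) and checks its mean and variance against the standard closed forms (n + 1)/2 and (n² − 1)/12. The helper name discrete_uniform is hypothetical.

```python
from fractions import Fraction

def discrete_uniform(n):
    """Model for U[1,...,n]: each of the n values has probability 1/n."""
    return {x: Fraction(1, n) for x in range(1, n + 1)}

die = discrete_uniform(6)
mean = sum(x * p for x, p in die.items())
var = sum((x - mean) ** 2 * p for x, p in die.items())

assert mean == Fraction(6 + 1, 2)        # (n + 1) / 2 = 7/2
assert var == Fraction(6 ** 2 - 1, 12)   # (n^2 - 1) / 12 = 35/12
```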

Standardization (Z-score)

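The Z-score measures how many standard deviations a value is from the mean:

$z = \frac{x - \mu}{\sigma}$

Example: To find a probability such as $P(X < x)$ for a normally distributed X, calculate the Z-score of x and look it up in the standard normal table.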
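The table lookup can also be done in software. The sketch below standardizes a value with z = (x − μ)/σ and uses Python's standard-library NormalDist in place of a printed normal table. The numbers (mean 100, standard deviation 15, value 130) are assumptions chosen for illustration.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

# Assumed example values, not from the notes.
z = z_score(130, mu=100, sigma=15)   # (130 - 100) / 15 = 2.0
p_below = NormalDist().cdf(z)        # P(Z < 2), about 0.9772
p_above = 1 - p_below                # P(Z > 2), about 0.0228
print(z, round(p_below, 4), round(p_above, 4))
```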

Sum of Random Variables

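When adding random variables, expected values always add; variances add when the variables are independent:

  • Expected Value: $E(X + Y) = E(X) + E(Y)$

Example: If $E(X) = \mu_X$ and $E(Y) = \mu_Y$, then $E(X + Y) = \mu_X + \mu_Y$.

  • Variance: $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$ (X and Y independent)

Example: If $\mathrm{Var}(X) = \mathrm{Var}(Y) = \sigma^2$ for both, then $\mathrm{Var}(X + Y) = 2\sigma^2$, so $SD(X + Y) = \sigma\sqrt{2}$, not $2\sigma$. Variances add; standard deviations do not.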
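The addition rules can be checked exactly on a small case. The sketch below uses two independent fair dice (an assumed example) and enumerates the distribution of their sum, then confirms that the mean and variance of the sum equal the sums of the individual means and variances.

```python
from fractions import Fraction
from itertools import product

def mean(model):
    return sum(x * p for x, p in model.items())

def var(model):
    mu = mean(model)
    return sum((x - mu) ** 2 * p for x, p in model.items())

die = {x: Fraction(1, 6) for x in range(1, 7)}

# Distribution of X + Y for independent X and Y:
# independence means each joint probability is the product px * py.
total = {}
for (x, px), (y, py) in product(die.items(), die.items()):
    total[x + y] = total.get(x + y, 0) + px * py

assert mean(total) == mean(die) + mean(die)   # 7/2 + 7/2 = 7
assert var(total) == var(die) + var(die)      # 35/12 + 35/12 = 35/6
```

Note that the standard deviation of the sum is the square root of 35/6, which is smaller than twice the standard deviation of one die.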

Distribution of Sample Proportions

Assumptions and Conditions

To use the sampling distribution of sample proportions, certain conditions must be met:

  • Independence Assumption: Sampled values must be independent.

  • Sample Size Assumption: Sample size n must be large enough.

  • Randomization Condition: Data should come from a randomized experiment or a simple random sample.

  • 10% Condition: If sampling without replacement, n should be no more than 10% of the population.

  • Success/Failure Condition: Both np and nq should be at least 10, that is, at least 10 expected successes and 10 expected failures.
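The two numeric conditions above can be checked mechanically. This is a rough sketch: the function names and the example numbers are hypothetical, and the independence and randomization conditions still have to be judged from how the data were collected. The second helper uses the standard result that the sample proportion has standard deviation sqrt(pq/n).

```python
from math import sqrt

def check_proportion_conditions(n, p, population_size=None):
    """Check the 10% condition and the success/failure condition.

    population_size=None is treated as 'unknown or effectively infinite',
    in which case the 10% condition is taken as satisfied (an assumption).
    """
    q = 1 - p
    return {
        "ten_percent": population_size is None or n <= 0.10 * population_size,
        "success_failure": n * p >= 10 and n * q >= 10,
    }

def sd_of_sample_proportion(n, p):
    """Standard deviation of the sampling distribution of p-hat: sqrt(p*q/n)."""
    return sqrt(p * (1 - p) / n)

# Assumed example values.
ok = check_proportion_conditions(n=200, p=0.10, population_size=10_000)
too_small = check_proportion_conditions(n=50, p=0.10, population_size=10_000)
print(ok, too_small, sd_of_sample_proportion(100, 0.5))
```

With n = 50 and p = 0.10 only 5 successes are expected, so the success/failure condition fails and the normal model should not be used.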

The Central Limit Theorem (CLT)

Statement and Implications

The Central Limit Theorem states that the sampling distribution of the sample mean approaches a normal distribution as the sample size increases, regardless of the population's shape, provided the observations are independent and the population has a finite variance.

  • For highly skewed distributions, larger sample sizes (dozens or hundreds) may be needed before the sampling distribution is approximately normal.
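A small simulation illustrates the theorem. The sketch below assumes a right-skewed Exponential(1) population (mean 1, standard deviation 1); the sample sizes, number of replications, and seed are arbitrary choices. As n grows, the sample means should centre on 1, their spread should approach 1/√n, and their skewness should shrink toward 0.

```python
import random
from statistics import mean, pstdev

def sample_means(n, reps=2000, seed=0):
    """Means of `reps` samples of size n from a skewed Exponential(1) population."""
    rng = random.Random(seed)
    return [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

def skewness(values):
    """Average cubed standardized deviation; 0 for a symmetric distribution."""
    m, s = mean(values), pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])

for n in (2, 10, 50):
    means = sample_means(n)
    # columns: n, centre of the sample means, their spread, their skewness
    print(n, round(mean(means), 3), round(pstdev(means), 3), round(skewness(means), 3))
```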

Application Example

  • If only 20 recent graduates' salaries are sampled, concerns include small sample size, unknown population size, and unknown standard deviation.

  • Confidence intervals should be based on t-models when the population standard deviation is unknown.

  • Increasing sample size (e.g., to 60) narrows the confidence interval, because the standard error of the mean shrinks in proportion to 1/√n.
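To see why a larger sample helps, the one-sample t-interval and the effect of tripling n can be written out as below. This is a sketch that assumes the sample standard deviation s stays about the same at both sample sizes; the symbols ȳ (sample mean), s, and t* (critical value with n − 1 degrees of freedom) are standard notation rather than taken from the notes.

```latex
\bar{y} \;\pm\; t^{*}_{n-1} \cdot \frac{s}{\sqrt{n}},
\qquad
\frac{SE_{n=60}}{SE_{n=20}}
  = \frac{s/\sqrt{60}}{s/\sqrt{20}}
  = \sqrt{\frac{20}{60}}
  \approx 0.58
```

So moving from 20 to 60 observations cuts the standard error to roughly 58% of its former size, and the critical value t* also decreases slightly as the degrees of freedom rise from 19 to 59.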

Hypothesis Testing

Null and Alternative Hypotheses

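Hypothesis testing begins with a null hypothesis ($H_0$), which assumes no effect or no change. The alternative hypothesis ($H_A$) represents the other possible values of the parameter.

  • Null Hypothesis: $H_0$: parameter = hypothesized value

  • Alternative Hypothesis: $H_A$: parameter ≠ hypothesized value (two-sided), or parameter > or < hypothesized value (one-sided)

  • Example (one proportion): $H_0: p = p_0$, $H_A: p \neq p_0$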

Standard Error and Z-score for Proportions

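  • Standard Error of a Proportion: $SD(\hat{p}) = \sqrt{\frac{p_0 q_0}{n}}$, where $q_0 = 1 - p_0$ (the hypothesized value is used because the test assumes $H_0$ is true)

  • Z-score for Proportion: $z = \frac{\hat{p} - p_0}{SD(\hat{p})}$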
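The calculation for a one-proportion z-test can be sketched as follows, using the hypothesized proportion p0 in the standard deviation, sqrt(p0(1 − p0)/n), and z = (p̂ − p0)/SD. The data (60 successes in 100 trials, p0 = 0.5) are invented for illustration, and proportion_z is a hypothetical helper name.

```python
from math import sqrt

def proportion_z(successes, n, p0):
    """Return (p_hat, sd, z) for a one-proportion z-test of H0: p = p0."""
    p_hat = successes / n
    sd = sqrt(p0 * (1 - p0) / n)   # spread of p-hat if H0 is true
    z = (p_hat - p0) / sd          # how many SDs p-hat lies from p0
    return p_hat, sd, z

# Made-up data: 60 successes out of 100, testing H0: p = 0.5.
p_hat, sd, z = proportion_z(successes=60, n=100, p0=0.5)
print(p_hat, sd, round(z, 2))  # z comes out at about 2
```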

P-value

The p-value is the probability, under the null hypothesis, of obtaining a result at least as extreme as the observed result. It is not the probability that the null hypothesis is true.

  • P-value is calculated using the sampling distribution (often normal or t-distribution).

  • Excel function: T.DIST.RT can be used for right-tailed t-distribution p-values.
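For a test statistic that follows the standard normal model, the p-value can be computed as sketched below. Note the assumption: this uses the normal distribution from Python's standard library, not the t-distribution behind Excel's T.DIST.RT, so it applies to z statistics (such as a proportion test) rather than t statistics. The function name and the tail labels are hypothetical.

```python
from statistics import NormalDist

def p_value(z, tail="two-sided"):
    """P-value for a z statistic under the standard normal model."""
    std_normal = NormalDist()
    if tail == "right":        # H_A: parameter > hypothesized value
        return 1 - std_normal.cdf(z)
    if tail == "left":         # H_A: parameter < hypothesized value
        return std_normal.cdf(z)
    if tail == "two-sided":    # H_A: parameter != hypothesized value
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'right', 'left', or 'two-sided'")

print(round(p_value(2.0, "right"), 4))      # about 0.0228
print(round(p_value(2.0, "two-sided"), 4))  # about 0.0455
```

A small p-value says the observed result would be unusual if the null hypothesis were true; it does not give the probability that the null hypothesis is true.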

Summary Table: Key Concepts

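| Concept | Definition | Formula | Example/Application |
|---|---|---|---|
| Random Variable | Numerical outcome of a random event | N/A | Policy payout, coin toss |
| Expected Value | Mean of random variable | $E(X) = \sum x \, P(X = x)$ | Average payout |
| Variance | Spread of random variable | $\mathrm{Var}(X) = \sum (x - \mu)^2 \, P(X = x)$ | Risk assessment |
| Bernoulli Trial | Experiment with two outcomes | N/A | Coin toss, yes/no survey |
| Uniform Distribution | All outcomes equally likely | $P(X = x) = \frac{1}{n}$ | Dice roll |
| Z-score | Standardized value | $z = \frac{x - \mu}{\sigma}$ | Normal distribution analysis |
| Central Limit Theorem | Sampling distribution approaches normality | N/A | Mean salary estimation |
| Hypothesis Testing | Test claim about population | $z = \frac{\hat{p} - p_0}{SD(\hat{p})}$ | Proportion test |
| P-value | Probability, under $H_0$, of a result at least as extreme as the observed one | N/A | Statistical significance |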
