Study Notes: Random Variables, Sampling Distributions, Confidence Intervals, and Hypothesis Testing

Chapter 4: Random Variables and Probability Distributions

Random Variables

A random variable is a numeric value associated with the outcome of a probability experiment. Random variables are classified as either discrete or continuous based on the type of values they can assume.

  • Discrete Random Variable: Takes on countable values, often whole numbers, typically associated with counting (e.g., number of heads in coin tosses).

  • Continuous Random Variable: Takes on any value within an interval, typically associated with measurement (e.g., weight, time, volume).

Common Notation: Random variables are often denoted by capital letters such as X, Y, or Z.

Key Properties of Discrete Random Variables

  • Mean (Expected Value): The mean of a discrete random variable X is given by: μ = E(X) = \sum_x x \cdot P(X = x)

  • Variance: The variance of a discrete random variable X is: σ^2 = \sum_x (x - μ)^2 \cdot P(X = x)

  • Probability Distribution: Represented by a table or chart listing all possible values and their probabilities.
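
The mean and variance definitions above can be checked on a small probability table. The following is a minimal sketch; the distribution used (number of heads in two fair coin tosses) and the function names are illustrative assumptions, not part of the original notes.

```python
# Hypothetical distribution table: X = number of heads in two fair coin tosses.
# Keys are the possible values x, values are P(X = x).
distribution = {0: 0.25, 1: 0.50, 2: 0.25}


def discrete_mean(dist):
    """Expected value: sum of x * P(X = x) over all values x."""
    return sum(x * p for x, p in dist.items())


def discrete_variance(dist):
    """Variance: sum of (x - mu)^2 * P(X = x) over all values x."""
    mu = discrete_mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


mu = discrete_mean(distribution)
var = discrete_variance(distribution)
print(mu, var)  # 1.0 0.5
```

A valid table must have probabilities that sum to 1; checking that before computing anything is a cheap safeguard.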

Key Properties of Continuous Random Variables

  • Values are measured, not counted, and can take any value within an interval.

  • Probability distribution is represented by a probability density function (PDF), a smooth curve where the total area under the curve equals 1.

  • The probability that the variable falls within a certain interval is the area under the curve over that interval.

Key Discrete Random Variables

  • Binomial Random Variable:

    • Fixed number of independent trials (n).

    • Each trial has two possible outcomes: success or failure.

    • Probability of success (p) is constant for each trial.

    • Counts the number of successes in n trials.

    • Probability mass function: P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}, for x = 0, 1, ..., n

    • Calculator commands: BINOMPDF (exactly X successes), BINOMCDF (X or fewer successes).

  • Poisson Random Variable:

    • Counts the number of events (successes) in a fixed interval of time or space.

    • Events occur independently and at a constant average rate (λ, lambda).

    • Probability mass function: P(X = x) = e^{-λ} λ^x / x!, for x = 0, 1, 2, ...

    • Calculator commands: POISSONPDF (exactly X successes), POISSONCDF (X or fewer successes).

  • Hypergeometric Random Variable:

    • Used when sampling without replacement from a finite population with known composition.

    • Example: Drawing marbles of different colors from a jar without replacement.
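
As a rough software counterpart to the calculator commands, the three discrete distributions above can be sketched with the Python standard library. The function names and example numbers are assumptions made for illustration, and the hypergeometric formula is the standard textbook one, with N the population size, K the number of successes in the population, and n the sample size.

```python
from math import comb, exp, factorial


def binom_pmf(x, n, p):
    """P(exactly x successes in n independent trials with success probability p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(x, n, p):
    """P(x or fewer successes): sum of the pmf from 0 through x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))


def poisson_pmf(x, lam):
    """P(exactly x events) when events occur at average rate lam per interval."""
    return exp(-lam) * lam**x / factorial(x)


def poisson_cdf(x, lam):
    """P(x or fewer events): sum of the pmf from 0 through x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))


def hypergeom_pmf(x, N, K, n):
    """P(exactly x successes) drawing n items without replacement
    from a population of N items that contains K successes."""
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)


# Exactly 6 heads in 10 fair coin tosses.
print(round(binom_pmf(6, 10, 0.5), 4))  # 0.2051

# Hypothetical jar: 10 marbles, 4 red; draw 3 without replacement, exactly 2 red.
p_two_red = hypergeom_pmf(2, 10, 4, 3)
```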

Normal Random Variable

  • The normal distribution is the most important continuous distribution. Its bell-shaped probability density function is centered at the mean (μ), and its spread is determined by the standard deviation (σ).

  • Probabilities are computed using the NormalCDF command; percentiles are found using INVNORM.

  • The standard normal random variable (Z) has mean 0 and standard deviation 1. A z-score, z = (x - μ) / σ, gives the number of standard deviations a value lies from the mean.

  • Useful for comparing values from different normal distributions (e.g., SAT vs. ACT scores).
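
The normal-distribution calculations above can be approximated with Python's `statistics.NormalDist`. This is a sketch only: the SAT and ACT means, standard deviations, and scores below are made-up numbers chosen for illustration, not values from the notes.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0, sigma=1)

# Area under the curve between -1 and 1 (probability of landing
# within one standard deviation of the mean), about 0.6827.
p_within_one = Z.cdf(1) - Z.cdf(-1)

# 97.5th percentile of Z, about 1.96.
z_975 = Z.inv_cdf(0.975)


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


# Hypothetical parameters, for illustration only.
z_sat = z_score(1250, mu=1050, sigma=200)  # 1.0
z_act = z_score(28, mu=21, sigma=5)        # 1.4

# The larger z-score is the better result relative to its own distribution.
better = "ACT" if z_act > z_sat else "SAT"
print(round(p_within_one, 4), round(z_975, 2), better)
```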

Example

  • Suppose X is the number of heads in 10 coin tosses (binomial, n=10, p=0.5). The probability of exactly 6 heads is: P(X = 6) = \binom{10}{6} (0.5)^6 (0.5)^4 = 210/1024 ≈ 0.2051

Chapter 5: Sampling Distributions

Sampling Distributions

A sampling distribution is the probability distribution of a statistic (such as the sample mean or sample proportion) computed from a random sample. It describes how the statistic varies from sample to sample.

  • Sample averages (\bar{x}) and sample proportions (\hat{p}) are random variables, because their values change from sample to sample; for large samples they are treated as continuous.

  • Their probability distributions are called sampling distributions.

Key Properties

  • For the sample mean:

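    • Expected value: E(\bar{x}) = μ

    • Standard deviation (standard error): σ_{\bar{x}} = σ / \sqrt{n} (approximated by s / \sqrt{n} if population σ is unknown)

  • For the sample proportion:

    • Expected value: E(\hat{p}) = p

    • Standard deviation: σ_{\hat{p}} = \sqrt{p(1 - p) / n}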

  • Central Limit Theorem (CLT): For large sample sizes (n ≥ 30), the sampling distribution of the sample mean is approximately normal, regardless of the population's distribution.

  • If the population is normal, the sampling distribution of the sample mean is normal for any sample size.

  • Larger samples yield tighter (less variable) sampling distributions; standard error decreases as n increases.

  • Unbiased Estimator: An estimator whose expected value equals the population parameter (e.g., sample mean for population mean).

  • Minimum Variance: Among unbiased estimators, the one with the smallest variance is preferred.

Example

  • If the population mean is 100 and standard deviation is 15, for samples of size 36: E(\bar{x}) = 100 and σ_{\bar{x}} = 15 / \sqrt{36} = 2.5
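
The example above can be carried one step further by using the sampling distribution to find a probability. The sketch below assumes the Central Limit Theorem applies (n = 36 ≥ 30); the threshold of 105 is an arbitrary value chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu = 100     # population mean
sigma = 15   # population standard deviation
n = 36       # sample size

# Standard error of the sample mean: sigma / sqrt(n).
standard_error = sigma / sqrt(n)

# By the CLT, the sample mean is approximately normal
# with mean mu and standard deviation equal to the standard error.
sampling_dist = NormalDist(mu=mu, sigma=standard_error)

# Probability that a sample mean exceeds 105 (hypothetical threshold).
p_above_105 = 1 - sampling_dist.cdf(105)

print(standard_error, round(p_above_105, 4))  # 2.5 0.0228
```

Because 105 is two standard errors above the mean, the result matches the upper-tail area beyond z = 2.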

Chapter 6: Confidence Intervals for μ and p

Confidence Intervals

A confidence interval (CI) is a range of values, derived from sample statistics, that is likely to contain the population parameter with a specified level of confidence (e.g., 95%).

  • Confidence Level (CL): The long-run proportion of intervals, over repeated samples, that contain the parameter (e.g., 0.95, 0.90, 0.99).

  • Margin of Error: The half-width of the confidence interval; reflects sampling variability.

Large Sample Confidence Interval for μ

  • Relies on the Central Limit Theorem; use when n is large (n ≥ 30).

  • Formula: \bar{x} ± z_{α/2} \cdot σ / \sqrt{n} (use s in place of σ when σ is unknown)

  • zα/2 is found using the invnorm calculator command, with α = 1 – CL.

Small Sample Confidence Interval for μ

  • Use when n is small and the population is approximately normal.

  • Formula: \bar{x} ± t_{α/2, df} \cdot s / \sqrt{n}

  • tα/2, df is found using the invt calculator command, with degrees of freedom (df) = n – 1.

Confidence Interval for p (Proportion)

  • Large sample: At least 15 successes and 15 failures in the sample.

  • Formula: \hat{p} ± z_{α/2} \sqrt{\hat{p}(1 - \hat{p}) / n}

  • Calculator command: 1prop-ZINT.

  • For small samples, special methods are required (no calculator command).

Sample Size Determination

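  • To achieve a desired margin of error (ME), solve for n in the margin of error formula: n = (z_{α/2} σ / ME)^2 for a mean, or n = z_{α/2}^2 p(1 - p) / ME^2 for a proportion. Always round n up.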

  • If p is unknown, use p = 0.5 for maximum variability.
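
Solving the margin of error formula for n can be sketched as follows. The function names and the example inputs (a 3-point margin for a proportion, and σ = 15 with a margin of 2 for a mean) are assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def z_critical(confidence_level):
    """Critical value z_{alpha/2}, where alpha = 1 - confidence level."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)


def sample_size_for_mean(sigma, margin, confidence_level=0.95):
    """Smallest n with z * sigma / sqrt(n) <= margin."""
    z = z_critical(confidence_level)
    return ceil((z * sigma / margin) ** 2)


def sample_size_for_proportion(margin, confidence_level=0.95, p=0.5):
    """Smallest n with z * sqrt(p(1-p)/n) <= margin.
    p = 0.5 is the conservative choice when p is unknown."""
    z = z_critical(confidence_level)
    return ceil(z**2 * p * (1 - p) / margin**2)


n_mean = sample_size_for_mean(sigma=15, margin=2)   # 217
n_prop = sample_size_for_proportion(margin=0.03)    # 1068
print(n_mean, n_prop)
```

Rounding up rather than to the nearest integer guarantees the margin of error is no larger than requested.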

Alpha (α)

  • α = 1 – confidence level; determines the critical z or t values for the CI.

  • The area in each tail is α/2.

Example

  • For a sample mean of 50, s = 10, n = 100, and 95% confidence: 50 ± 1.96 \cdot 10 / \sqrt{100} = 50 ± 1.96, so the CI is (48.04, 51.96)
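
The large-sample intervals from this chapter can be reproduced in a few lines. This is a sketch under the large-sample assumptions stated above; the function names are illustrative, and the proportion example (60 successes in 100 trials) uses made-up data.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(x_bar, s, n, confidence_level=0.95):
    """Large-sample CI for a mean: x_bar +/- z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * s / sqrt(n)
    return x_bar - margin, x_bar + margin


def proportion_ci(successes, n, confidence_level=0.95):
    """Large-sample CI for a proportion: p_hat +/- z * sqrt(p_hat(1-p_hat)/n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# The example from the notes: sample mean 50, s = 10, n = 100, 95% confidence.
low, high = mean_ci(50, 10, 100)
print(round(low, 2), round(high, 2))  # 48.04 51.96

# Hypothetical data: 60 successes and 40 failures (both at least 15).
p_low, p_high = proportion_ci(60, 100)
```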

Chapter 7: Hypothesis Testing for μ and p

Hypothesis Testing

A hypothesis test is a statistical procedure to test claims about population parameters using sample data.

Key Components

  • Null Hypothesis (H0): The status quo or default claim; always contains an equality (e.g., μ = μ0).

  • Alternative Hypothesis (Ha): The claim we seek evidence for; uses >, <, or ≠.

  • Type I Error (α): Rejecting H0 when it is true.

  • Type II Error (β): Failing to reject H0 when it is false.

  • Significance Level (α): Probability of a Type I error; chosen by the researcher (e.g., 0.05).

  • Test Statistic: The sample statistic converted to a z or t value.

  • Critical Value: The z or t value that marks the boundary of the rejection region.

  • Rejection Region: The set of values for which H0 is rejected.

  • P-value: The probability, assuming H0 is true, of observing a result at least as extreme as the sample result.

Types of Tests

  • One-tailed Test (Left): Ha: parameter < value

  • One-tailed Test (Right): Ha: parameter > value

  • Two-tailed Test: Ha: parameter ≠ value

Decision Rules

  • If p-value < α, reject H0; sufficient evidence for Ha.

  • If p-value ≥ α, fail to reject H0; insufficient evidence for Ha.

  • If test statistic falls in the rejection region, reject H0.

Reporting Decisions

  • "There is sufficient evidence at α = xx to reject the null hypothesis and accept the alternative hypothesis."

  • "There is insufficient evidence at α = xx to reject the null hypothesis. Therefore, the null hypothesis is plausible."

Test Types and Calculator Commands

  • Large Sample Test for μ: Use normal distribution (ZTEST).

  • Small Sample Test for μ: Use t distribution (TTEST), if population is normal.

  • Large Sample Test for p: Use normal distribution (1propZtest), requires at least 15 successes and 15 failures (n*p0 ≥ 15, n*(1–p0) ≥ 15).
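
To illustrate the p-value decision rule, here is a minimal sketch of a large-sample test for a proportion. The function name, the `alternative` labels, and the data (60 successes in 100 trials, testing p0 = 0.5) are assumptions for illustration and are not taken from the notes.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alternative="greater"):
    """Return (z, p_value) for H0: p = p0.
    alternative is 'greater', 'less', or 'two-sided' and selects Ha."""
    p_hat = successes / n
    # The standard error is computed under H0, so it uses p0 rather than p_hat.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    upper_tail = 1 - NormalDist().cdf(z)
    if alternative == "greater":
        p_value = upper_tail
    elif alternative == "less":
        p_value = 1 - upper_tail
    else:
        p_value = 2 * min(upper_tail, 1 - upper_tail)
    return z, p_value


alpha = 0.05
# Large-sample check: n * p0 = 50 and n * (1 - p0) = 50, both at least 15.
z, p_value = one_proportion_z_test(60, 100, p0=0.5, alternative="greater")
reject_h0 = p_value < alpha
print(round(z, 2), round(p_value, 4), reject_h0)  # 2.0 0.0228 True
```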

Example

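  • Suppose H0: μ = 100, Ha: μ > 100, sample mean = 105, s = 10, n = 25, α = 0.05. Test statistic: t = (105 - 100) / (10 / \sqrt{25}) = 2.5. Compare t to the critical value from the t-table with df = 24 (t_{0.05, 24} ≈ 1.711). Since 2.5 > 1.711, reject H0.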
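
The rejection-region approach in this example can be sketched as below. The Python standard library has no t distribution, so the critical value is entered by hand from a t-table (1.711 for a right-tailed test with α = 0.05 and df = 24); that hard-coded value and the variable names are assumptions of this sketch.

```python
from math import sqrt

mu_0 = 100    # value of the mean under H0
x_bar = 105   # sample mean
s = 10        # sample standard deviation
n = 25        # sample size

# Test statistic: distance from mu_0 measured in standard errors.
t_stat = (x_bar - mu_0) / (s / sqrt(n))

# Critical value for a right-tailed test, alpha = 0.05, df = n - 1 = 24,
# looked up in a t-table rather than computed.
t_critical = 1.711

# Right-tailed test: reject H0 when t falls above the critical value.
reject_h0 = t_stat > t_critical
print(t_stat, reject_h0)  # 2.5 True
```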

Summary Table: Key Random Variables

Random Variable | Context | Parameters | Mean | Variance | Calculator Command
Binomial | Fixed number of trials, success/failure | n, p | np | np(1 - p) | BINOMPDF, BINOMCDF
Poisson | Events in interval (time/space) | λ | λ | λ | POISSONPDF, POISSONCDF
Hypergeometric | Sampling without replacement | N, K, n | nK/N | n(K/N)(1 - K/N)(N - n)/(N - 1) | --
Normal | Measurement, continuous | μ, σ | μ | σ^2 | NormalCDF, INVNORM

Additional info: Table entries for mean and variance of the hypergeometric distribution are standard formulas inferred for completeness.
