Statistics Key Concepts and Definitions

  • Addition Rule

    The rule for finding the probability that at least one of two events occurs: \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\). For mutually exclusive events this reduces to \(P(A \cup B) = P(A) + P(B)\).

  • Bias

    A systematic error that causes a sample statistic to differ from the population parameter in a consistent way.

  • Binomial Setting

    A scenario with a fixed number of independent trials, each with two possible outcomes and a constant probability of success.

  • Binomial Distribution

    The probability distribution of the number of successes in a fixed number of independent Bernoulli trials.

  • Binomial Random Variable

    A random variable that counts the number of successes in a binomial setting.

  • Blinding

    A technique in experiments where subjects, researchers, or both do not know which treatment was assigned; it is used to reduce bias.

  • Block Design

    An experimental design that groups subjects with similar characteristics into blocks to reduce variability.

  • Central Limit Theorem

    States that the sampling distribution of the sample mean approaches a normal distribution as the sample size increases, regardless of the shape of the population distribution (provided the population has a finite variance).

  • Chi-Square Distribution

    A distribution used for tests of independence and goodness-of-fit, based on the sum of squared independent standard normal variables.

  • Chi-Square Statistic

    A measure of how observed counts differ from expected counts, calculated as \(\sum \frac{(O - E)^2}{E}\).

  • Chi-Square GOF Test

    A goodness-of-fit test used to determine whether observed categorical data match an expected distribution.

  • Cluster Sample

    A sampling method where the population is divided into clusters, some clusters are randomly selected, and all members in chosen clusters are sampled.

  • Coefficient of Determination

    Denoted \(R^2\), it measures the proportion of variance in the response variable explained by the explanatory variable.

  • Conditional Probability

    The probability of event A given that event B has occurred, \(P(A|B) = \frac{P(A \cap B)}{P(B)}\), defined when \(P(B) > 0\).

  • Confidence Interval for a Population Mean

    An interval estimate for the population mean, built by a method that captures the true mean in a specified percentage of repeated samples (the confidence level).

  • Confidence Interval for a Population Proportion

    An interval estimate for the population proportion, built by a method that captures the true proportion in a specified percentage of repeated samples (the confidence level).

  • Correlation

    A measure of the strength and direction of a linear relationship between two quantitative variables, ranging from -1 to 1.

  • Discrete Random Variable

    A random variable that takes on countable values, often integers.

  • Empirical Rule

    In a normal distribution, about 68% of data falls within 1 standard deviation of the mean, 95% within 2, and 99.7% within 3.

  • Law of Large Numbers

    As the number of trials increases, the sample mean approaches the population mean.

  • Margin of Error

    The maximum expected difference between the sample statistic and the population parameter at the stated confidence level; it sets how far a confidence interval extends on either side of the estimate.

  • Normal Distribution

    A symmetric, bell-shaped distribution defined by its mean and standard deviation.

  • P-Value

    The probability of obtaining a test statistic at least as extreme as the one observed, assuming the null hypothesis is true.

  • Random Variable

    A variable whose value depends on the outcome of a random phenomenon.

  • Significance Level

    The threshold probability for rejecting the null hypothesis, commonly denoted \(\alpha\): the null hypothesis is rejected when the p-value falls below it.

  • Standard Deviation

    A measure of the spread or variability of a set of data around the mean.

  • Type I Error

    Rejecting the null hypothesis when it is actually true.

  • Type II Error

    Failing to reject the null hypothesis when it is actually false.

  • z-score

    The number of standard deviations a data point is from the mean, calculated as \(z = \frac{x - \mu}{\sigma}\).
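
Several of the definitions above reduce to one-line formulas. The following is a minimal Python sketch of the Addition Rule, Conditional Probability, the Binomial Distribution, the Chi-Square Statistic, and the z-score. The function names and the sample numbers are illustrative assumptions, not part of the original cards.

```python
import math


def addition_rule(p_a, p_b, p_a_and_b=0.0):
    """P(A or B). Leave p_a_and_b at 0 for mutually exclusive events."""
    return p_a + p_b - p_a_and_b


def conditional_probability(p_a_and_b, p_b):
    """P(A | B) = P(A and B) / P(B); only defined when P(B) > 0."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_a_and_b / p_b


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each with success probability p (the binomial setting)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def z_score(x, mu, sigma):
    """Number of standard deviations x lies from the mean mu."""
    return (x - mu) / sigma


if __name__ == "__main__":
    # Made-up inputs, chosen only to exercise each formula.
    print(addition_rule(0.5, 0.25))                # mutually exclusive events
    print(conditional_probability(0.125, 0.5))     # P(A and B) = 0.125, P(B) = 0.5
    print(binomial_pmf(2, 4, 0.5))                 # 2 heads in 4 fair coin flips
    print(chi_square_statistic([18, 22], [20, 20]))
    print(z_score(85, 70, 10))                     # 85 is 1.5 SDs above a mean of 70
```

Each function maps directly onto the formula in its card, so the definitions can be checked by hand against small inputs such as the ones in the usage section.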