Statistics Test 1 Review: Descriptive Statistics, Probability, and Distributions

Descriptive Statistics

Variables and Individuals

Descriptive statistics involve summarizing and organizing data to understand its main features. The two fundamental concepts are variables and individuals.

  • Individual: The entity being measured or observed (e.g., a batch of pizza dough, a city, a person).

  • Variable: A characteristic or property measured on individuals. Variables can be:

    • Qualitative (Categorical): Values are category labels rather than measured quantities (e.g., recipe type). Categories may be unordered (nominal) or ordered (ordinal).

    • Quantitative (Numeric): Numeric values, which can be:

      • Discrete: Countable values, often integers (e.g., number of color-blind men in a sample).

      • Continuous: Any value within a range (e.g., calcium concentration in water).

Types of Graphs for Data Visualization

Graphs are essential for visualizing data distributions and relationships. The choice of graph depends on the variable type.

  • Frequency Distribution: Shows counts for each category or value.

  • Relative Frequency Distribution: Shows proportions for each category or value.

  • Pie Chart: Visualizes proportions of categorical data.

  • Bar Graph: Compares frequencies or proportions for categorical data.

  • Histogram: Displays frequency of numeric data grouped into intervals.

  • Dotplot: Shows individual data points for small datasets.

  • Stemplot: Splits data into stems and leaves for quick visualization.

  • Boxplot: Summarizes distribution using quartiles and highlights outliers.

Example: Distribution of Recipes (Bar Graph)

Bar graph showing frequency of recipes A, B, C, D

Example: Distribution of Activation Times (Histogram)

Histogram of activation times

Example: Stemplot of Activation Times

Stemplot of activation times

Example: Boxplot of Yeast Activation Times

Boxplot of yeast activation times

Describing Distributions

To describe a distribution, consider:

  • Center: Mean, median, mode

  • Spread: Range, interquartile range (IQR), standard deviation (SD)

  • Shape: Symmetry, skewness, modality (number of peaks), outliers
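The center and spread measures listed above can be computed with Python's standard `statistics` module. This is a minimal sketch: the `times` list is invented for illustration and is not data from these notes, and the quartile convention (`method="inclusive"`) is one of several in use, so a textbook or calculator may report slightly different quartiles.

```python
import statistics

# Hypothetical activation times in minutes (made-up values for illustration only).
times = [4.1, 4.8, 5.0, 5.2, 5.5, 5.9, 6.3, 7.0, 9.4]

# Center
mean = statistics.mean(times)
median = statistics.median(times)

# Spread
data_range = max(times) - min(times)
sd = statistics.stdev(times)  # sample standard deviation (n - 1 divisor)

# Quartile conventions differ between textbooks and software;
# method="inclusive" is one common choice, assumed here.
q1, q2, q3 = statistics.quantiles(times, n=4, method="inclusive")
iqr = q3 - q1

print(f"mean={mean:.2f} median={median:.2f}")
print(f"range={data_range:.2f} IQR={iqr:.2f} SD={sd:.2f}")
```

In this made-up sample the mean sits above the median because the single large value (9.4) pulls it upward, which is the usual signature of right skew.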

Example: Water Hardness (Calcium Concentration)

Histogram of calcium concentrations

  • Individuals: Cities

  • Variable: Calcium concentration (continuous, numeric)

  • Distribution: Skewed right, possible outliers, unimodal

Probability Foundations

Basic Definitions

  • Sample Space (S): All possible outcomes of a random experiment.

  • Event: Any subset of the sample space.

  • Probability Model: Assigns probabilities to outcomes/events.

Probability Definitions

  • Symmetry Definition: If all outcomes are equally likely, $P(A) = \dfrac{\text{number of outcomes in } A}{\text{number of outcomes in } S}$.

  • Frequentist Definition: Probability is the long-run proportion of times an event occurs in repeated trials.
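The two definitions can be compared with a short simulation. The sketch below assumes a fair six-sided die (an illustrative choice, not an example from these notes): the symmetry definition gives 1/6 for rolling a six, and the simulated long-run proportion should drift toward that value as the number of rolls grows. The helper name `proportion_of_sixes` is hypothetical.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

def proportion_of_sixes(n_rolls):
    """Roll a fair six-sided die n_rolls times; return the fraction of sixes."""
    sixes = sum(1 for _ in range(n_rolls) if random.randint(1, 6) == 6)
    return sixes / n_rolls

# Symmetry definition: one favorable outcome out of six equally likely ones.
symmetry_answer = 1 / 6

# Frequentist definition: the proportion settles down as the trials accumulate.
for n in (100, 10_000, 100_000):
    print(n, round(proportion_of_sixes(n), 4), "vs", round(symmetry_answer, 4))
```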

Probability Rules

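  • Addition Rule (Special): For mutually exclusive events A and B, $P(A \text{ or } B) = P(A) + P(B)$.

  • Addition Rule (General): For any events A and B, $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$.

  • Complement Rule: $P(A^c) = 1 - P(A)$, where $A^c$ is the event that A does not occur.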
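The three rules can be checked on a small sample space. This sketch assumes one roll of a fair die with equally likely outcomes; the events `A`, `B`, and `C` are illustrative choices, and exact fractions are used so the equalities hold without rounding.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space: one roll of a fair die
A = {2, 4, 6}            # event: roll is even
B = {4, 5, 6}            # event: roll is at least 4
C = {1}                  # event: roll is 1 (shares no outcome with A)

def prob(event):
    """Equally likely outcomes: favorable outcomes over total outcomes."""
    return Fraction(len(event & S), len(S))

# General addition rule: subtract the overlap so it is not counted twice.
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)

# Special addition rule: A and C are mutually exclusive, so the overlap is empty.
assert prob(A | C) == prob(A) + prob(C)

# Complement rule: "not A" is everything in S outside A.
assert prob(S - A) == 1 - prob(A)

print(prob(A | B))  # 2/3
```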

Venn Diagrams

Venn diagrams visually represent relationships between events in a sample space, aiding in understanding probability rules.

Independence and Conditional Probability

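  • Independent Events: A and B are independent when $P(A \text{ and } B) = P(A)\,P(B)$; equivalently, $P(A \mid B) = P(A)$ when $P(B) > 0$.

  • Conditional Probability: $P(A \mid B) = \dfrac{P(A \text{ and } B)}{P(B)}$, defined when $P(B) > 0$.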

Contingency Tables

Contingency tables display frequencies for combinations of two categorical variables. Marginal totals are row/column sums; the grand total is the sum of all cells.

Age Group   Some H.S.   High School   Some Uni.   University   Total
25-34              27            82          43           48     200
35-44              50            19          56           75     200
45-54              52            88          26           34     200
55-64              71            83          20           26     200
65+               101            59          20           20     200
Total             301           331         165          203    1000
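The counts in this table can be used to work through marginal, joint, and conditional probabilities. The sketch below copies the cell counts from the table; the choice of events (age 25-34 and University education) is only an illustration, and the variable names are hypothetical.

```python
# Cell counts copied from the contingency table in this section:
# rows are age groups, columns are education levels.
levels = ["Some H.S.", "High School", "Some Uni.", "University"]
table = {
    "25-34": [27, 82, 43, 48],
    "35-44": [50, 19, 56, 75],
    "45-54": [52, 88, 26, 34],
    "55-64": [71, 83, 20, 26],
    "65+":   [101, 59, 20, 20],
}

# Marginal totals and grand total
row_totals = {age: sum(counts) for age, counts in table.items()}
col_totals = [sum(col) for col in zip(*table.values())]
grand_total = sum(row_totals.values())

uni = levels.index("University")
p_uni = col_totals[uni] / grand_total                          # marginal: 203/1000
p_young = row_totals["25-34"] / grand_total                    # marginal: 200/1000
p_joint = table["25-34"][uni] / grand_total                    # joint: 48/1000
p_uni_given_young = table["25-34"][uni] / row_totals["25-34"]  # conditional: 48/200

print(p_uni, p_uni_given_young)
# Independence would require P(A and B) = P(A) * P(B).
print(p_joint, p_young * p_uni)
```

Here P(University | 25-34) = 0.24 while P(University) = 0.203, so in these counts education level and age group do not behave as independent.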

Counting Rules

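  • Permutations: Number of ways to arrange n objects in order: $n! = n \times (n-1) \times \cdots \times 1$.

  • Combinations: Number of ways to choose r objects from n when order does not matter: $\binom{n}{r} = \dfrac{n!}{r!\,(n-r)!}$.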
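Both counts are available in Python's `math` module (`math.comb` needs Python 3.8 or later). The sizes `n = 5` and `r = 2` below are arbitrary illustrative values.

```python
import math

n, r = 5, 2  # hypothetical sizes chosen only for illustration

arrangements = math.factorial(n)  # n! orderings of all n objects
choices = math.comb(n, r)         # unordered selections of r objects from n

# Cross-check math.comb against the factorial formula n! / (r! (n - r)!).
assert choices == math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

print(arrangements, choices)  # 120 10
```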

Discrete Random Variables

Discrete Distributions

Discrete random variables take on countable values. Their distributions specify the probability for each possible value.

  • Mean (Expected Value): $\mu = \sum_x x \, P(X = x)$

  • Standard Deviation: $\sigma = \sqrt{\sum_x (x - \mu)^2 \, P(X = x)}$
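The mean and standard deviation of a discrete random variable are probability-weighted sums over its possible values. A minimal sketch, using an invented distribution (the `pmf` values are not from these notes):

```python
import math

# Hypothetical probability distribution of a discrete random variable X:
# keys are the possible values, entries are their probabilities.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

assert math.isclose(sum(pmf.values()), 1.0)  # probabilities must sum to 1

# Mean: each value weighted by its probability.
mu = sum(x * p for x, p in pmf.items())

# Variance: squared distance from the mean, weighted by probability.
variance = sum((x - mu) ** 2 * p for x, p in pmf.items())
sigma = math.sqrt(variance)

print(f"mean={mu:.2f} sd={sigma:.2f}")  # mean=1.10 sd=0.70
```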

Binomial Distribution

  • Models the number of successes in n independent trials, each with probability p of success.

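  • Probability: $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$ for $k = 0, 1, \ldots, n$

  • Mean: $\mu = np$

  • Standard Deviation: $\sigma = \sqrt{np(1-p)}$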
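A short sketch of the binomial calculations, using only the standard library. The values `n = 10` and `p = 0.08` are assumptions made up for illustration (for example, 10 men who are each color blind with probability 0.08); they are not figures from these notes, and `binom_pmf` is a hypothetical helper name.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for a binomial count with n trials and success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.08  # hypothetical values for illustration

mean = n * p
sd = math.sqrt(n * p * (1 - p))

p_exactly_one = binom_pmf(1, n, p)
p_at_least_one = 1 - binom_pmf(0, n, p)  # complement of "no successes"

print(f"mean={mean:.2f} sd={sd:.3f}")
print(f"P(X=1)={p_exactly_one:.4f} P(X>=1)={p_at_least_one:.4f}")
```

Using the complement for "at least one" avoids adding up the probabilities for k = 1 through n.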

Poisson Distribution

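  • Models the number of events in a fixed interval, given a known average rate $\lambda$ per interval.

  • Probability: $P(X = k) = \dfrac{e^{-\lambda} \lambda^k}{k!}$ for $k = 0, 1, 2, \ldots$

  • Mean and SD: $\mu = \lambda$, $\sigma = \sqrt{\lambda}$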
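The Poisson probabilities can be computed directly from the formula. In this sketch `lam` stands for the average number of events per interval; the value 3.0 is an assumption for illustration, and `poisson_pmf` is a hypothetical helper name.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson count with average rate lam per interval."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 3.0  # hypothetical average number of events per interval

mean = lam
sd = math.sqrt(lam)

p_exactly_two = poisson_pmf(2, lam)
p_at_most_two = sum(poisson_pmf(k, lam) for k in range(3))  # k = 0, 1, 2
p_more_than_two = 1 - p_at_most_two                         # complement rule

print(f"mean={mean} sd={sd:.3f}")
print(f"P(X=2)={p_exactly_two:.4f} P(X<=2)={p_at_most_two:.4f}")
```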

Continuous Random Variables

Probability Density Functions (PDFs)

  • Probability is the area under the density curve over an interval.

  • Probability of a single value is zero.

  • Mean and SD are defined as for discrete variables, with integrals replacing sums: $\mu = \int x f(x)\,dx$ and $\sigma = \sqrt{\int (x - \mu)^2 f(x)\,dx}$.

Exponential Distribution

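  • Models the time between events in a Poisson process with rate $\lambda$.

  • PDF: $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$

  • Mean: $\mu = 1/\lambda$

  • SD: $\sigma = 1/\lambda$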
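For a continuous variable, a probability is an area under the density. The sketch below illustrates this for an exponential waiting time: it compares the exact answer from the cumulative distribution function, P(X <= x) = 1 - e^(-lam x), with a numerical area under the density. The rate `lam = 0.5` and the helper names `exp_pdf`, `exp_cdf`, and `area` are assumptions introduced for this example.

```python
import math

lam = 0.5  # hypothetical rate: 0.5 events per unit of time

def exp_pdf(x, lam):
    """Density f(x) = lam * exp(-lam * x) for x >= 0, and 0 otherwise."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def exp_cdf(x, lam):
    """P(X <= x) = 1 - exp(-lam * x): area under the density from 0 to x."""
    return 1 - math.exp(-lam * x) if x >= 0 else 0.0

def area(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the area under f between a and b."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

p_exact = exp_cdf(3, lam) - exp_cdf(1, lam)     # P(1 <= X <= 3)
p_area = area(lambda x: exp_pdf(x, lam), 1, 3)  # same probability as an area

print(round(p_exact, 4), round(p_area, 4))
print("mean =", 1 / lam, "sd =", 1 / lam)
```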

Normal Distribution

  • Symmetric, bell-shaped distribution defined by its mean $\mu$ and standard deviation $\sigma$.

  • PDF: $f(x) = \dfrac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

  • Probabilities are found using standard normal tables or functions (e.g., pnorm, qnorm).
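`pnorm` and `qnorm` are R functions. For readers working in Python, the standard library's `statistics.NormalDist` (Python 3.8+) plays a similar role: `cdf` gives the area to the left of a value, and `inv_cdf` goes from an area back to a value. The mean of 100 and standard deviation of 15 below are assumed values for illustration only.

```python
from statistics import NormalDist

# Hypothetical normal model: mean 100, standard deviation 15.
X = NormalDist(mu=100, sigma=15)

p_below_115 = X.cdf(115)            # area to the left of 115, like pnorm(115, 100, 15)
p_between = X.cdf(130) - X.cdf(85)  # P(85 <= X <= 130) as a difference of areas
x_90th = X.inv_cdf(0.90)            # 90th percentile, like qnorm(0.90, 100, 15)

# Standardizing first gives the same answer from the standard normal.
z = (115 - 100) / 15
assert abs(NormalDist().cdf(z) - p_below_115) < 1e-12

print(f"P(X<115)={p_below_115:.4f} P(85<X<130)={p_between:.4f}")
print(f"90th percentile={x_90th:.2f}")
```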

Example: Distribution of Calcium Concentrations (Histogram)

Histogram of calcium concentrations

Empirical Rule

  • For normal distributions:

    • ~68% of data within 1 SD of mean

    • ~95% within 2 SD

    • ~99.7% within 3 SD
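The empirical-rule percentages can be recovered from the standard normal distribution. A minimal check using Python's `statistics.NormalDist`; the dictionary name `within` is hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area within k standard deviations of the mean, for k = 1, 2, 3.
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} SD: {area:.4f}")
```

The printed areas round to about 0.68, 0.95, and 0.997, matching the rule. Because any normal variable can be standardized, the same areas apply for every mean and standard deviation.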

Summary Table: Graph Types and Their Uses

Graph Type   Variable Type                   Main Use
Bar Graph    Categorical                     Compare frequencies/proportions
Pie Chart    Categorical                     Show proportions
Histogram    Numeric (Continuous/Discrete)   Show distribution shape
Dotplot      Numeric (Small datasets)        Show individual values
Stemplot     Numeric (Small datasets)        Quick visualization
Boxplot      Numeric                         Summarize spread, center, outliers

Additional info: Some explanations and table entries were expanded for completeness and clarity based on standard statistics curriculum.
