
Core Concepts in Statistics: Data Types, Descriptive Statistics, Probability, and Discrete Distributions

Study Guide - Smart Notes


Types of Data

Qualitative and Quantitative Data

Understanding the type of data is foundational in statistics, as it determines the appropriate analysis methods. Data can be classified as either qualitative (categorical) or quantitative (numerical).

  • Qualitative Data: Non-numerical data describing categories or characteristics.

    • Nominal: Categories without a natural order. Example: Favorite sport (hockey, golf, soccer).

    • Ordinal: Categories with a natural order. Example: Movie ratings (bad, fair, good).

  • Quantitative Data: Numerical data that allows for arithmetic operations.

    • Discrete: Data obtained by counting; only specific values are possible. Example: Number of goals scored.

    • Continuous: Data obtained by measuring; any value within a range is possible. Example: Height or temperature.

Measures of Central Tendency

Mean, Median, and Mode

Measures of central tendency summarize the center of a dataset, providing a single value that represents the entire distribution.

  • Mean (Average): The sum of all data values divided by the number of values. Formula for Sample Mean: $\bar{x} = \frac{\sum x_i}{n}$. Example: For test scores 70, 75, 80, 85, 85, 90, 92: $\bar{x} = \frac{577}{7} \approx 82.43$.

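  • Median: The middle value when the data are ordered. Location Formula: $\frac{n + 1}{2}$. Example: For sorted scores 70, 75, 80, 85, 85, 90, 92 ($n = 7$), the location is $\frac{7 + 1}{2} = 4$, so the median is the 4th value: 85. When $n$ is even, the location falls between two values and the median is their average.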

  • Mode: The value that appears most frequently. Example: In the dataset above, 85 appears twice and is the mode.

Note: Datasets can be bimodal (two modes), multimodal (more than two), or have no mode.
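The three measures above can be checked with a short Python sketch. It uses only the standard library's statistics module; the variable names are illustrative and not part of the original notes.

```python
import statistics

scores = [70, 75, 80, 85, 85, 90, 92]

# Mean: sum of the values divided by how many there are (577 / 7).
mean = statistics.mean(scores)

# Median: the middle value of the sorted data.
median = statistics.median(scores)

# Location of the median in the sorted list, (n + 1) / 2, counting from 1.
median_location = (len(scores) + 1) / 2

# multimode returns every most-frequent value, so bimodal and
# multimodal datasets yield a list with more than one entry.
modes = statistics.multimode(scores)

print("mean:", round(mean, 2))
print("median:", median, "at position", median_location)
print("mode(s):", modes)
```

statistics.multimode is used instead of statistics.mode so that datasets with several modes are reported in full rather than reduced to a single value.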

Measures of Spread

Range, Interquartile Range (IQR), and Standard Deviation

Measures of spread describe the variability or dispersion within a dataset.

  • Range: Difference between the maximum and minimum values. Formula: $\text{Range} = x_{\max} - x_{\min}$. Example: For hours studied: 1, 2, 4, 5, 5, 6, 7, 10, $\text{Range} = 10 - 1 = 9$ hours.

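  • Interquartile Range (IQR): Spread of the middle 50% of data. Formula: $\text{IQR} = Q_3 - Q_1$. Example: For the same data, taking $Q_1$ and $Q_3$ as the medians of the lower and upper halves (one common convention; other conventions give slightly different quartiles): $Q_1 = 3$, median $= 5$, $Q_3 = 6.5$, so $\text{IQR} = 6.5 - 3 = 3.5$ hours.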

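  • Standard Deviation (s): Measures the typical distance of data values from the mean. Formula for Sample Standard Deviation: $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$. Example: For hours studied (mean = 5): Squared deviations sum to 56, so $s = \sqrt{\frac{56}{8 - 1}} = \sqrt{8} \approx 2.83$ hours.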

Note: For population standard deviation, divide by $N$ instead of $n - 1$.
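As a rough check of these measures, the sketch below computes the range, IQR, and both standard deviations for the hours-studied data. The quartiles are computed as the medians of the lower and upper halves, which is an assumption about the convention in use; library routines such as statistics.quantiles use interpolation and can return different quartiles for the same data.

```python
import statistics

hours = sorted([1, 2, 4, 5, 5, 6, 7, 10])
n = len(hours)

# Range: maximum minus minimum.
data_range = hours[-1] - hours[0]

# Quartiles by the median-of-halves convention (an assumption):
# split the sorted data in two, leaving out the middle value when n is odd.
half = n // 2
lower_half = hours[:half]
upper_half = hours[half + (n % 2):]
q1 = statistics.median(lower_half)
q3 = statistics.median(upper_half)
iqr = q3 - q1

# Sample standard deviation divides by n - 1; population divides by N.
sample_sd = statistics.stdev(hours)
population_sd = statistics.pstdev(hours)

print("range:", data_range)
print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
print("sample s:", round(sample_sd, 2))
print("population sigma:", round(population_sd, 2))
```

The population value is slightly smaller than the sample value because the same sum of squared deviations is divided by a larger number.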

Graphing Data

Visualizing Qualitative and Quantitative Data

Graphs are essential for visualizing data distributions, trends, and outliers. The choice of graph depends on the data type.

  • Qualitative Data Graphs:

    • Bar Chart: Displays frequency for each category; bars are separated.

    • Pie Chart: Shows each category's proportion as a sector of a circle.

  • Quantitative Data Graphs:

    • Histogram: For continuous data; bars touch, representing frequency within intervals (bins).

    • Box Plot (Box-and-Whisker Plot): Visualizes the five-number summary (min, $Q_1$, median, $Q_3$, max) and highlights outliers.

Probability Basics

Theoretical and Experimental Probability

Probability quantifies the likelihood of an event, ranging from 0 (impossible) to 1 (certain).

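  • Theoretical Probability: Based on possible outcomes under ideal conditions, with all outcomes equally likely. Formula: $P(A) = \frac{\text{number of favorable outcomes}}{\text{total number of possible outcomes}}$. Example: Probability of rolling a 3 on a fair six-sided die: $P(3) = \frac{1}{6}$. Probability of rolling two 3s in a row: $\frac{1}{6} \times \frac{1}{6} = \frac{1}{36}$.

  • Experimental Probability: Based on observed outcomes in experiments. Formula: $P(A) = \frac{\text{number of times the event occurs}}{\text{total number of trials}}$. Example: Rolling a die 20 times and getting two 3s: $P(3) = \frac{2}{20} = 0.10$.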

Law of Large Numbers: As the number of trials increases, experimental probability approaches theoretical probability.
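The contrast between theoretical and experimental probability, and the Law of Large Numbers, can be illustrated with a small simulation. This is a sketch under stated assumptions: the die is simulated with Python's pseudo-random generator, the function name experimental_probability is hypothetical, and the seed is fixed only so that repeated runs give the same output.

```python
import random
from fractions import Fraction

# Theoretical probabilities for a fair six-sided die, kept as exact fractions.
p_three = Fraction(1, 6)
p_two_threes_in_a_row = p_three * p_three  # independent rolls multiply

def experimental_probability(trials, target=3, seed=0):
    """Return the fraction of simulated rolls that show `target`."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    hits = sum(1 for _ in range(trials) if rng.randint(1, 6) == target)
    return hits / trials

print("theoretical P(3):", p_three, "=", round(float(p_three), 4))
print("theoretical P(3 then 3):", p_two_threes_in_a_row)
for trials in (20, 1_000, 100_000):
    print(trials, "rolls ->", round(experimental_probability(trials), 4))
```

With only 20 rolls the estimate can be far from 1/6, while the estimate from 100,000 rolls is typically much closer.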

Discrete Probability Distributions

Discrete Random Variables and the Binomial Distribution

A discrete probability distribution lists all possible values of a discrete random variable and their probabilities. The binomial distribution is a key example, describing the number of successes in a fixed number of independent trials with constant probability of success.

  • Discrete Random Variable: Takes on countable values (e.g., number of goals scored).

  • Binomial Distribution:

    • Fixed number of trials ($n$)

    • Each trial has two outcomes (success/failure)

    • Probability of success ($p$) is constant

    • Trials are independent

    Example: Rolling a six-sided die three times, counting sixes ($n = 3$, $p = \frac{1}{6}$)

Probability Mass Function (PMF) for Binomial Distribution

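The PMF gives the probability of observing exactly $k$ successes in $n$ trials:

$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$, where $\binom{n}{k} = \frac{n!}{k!\,(n - k)!}$

Example: For counting sixes in three rolls ($n = 3$, $p = \frac{1}{6}$):

$P(X = k) = \binom{3}{k} \left(\frac{1}{6}\right)^k \left(\frac{5}{6}\right)^{3 - k}$, where $k = 0, 1, 2, 3$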
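A minimal Python sketch of the binomial PMF, C(n, k) · p^k · (1 − p)^(n − k), is given here. The function name binomial_pmf is illustrative rather than taken from any library, and exact fractions are used so the results can be compared with hand calculations.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    if not 0 <= k <= n:
        return 0 * p  # impossible count of successes
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Three rolls of a fair die, success = rolling a six.
p_six = Fraction(1, 6)
prob_exactly_one_six = binomial_pmf(1, 3, p_six)

print("P(X = 1) =", prob_exactly_one_six)
print("as a decimal:", round(float(prob_exactly_one_six), 4))
```

Passing p as a Fraction keeps the arithmetic exact; passing a float such as 1 / 6 also works but introduces rounding.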

Tabular Representation of the Binomial Distribution

The table below lists the probabilities for each possible number of sixes rolled in three attempts:

Number of Sixes (k)    Probability P(X = k)
0                      125/216 ≈ 0.5787
1                      75/216 ≈ 0.3472
2                      15/216 ≈ 0.0694
3                      1/216 ≈ 0.0046

Note: The sum of all probabilities in a discrete probability distribution is always 1.
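The full distribution for three rolls, and the fact that its probabilities sum to 1, can be verified with the sketch below. The dictionary name distribution is illustrative, and Python prints each fraction in lowest terms, so 75/216 appears as 25/72.

```python
from fractions import Fraction
from math import comb

n, p = 3, Fraction(1, 6)  # three rolls, success = rolling a six

# Probability of each possible number of sixes, k = 0, 1, 2, 3.
distribution = {
    k: comb(n, k) * p**k * (1 - p) ** (n - k)
    for k in range(n + 1)
}

for k, prob in distribution.items():
    print(k, prob, round(float(prob), 4))

# The probabilities of all possible values must add up to exactly 1.
total = sum(distribution.values())
print("total:", total)
```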

Additional info: Other types of discrete distributions include hypergeometric, geometric, and Poisson, but the binomial is most commonly encountered at this level.
