Descriptive Statistics and Probability: Measures of Variation, Probability Concepts, and Counting Rules

Study Guide - Smart Notes

Tailored notes based on your materials, expanded with key definitions, examples, and context.

Measures of Variation

Range

The range is a simple measure of variation that indicates the spread between the largest and smallest values in a data set.

  • Definition: The difference between the maximum and minimum data entries in the set.

  • The data must be quantitative.

  • Formula: $\text{Range} = (\text{maximum data entry}) - (\text{minimum data entry})$

Variation

Variation describes how data values are spread out or clustered together. Two data sets can have the same mean but different variations.

  • Greater variation means data values are more spread out.

  • Example: Two corporations with the same mean starting salary but different spreads (see bar charts for Corporation A and B).

Deviation, Variance, and Standard Deviation

  • Deviation: The difference between a data entry and the mean of the data set.

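  • Population deviation: $x - \mu$

  • Sample deviation: $x - \bar{x}$

  • Population Variance: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$

  • Population Standard Deviation: $\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (x - \mu)^2}{N}}$

  • Sample Variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$

  • Sample Standard Deviation: $s = \sqrt{s^2} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$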

  • Observations:

  • Standard deviation measures the spread of data around the mean.

  • Standard deviation is always non-negative; it is zero only if all entries are identical.

  • Larger standard deviation means data are more spread out.
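The idea that two data sets can share a mean yet differ in spread can be checked numerically. The sketch below is only illustrative: the two lists and the helper `describe` are hypothetical, not data from the source, and it uses the population formulas from Python's standard `statistics` module.

```python
import statistics

# Illustrative values only (not from the source): two data sets with the same mean.
set_a = [38, 40, 40, 42]
set_b = [20, 40, 40, 60]

def describe(data):
    """Return (mean, range, population standard deviation) for a list of numbers."""
    mean = statistics.mean(data)
    data_range = max(data) - min(data)   # maximum entry minus minimum entry
    sigma = statistics.pstdev(data)      # divides the sum of squared deviations by N
    return mean, data_range, sigma

mean_a, range_a, sigma_a = describe(set_a)
mean_b, range_b, sigma_b = describe(set_b)

print(mean_a, range_a, sigma_a)
print(mean_b, range_b, sigma_b)
```

Both lists have mean 40, but the second has the larger range and the larger standard deviation, which matches the observation that more spread gives a larger standard deviation.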

Step-by-Step Calculation: Population Variance & Standard Deviation

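| Step | In Words | In Symbols |
|---|---|---|
| 1 | Find the mean of the population data set | $\mu = \frac{\sum x}{N}$ |
| 2 | Find the deviation of each entry | $x - \mu$ |
| 3 | Square each deviation | $(x - \mu)^2$ |
| 4 | Add to get the sum of squares | $SS_x = \sum (x - \mu)^2$ |
| 5 | Divide by N to get the population variance | $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$ |
| 6 | Find the square root to get the population standard deviation | $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$ |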
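The six steps can be followed line by line in code. This is a minimal sketch with made-up values (the list `population` is an assumption, chosen so the arithmetic comes out to whole numbers).

```python
import math

population = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, not from the source

N = len(population)
mu = sum(population) / N                      # Step 1: population mean
deviations = [x - mu for x in population]     # Step 2: deviation of each entry
squares = [d ** 2 for d in deviations]        # Step 3: square each deviation
sum_of_squares = sum(squares)                 # Step 4: sum of squares
variance = sum_of_squares / N                 # Step 5: population variance
std_dev = math.sqrt(variance)                 # Step 6: population standard deviation

print(mu, sum_of_squares, variance, std_dev)  # 5.0 32.0 4.0 2.0
```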

Step-by-Step Calculation: Sample Variance & Standard Deviation

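| Step | In Words | In Symbols |
|---|---|---|
| 1 | Find the mean of the sample data set | $\bar{x} = \frac{\sum x}{n}$ |
| 2 | Find the deviation of each entry | $x - \bar{x}$ |
| 3 | Square each deviation | $(x - \bar{x})^2$ |
| 4 | Add to get the sum of squares | $SS_x = \sum (x - \bar{x})^2$ |
| 5 | Divide by n - 1 to get the sample variance | $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$ |
| 6 | Find the square root to get the sample standard deviation | $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$ |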
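The sample procedure differs from the population one only in the divisor. The sketch below wraps the steps in a hypothetical helper, `sample_variance_and_sd`, and cross-checks it against `statistics.stdev` from the standard library, which also divides by n - 1. The data values are made up.

```python
import math
import statistics

def sample_variance_and_sd(sample):
    """Return (s^2, s) using the n - 1 divisor."""
    n = len(sample)
    if n < 2:
        raise ValueError("a sample needs at least two entries")
    x_bar = sum(sample) / n                               # Step 1: sample mean
    sum_of_squares = sum((x - x_bar) ** 2 for x in sample)  # Steps 2-4
    variance = sum_of_squares / (n - 1)                   # Step 5: sample variance
    return variance, math.sqrt(variance)                  # Step 6: sample standard deviation

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, not from the source
s_squared, s = sample_variance_and_sd(sample)

print(s_squared, s)
print(statistics.stdev(sample))  # library value for comparison
```

For the same entries, the sample variance (32/7) is larger than the population variance (32/8) because the divisor is smaller.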

Interpreting Standard Deviation

  • Standard deviation measures the typical amount an entry deviates from the mean.

  • Greater spread in data means a larger standard deviation.

Empirical Rule (68-95-99.7 Rule)

For data with a symmetric, bell-shaped distribution:

  • About 68% of data lie within one standard deviation of the mean.

  • About 95% within two standard deviations.

  • About 99.7% within three standard deviations.
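To apply the rule, it helps to turn a mean and standard deviation into the three intervals. The function name `empirical_rule_intervals` and the numbers passed to it are assumptions for illustration, and the percentages are only meaningful when the data are roughly symmetric and bell-shaped.

```python
def empirical_rule_intervals(mean, sd):
    """Map k = 1, 2, 3 to the interval (mean - k*sd, mean + k*sd)."""
    return {k: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}

# Hypothetical bell-shaped data with mean 64.0 and standard deviation 2.5.
intervals = empirical_rule_intervals(64.0, 2.5)
approx_percent = {1: 68.0, 2: 95.0, 3: 99.7}

for k, (low, high) in intervals.items():
    print(f"about {approx_percent[k]}% of entries between {low} and {high}")
```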

Chebyshev's Theorem

  • For any data set, the proportion of values within k standard deviations (k > 1) of the mean is at least $1 - \frac{1}{k^2}$.

  • For k = 2: at least 75% of data within 2 standard deviations.

  • For k = 3: at least 88.9% of data within 3 standard deviations.
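The two percentages above follow directly from the bound $1 - \frac{1}{k^2}$. A small sketch, with `chebyshev_lower_bound` as a hypothetical helper name:

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of entries within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's Theorem is stated for k > 1")
    return 1 - 1 / k ** 2

for k in (2, 3):
    print(k, round(chebyshev_lower_bound(k) * 100, 1))  # 2 75.0, then 3 88.9
```

Unlike the Empirical Rule, this bound holds for any distribution shape, which is why it is weaker (at least 75% rather than about 95% for k = 2).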

Standard Deviation for Grouped Data

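  • For frequency distributions, use class midpoints and frequencies: $s = \sqrt{\frac{\sum (x - \bar{x})^2 f}{n - 1}}$

  • Where x = midpoint of each class, f = frequency of each class, and $n = \sum f$ is the number of entries in the data set.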
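A sketch of the grouped calculation, assuming sample data summarized by class midpoints and frequencies. The helper `grouped_sample_sd` and the three classes are hypothetical; the mean is computed as the frequency-weighted mean of the midpoints.

```python
import math

def grouped_sample_sd(midpoints, frequencies):
    """Sample standard deviation from class midpoints x and frequencies f."""
    n = sum(frequencies)                                   # n = sum of f
    mean = sum(x * f for x, f in zip(midpoints, frequencies)) / n
    sum_of_squares = sum((x - mean) ** 2 * f
                         for x, f in zip(midpoints, frequencies))
    return math.sqrt(sum_of_squares / (n - 1))

# Hypothetical frequency distribution with three classes.
midpoints = [5, 15, 25]
frequencies = [2, 5, 3]
s_grouped = grouped_sample_sd(midpoints, frequencies)

print(round(s_grouped, 2))  # 7.38
```

Because each entry is replaced by its class midpoint, the result is an approximation of the standard deviation of the raw data.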

Coefficient of Variation (CV)

  • Describes the standard deviation as a percent of the mean.

  • Population data set: $CV = \frac{\sigma}{\mu} \cdot 100\%$

  • Sample data set: $CV = \frac{s}{\bar{x}} \cdot 100\%$
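The coefficient of variation is useful for comparing spread across data sets with different units or different means. In the sketch below, `coefficient_of_variation` is a hypothetical helper and both lists are made-up values treated as populations.

```python
import statistics

def coefficient_of_variation(data, population=True):
    """Standard deviation as a percent of the mean."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data) if population else statistics.stdev(data)
    return sd / mean * 100

# Illustrative values only: heights in inches and weights in pounds.
heights = [66, 68, 70, 72, 74]
weights = [140, 160, 180, 200, 220]

cv_heights = coefficient_of_variation(heights)
cv_weights = coefficient_of_variation(weights)

print(round(cv_heights, 1), round(cv_weights, 1))  # 4.0 15.7
```

Here the weights vary more relative to their mean than the heights do, even though the two data sets are measured in different units.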

Quartiles, Interquartile Range, and Boxplots

Quartiles

  • Quartiles divide an ordered data set into four equal parts.

  • Q1: About 25% of data fall on or below Q1.

  • Q2: Median; about 50% of data fall on or below Q2.

  • Q3: About 75% of data fall on or below Q3.

Interquartile Range (IQR)

  • Measures the range of the middle 50% of the data.

  • Formula: $\text{IQR} = Q_3 - Q_1$

Using IQR to Identify Outliers

  1. Find Q1 and Q3.

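  2. Compute IQR: $\text{IQR} = Q_3 - Q_1$

  3. Multiply IQR by 1.5: $1.5(\text{IQR})$

  4. Subtract 1.5(IQR) from Q1. Any data entry less than Q1 - 1.5(IQR) is an outlier.

  5. Add 1.5(IQR) to Q3. Any data entry greater than Q3 + 1.5(IQR) is an outlier.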
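The five steps can be sketched as follows. Quartile conventions vary between textbooks and software; this example assumes Q1 and Q3 are the medians of the lower and upper halves of the ordered data (excluding the median itself when the number of entries is odd). The helper names and the data are hypothetical.

```python
import statistics

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention (an assumption)."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return statistics.median(lower), statistics.median(ordered), statistics.median(upper)

def iqr_outliers(data):
    """Return the entries below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR)."""
    q1, _, q3 = quartiles(data)        # Step 1
    iqr = q3 - q1                      # Step 2
    step = 1.5 * iqr                   # Step 3
    low_fence = q1 - step              # Step 4
    high_fence = q3 + step             # Step 5
    return [x for x in data if x < low_fence or x > high_fence]

data = [5, 7, 8, 9, 10, 11, 12, 13, 40]  # illustrative values
print(quartiles(data))     # (7.5, 10, 12.5)
print(iqr_outliers(data))  # [40]
```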

Box and Whisker Plot

  • Exploratory data analysis tool that highlights important features of a data set.

  • Requires the five-number summary:

    1. Minimum entry

    2. First quartile (Q1)

    3. Median (Q2)

    4. Third quartile (Q3)

    5. Maximum entry
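A minimal sketch of computing the five-number summary, again assuming the median-of-halves convention for Q1 and Q3 (other conventions give slightly different quartiles). The function name and data are hypothetical.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                                    # entries below the median position
    upper = ordered[half + 1:] if n % 2 else ordered[half:]   # entries above it
    return (ordered[0],
            statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper),
            ordered[-1])

summary = five_number_summary([7, 1, 10, 3, 12, 4, 8, 6])  # illustrative values
print(summary)  # (1, 3.5, 6.5, 9.0, 12)
```

These five values are exactly what is needed to place the box edges, the median line, and the whisker ends.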

Steps to Draw a Box and Whisker Plot

  1. Find the five-number summary.

  2. Construct a horizontal scale for the data range.

  3. Plot the five numbers above the scale.

  4. Draw a box from Q1 to Q3, with a line at Q2 (median).

  5. Draw whiskers from the box to the minimum and maximum entries.

Percentiles and Other Fractiles

| Fractiles | Summary | Symbols |
|---|---|---|
| Quartiles | Divide a data set into 4 equal parts | Q1, Q2, Q3 |
| Deciles | Divide a data set into 10 equal parts | D1, D2, D3, ..., D9 |
| Percentiles | Divide a data set into 100 equal parts | P1, P2, ..., P99 |

Percentile of a Data Entry

  • To find the percentile that corresponds to a specific data entry x:

Percentile of x = $\frac{\text{number of data entries less than } x}{\text{total number of data entries}} \cdot 100$
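As a sketch, the percentile of an entry can be computed by counting the entries below it and dividing by the total number of entries. The helper `percentile_of` and the scores are hypothetical, and the result is left unrounded.

```python
def percentile_of(x, data):
    """Percent of entries in data that are strictly less than x."""
    below = sum(1 for value in data if value < x)
    return 100 * below / len(data)

scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]  # illustrative values
print(percentile_of(85, scores))  # 60.0
```

Six of the ten scores are below 85, so a score of 85 corresponds to the 60th percentile of this data set.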

The Standard Score (z-score)

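  • Represents the number of standard deviations a value x falls from the mean μ: $z = \frac{x - \mu}{\sigma}$

  • A negative z-score means x is below the mean, a positive z-score means x is above the mean, and z = 0 means x equals the mean.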
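One common use of the standard score is comparing values drawn from different distributions. The numbers below are made up for illustration, and `z_score` is a hypothetical helper.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Hypothetical test results from two classes with different means and spreads.
z_first = z_score(82, mean=75, sd=5)
z_second = z_score(90, mean=86, sd=8)

print(z_first, z_second)  # 1.4 0.5
```

Although 90 is the higher raw score, 82 is further above its own class mean when measured in standard deviations.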

Probability: Basic Concepts and Counting

Probability Experiments

  • Probability experiment: An action or trial with specific results (counts, measurements, or responses).

  • Outcome: The result of a single trial.

  • Sample space: The set of all possible outcomes.

  • Event: One or more outcomes; a subset of the sample space.

Simple and Compound Events

  • Simple event: Consists of a single outcome (e.g., when tossing a coin and rolling a die, tossing heads and rolling a 3).

  • Compound event: Consists of more than one outcome (e.g., tossing heads and rolling an even number, which includes three outcomes).

The Fundamental Counting Principle

  • If one event can occur in m ways and a second in n ways, the two events can occur in sequence in $m \cdot n$ ways.

  • Can be extended for more events in sequence.
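The counting principle can be verified by listing a sample space explicitly. This sketch enumerates the coin-and-die experiment with `itertools.product`; the variable names are illustrative.

```python
from itertools import product

coin = ["H", "T"]            # m = 2 ways
die = [1, 2, 3, 4, 5, 6]     # n = 6 ways

# Every (coin, die) pair is one outcome of the combined experiment.
sample_space = list(product(coin, die))

# Simple event: exactly one outcome (heads and a 3).
simple_event = [o for o in sample_space if o == ("H", 3)]

# Compound event: more than one outcome (heads and an even number).
compound_event = [(c, d) for c, d in sample_space if c == "H" and d % 2 == 0]

print(len(sample_space), len(coin) * len(die))  # 12 12
print(compound_event)  # [('H', 2), ('H', 4), ('H', 6)]
```

Adding a third stage, for example a second coin, would only require passing one more list to `product`, and the count would be multiplied by that list's length.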

Types of Probability

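  • Classical (theoretical) probability: Each outcome in the sample space is equally likely.

$P(E) = \frac{\text{number of outcomes in event } E}{\text{total number of outcomes in sample space}}$

  • Empirical (statistical) probability: Based on observed data.

$P(E) = \frac{f}{n}$, where f = frequency of event E, n = total frequency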

  • Subjective probability: Based on intuition, educated guesses, or estimates.
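The first two types can be contrasted in a short sketch. The die is assumed fair for the classical case, and the observed counts in the empirical case are made up for illustration.

```python
from fractions import Fraction

# Classical: every outcome of a fair six-sided die is equally likely.
sample_space = {1, 2, 3, 4, 5, 6}
event_even = {x for x in sample_space if x % 2 == 0}
p_classical = Fraction(len(event_even), len(sample_space))

# Empirical: hypothetical observed frequencies from 50 rolls (made-up counts).
observed = {1: 7, 2: 9, 3: 8, 4: 10, 5: 6, 6: 10}
f = sum(count for face, count in observed.items() if face % 2 == 0)  # frequency of the event
n = sum(observed.values())                                           # total frequency
p_empirical = Fraction(f, n)

print(p_classical, p_empirical)  # 1/2 29/50
```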

Law of Large Numbers

  • As an experiment is repeated, the empirical probability approaches the theoretical probability.

  • Example: Probability of tossing a head approaches 0.5 as the number of tosses increases.
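A quick simulation can illustrate this behavior. The sketch below uses Python's pseudo-random generator with a fixed seed so that runs are repeatable; the function name `empirical_heads` is hypothetical, and the exact proportions printed depend on the seed.

```python
import random

def empirical_heads(tosses, seed=0):
    """Proportion of heads in a simulated run of fair coin tosses."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(tosses))
    return heads / tosses

for tosses in (10, 1_000, 100_000):
    print(tosses, empirical_heads(tosses))
```

With few tosses the proportion can be far from 0.5; with many tosses it tends to settle close to 0.5, although no single run is guaranteed to be closer than a shorter one.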

Range of Probabilities Rule

  • Probability of any event E is between 0 and 1, inclusive: $0 \le P(E) \le 1$

Complementary Events

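  • The complement of event E (denoted E') is the set of all outcomes in the sample space that are not in E.

  • Because E and E' together cover the whole sample space: $P(E) + P(E') = 1$, so $P(E') = 1 - P(E)$.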
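Since an event and its complement together make up the whole sample space, their probabilities sum to 1, which gives a shortcut for "not E" and "at least one" problems. A minimal sketch, with `complement` as a hypothetical helper that also enforces the 0-to-1 range:

```python
from fractions import Fraction

def complement(p):
    """Return P(E') = 1 - P(E) after checking that p is a valid probability."""
    if not 0 <= p <= 1:
        raise ValueError("a probability must be between 0 and 1, inclusive")
    return 1 - p

p_six = Fraction(1, 6)           # rolling a 6 on a fair die
p_not_six = complement(p_six)    # rolling anything other than a 6

print(p_not_six)  # 5/6
```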

Conditional Probability and the Multiplication Rule

Conditional Probability

  • The probability of event B occurring, given that event A has already occurred.

  • Denoted $P(B \mid A)$ (read as "probability of B, given A").

Independent and Dependent Events

  • Independent events: The occurrence of one does not affect the probability of the other.

  • Two events A and B are independent if $P(B \mid A) = P(B)$ or if $P(A \mid B) = P(A)$.

  • Events that are not independent are dependent.
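Independence can be checked by comparing a conditional probability with the unconditional one. The sketch below assumes a fair six-sided die with equally likely outcomes; the helper names and the chosen events are illustrative.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die

def prob(event):
    """Classical probability of an event (a subset of the sample space)."""
    return Fraction(len(event & sample_space), len(sample_space))

def conditional(b, a):
    """P(B | A): restrict attention to the outcomes in A."""
    return Fraction(len(a & b), len(a))

def independent(a, b):
    """True when knowing that A occurred does not change the probability of B."""
    return conditional(b, a) == prob(b)

even = {2, 4, 6}
greater_than_3 = {4, 5, 6}
at_most_2 = {1, 2}

print(conditional(greater_than_3, even), prob(greater_than_3))  # 2/3 1/2
print(independent(even, greater_than_3))                        # False
print(independent(even, at_most_2))                             # True
```

Knowing the roll is even raises the chance that it exceeds 3, so those events are dependent, while it leaves the chance of rolling at most 2 unchanged at 1/3.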

The Multiplication Rule

  • For two events A and B, the probability that both occur in sequence:

  • General rule: $P(A \text{ and } B) = P(A) \cdot P(B \mid A)$

  • For independent events: $P(A \text{ and } B) = P(A) \cdot P(B)$

  • Can be extended for more than two events.
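Both forms of the rule can be sketched with exact fractions. The card and coin scenarios are standard illustrations chosen here as assumptions, not examples from the source, and `p_both` is a hypothetical helper.

```python
from fractions import Fraction

def p_both(p_a, p_b_given_a):
    """General multiplication rule: P(A and B) = P(A) * P(B | A)."""
    return p_a * p_b_given_a

# Dependent events: draw a king, then a queen, from a 52-card deck without replacement.
# After a king is removed, 51 cards remain and 4 of them are queens.
p_king_then_queen = p_both(Fraction(4, 52), Fraction(4, 51))

# Independent events: toss heads and roll a 6, so P(B | A) is just P(B).
p_heads_and_six = p_both(Fraction(1, 2), Fraction(1, 6))

print(p_king_then_queen, p_heads_and_six)  # 4/663 1/12
```

For more than two events, the same pattern continues: each factor is the probability of the next event given that all earlier ones have occurred.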
