Randomness & Probability: Foundations and Applications

Study Guide - Smart Notes

Randomness & Probability

Introduction to Randomness

Randomness is a fundamental concept in statistics, describing situations where the outcome cannot be predicted with certainty, even if all possible outcomes are known. Understanding randomness is essential for interpreting data, designing experiments, and calculating probabilities.

Random Outcome: An outcome is random if we know the possible values it can take, but not which particular value will occur.
Random Number Generator: A device or algorithm that produces a sequence of numbers with no predictable pattern.
Human Randomness: People are generally poor at generating truly random sequences compared to machines.

Learning Objectives

Define and generate random numbers.
Interpret randomness and probability in the long run.
Describe real-world situations as random experiments.
Draw appropriate conclusions about randomness and probabilities.

Random Numbers

Definition and Generation

Random numbers are values selected from a set of possible outcomes, where each outcome has an equal chance of being chosen. In practice, random numbers are generated using physical processes (e.g., rolling dice, flipping coins) or computational algorithms.

Definition: A series of numbers in which no one controls, or can predict, which number will be selected next.
Example: Using a computer to generate a random digit between 0 and 9.
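
The random digit example can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the seed value 42 and the choice of 20 draws are arbitrary assumptions, and the standard library's generator is pseudorandom (an algorithm that approximates randomness).

```python
import random

# Seeded generator so the sketch is reproducible; the seed 42 is arbitrary.
rng = random.Random(42)

# randrange(10) returns an integer from 0 through 9, each equally likely.
digits = [rng.randrange(10) for _ in range(20)]
print(digits)
```

Removing the seed makes each run produce a different sequence.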

Human vs. Machine Randomness

Humans often fail to produce truly random sequences due to subconscious patterns and biases. Machines, using algorithms, can approximate randomness more effectively.

Key Point: People are much worse at picking random numbers than random number generators are.
Example: When asked to pick random digits, humans tend to avoid repeating numbers and may favor certain digits.
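
One way to see the tendency to avoid repeats is to measure how often a digit matches the one just before it. In a truly random digit sequence this happens about 1 time in 10. The sketch below is an assumption-laden illustration: `repeat_rate` is a hypothetical helper, and `human_like` is an invented sequence, not real survey data.

```python
import random

def repeat_rate(digits):
    """Fraction of positions where a digit equals the one just before it."""
    pairs = list(zip(digits, digits[1:]))
    return sum(a == b for a, b in pairs) / len(pairs)

# Machine-generated digits (seed 0 is arbitrary, chosen for reproducibility).
rng = random.Random(0)
machine = [rng.randrange(10) for _ in range(10_000)]

# Hypothetical human-style sequence that never repeats a digit back to back.
human_like = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] * 3

print(repeat_rate(machine))     # close to 0.1 for random digits
print(repeat_rate(human_like))  # 0.0, because there are no adjacent repeats
```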

Sample Space

Definition and Examples

The sample space of a random experiment is the set of all possible outcomes.

Example: Flipping two coins. The sample space is: { (heads, heads), (heads, tails), (tails, heads), (tails, tails) }
Random Digit Example: Picking a random digit between 0 and 9. The sample space is: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
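
Both sample spaces above can be listed programmatically. The sketch below uses `itertools.product` from the Python standard library; the variable names are illustrative only.

```python
from itertools import product

# Every ordered pair of results for two coin flips.
two_coins = list(product(["heads", "tails"], repeat=2))

# Every possible single digit.
digit_space = list(range(10))

print(two_coins)
print(digit_space)
```

The two-coin sample space has 2 x 2 = 4 outcomes, and the digit sample space has 10.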

Probability of Outcomes

Equal Probability and Independence

If numbers are truly random, each number has an equal probability of being chosen. The selection of one number does not affect the probability of selecting another; this is known as independence.

Chance "has no memory": Previous selections do not influence future outcomes.
Probability Formula: For digits 0-9: $P(\text{digit} = k) = \frac{1}{10} = 0.1$ for each $k = 0, 1, \dots, 9$
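
The idea that chance "has no memory" can be checked with a small simulation: among digits drawn right after a 7, each digit should still appear about 10% of the time. This is a rough sketch; the seed, the number of draws, and the choice of 7 as the conditioning digit are all arbitrary assumptions.

```python
import random
from collections import Counter

rng = random.Random(1)  # arbitrary seed for reproducibility
draws = [rng.randrange(10) for _ in range(100_000)]

# Keep only the digits that come immediately after a 7.
after_seven = [b for a, b in zip(draws, draws[1:]) if a == 7]

# Relative frequency of each digit among those follow-up draws.
counts = Counter(after_seven)
freq_after_seven = {d: counts[d] / len(after_seven) for d in range(10)}

for digit, freq in freq_after_seven.items():
    print(digit, round(freq, 3))
```

If the previous draw influenced the next one, some of these frequencies would drift away from 0.1.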

Relative Frequency

Probability can be estimated by the relative frequency of an outcome in repeated trials.

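Formula: $\text{relative frequency} = \frac{\text{number of times the outcome occurs}}{\text{total number of trials}}$
Example: If '0' is picked 8 times out of 82 trials, relative frequency = $\frac{8}{82} \approx 0.098$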
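
The arithmetic in this example is short enough to verify directly. `relative_frequency` below is a hypothetical helper name used only for illustration.

```python
def relative_frequency(count, trials):
    """Number of times the outcome occurred divided by the number of trials."""
    return count / trials

rf = relative_frequency(8, 82)
print(round(rf, 3))  # 0.098
```

The result is close to, but not exactly, the expected probability of 0.1.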

Comparing Observed and Expected Distributions

Expected Distribution of Random Digits

When picking random digits between 0 and 9, each digit should appear with equal frequency in a large number of trials.

Digit	0	1	2	3	4	5	6	7	8	9	Total
Expected Probability	0.1	0.1	0.1	0.1	0.1	0.1	0.1	0.1	0.1	0.1	1

Observed Distribution

In practice, the observed frequencies may deviate from the expected due to randomness or bias (especially if humans are picking numbers).

Example: A histogram of digits picked by people may show some digits are chosen more often than others.
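
A text histogram is a quick way to compare observed counts with the equal share expected under randomness. The picks below are invented for illustration only; they are not data from these notes.

```python
from collections import Counter

# Hypothetical picks, made up for this sketch (not real survey data).
picks = [7, 3, 7, 5, 8, 3, 7, 2, 4, 7, 6, 3, 9, 7, 1, 5, 3, 7, 8, 4]

counts = Counter(picks)
expected = len(picks) / 10  # equal share for each of the 10 digits

for digit in range(10):
    observed = counts[digit]
    bar = "#" * observed
    print(f"{digit}: {bar} (observed {observed}, expected {expected:.1f})")
```

In this made-up sample, 7 is heavily over-represented and 0 never appears, which is the kind of pattern the histogram would reveal.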

Assessing Variability and Statistical Inference

Predicting Variability

Variability in the observed frequencies can be predicted using statistical models such as the binomial distribution or by calculating the variance of a random variable.

Binomial Probability Formula: $P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$, where $n$ is the number of trials, $k$ is the number of successes, and $p$ is the probability of success.
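Variance of a Random Variable: $\mathrm{Var}(X) = \sum_x (x - \mu)^2 \, P(X = x)$, where $\mu = E(X)$ is the mean of $X$. For a binomial count this works out to $\mathrm{Var}(X) = np(1 - p)$.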
Simulation: Computer simulations (e.g., using R) can be used to model the expected distribution and assess how extreme an observed value is.
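
As a worked example, the binomial model can be applied to the count of a single digit in 82 picks, with success probability 0.1 (the values used elsewhere in these notes). The sketch below is in Python rather than R, and `binomial_pmf` is a hypothetical helper name; it uses the standard binomial mean $np$ and variance $np(1 - p)$.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 82, 0.1

mean = n * p                 # expected count of one digit, about 8.2
variance = n * p * (1 - p)   # about 7.38
std_dev = sqrt(variance)     # about 2.7

print(round(mean, 2), round(variance, 2), round(std_dev, 2))
print(round(binomial_pmf(8, n, p), 4))  # chance of seeing the digit exactly 8 times
```

A standard deviation of roughly 2.7 means counts a few units above or below 8.2 are unremarkable for a truly random process.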

Percentiles and p-values

To determine how unusual an observed frequency is, calculate its percentile among simulated outcomes. This is similar to finding a p-value in hypothesis testing.

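Percentile: The percentage of simulated values at or below the observed value. The proportion of simulated values as extreme as or more extreme than the observed value is the tail beyond that percentile, which is what a p-value measures.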
p-value: The probability of observing a result at least as extreme as the one obtained, assuming the null hypothesis is true.
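
The simulation approach can be sketched as follows. The notes mention R; this version uses Python instead. The observed count of 15 is a hypothetical value chosen for illustration, and the seed, the number of simulations, and the use of a one-sided (upper) tail are assumptions.

```python
import random

rng = random.Random(2024)  # arbitrary seed for reproducibility
n_trials, n_sims = 82, 10_000
observed = 15  # hypothetical observed count for one digit, not from the notes

# Each simulated experiment counts how often one fixed digit (here 7)
# appears in 82 truly random picks.
sim_counts = [
    sum(rng.randrange(10) == 7 for _ in range(n_trials))
    for _ in range(n_sims)
]

# Simulated one-sided p-value: share of simulations with a count
# at least as large as the observed one.
p_value = sum(c >= observed for c in sim_counts) / n_sims

# Percentile: share of simulations with a count at or below the observed one.
percentile = sum(c <= observed for c in sim_counts) / n_sims

print(p_value, percentile)
```

A small simulated p-value suggests the observed count would be unusual if the digits were truly random.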

Summary Table: Expected vs. Observed Random Digit Distribution

Digit	Expected Frequency (N=82)	Observed Frequency
0	8.2	Varies (e.g., 8)
1	8.2	Varies
2	8.2	Varies
3	8.2	Varies
4	8.2	Varies
5	8.2	Varies
6	8.2	Varies
7	8.2	Varies
8	8.2	Varies
9	8.2	Varies

Note: Observed frequencies typically differ from the expected frequencies because of random variation and, when people pick the digits, human bias.

Key Takeaways

Randomness is unpredictable, but the set of possible outcomes is known.
Probability quantifies the likelihood of outcomes and is estimated by relative frequency in repeated trials.
Sample space lists all possible outcomes of a random experiment.
Statistical inference allows us to compare observed data to expected models and assess variability.