Stat 101 Exam 2 Study Guide: Chapters 6–15 (Contingency Tables, Sampling, Probability, Binomial, and Sampling Distributions)
Contingency Tables and Distributions
Contingency Tables
Contingency tables are used to summarize the relationship between two categorical variables. They display how individuals are distributed across each variable.
Contingency Table: A table showing the distribution of individuals along each variable.
Marginal Distribution: The totals for each row or column in a contingency table.
Conditional Distribution: The distribution of one variable restricted to the cases that satisfy a condition on another variable. For example, the distribution of eye color among females only.
Example: Eye Color and Gender
Gender | Blue | Green | Brown | Total |
|---|---|---|---|---|
Male | 5 | 7 | 15 | 27 |
Female | 6 | 2 | 10 | 18 |
Total | 11 | 9 | 25 | 45 |
Marginal Distribution of Gender: Male: 27/45 = 60%, Female: 18/45 = 40%
Conditional Probability Example: Percentage of females with blue eyes: 6/18 ≈ 33.3%
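The marginal and conditional distributions for the eye-color table above can be reproduced with a short Python sketch. The function names `marginal_rows` and `conditional_on_row` are illustrative choices, not part of the course materials.

```python
from fractions import Fraction

# Counts from the eye-color table above (rows: gender, columns: eye color).
table = {
    "Male": {"Blue": 5, "Green": 7, "Brown": 15},
    "Female": {"Blue": 6, "Green": 2, "Brown": 10},
}

def marginal_rows(table):
    """Marginal distribution of the row variable: row total / grand total."""
    grand_total = sum(sum(row.values()) for row in table.values())
    return {name: Fraction(sum(row.values()), grand_total)
            for name, row in table.items()}

def conditional_on_row(table, row_name):
    """Conditional distribution of the column variable within one row."""
    row = table[row_name]
    row_total = sum(row.values())
    return {col: Fraction(count, row_total) for col, count in row.items()}

gender_marginal = marginal_rows(table)
eye_given_female = conditional_on_row(table, "Female")

print({k: float(v) for k, v in gender_marginal.items()})  # Male 0.6, Female 0.4
print(eye_given_female["Blue"])                            # 1/3
```

Using `Fraction` keeps the results exact, so a conditional distribution can be checked to sum to exactly 1.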
Sampling and Experimental Design
Sampling Concepts
Sampling is the process of selecting a subset of individuals from a population to estimate characteristics of the whole population.
Population: The entire group of individuals or instances about whom we hope to learn.
Sample: A subset of the population examined in order to learn about the whole population; ideally it is representative.
Sample Survey: A study that asks questions of a sample drawn from a population.
Randomization: Each individual has a fair, random chance of selection.
Census: A sample that consists of the entire population.
Population Parameter: A numerically valued attribute of a model for a population (e.g., mean income).
Sample Statistic: A value calculated for sample data (e.g., sample mean).
Sampling Frame: The list of individuals from whom the sample is drawn.
Types of Sampling Methods
Simple Random Sample (SRS): Each set of n elements in the population has an equal chance of selection.
Stratified Random Sampling: Population divided into strata, random samples drawn from each stratum.
Cluster Sampling: Entire groups (clusters) chosen at random; each cluster is internally heterogeneous and should resemble the population.
Multistage Sampling: Combines several sampling methods.
Systematic Sample: Individuals selected at a regular interval from a sampling frame (e.g., every 10th individual), starting from a randomly chosen individual.
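A minimal sketch of four of the sampling methods listed above, using only Python's standard `random` module. The sampling frame (100 students tagged with a class year and a dorm) is hypothetical and exists only to illustrate the mechanics.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: 100 students, each with a class year
# (used as a stratum) and a dorm (used as a cluster).
frame = [
    {"id": i, "year": ["FR", "SO", "JR", "SR"][i % 4], "dorm": f"D{i // 10}"}
    for i in range(100)
]

def simple_random_sample(frame, n):
    # Every set of n individuals is equally likely to be chosen.
    return rng.sample(frame, n)

def stratified_sample(frame, key, n_per_stratum):
    # Split the frame into strata, then draw an SRS inside each stratum.
    strata = {}
    for person in frame:
        strata.setdefault(person[key], []).append(person)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(frame, key, n_clusters):
    # Pick whole clusters at random and keep everyone in them.
    clusters = sorted({person[key] for person in frame})
    chosen = set(rng.sample(clusters, n_clusters))
    return [person for person in frame if person[key] in chosen]

def systematic_sample(frame, step):
    # Random starting point, then every step-th individual.
    start = rng.randrange(step)
    return frame[start::step]

srs = simple_random_sample(frame, 12)
stratified = stratified_sample(frame, "year", 3)
clustered = cluster_sample(frame, "dorm", 2)
systematic = systematic_sample(frame, 10)
print(len(srs), len(stratified), len(clustered), len(systematic))  # 12 12 20 10
```

A multistage sample could be sketched by composing these functions, for example by choosing clusters first and then drawing an SRS within each chosen cluster.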
Types of Bias
Voluntary Response Bias: Individuals choose whether to participate.
Undercoverage Bias: Some portion of the population is not sampled at all or is underrepresented in the sample.
Nonresponse Bias: Large fraction of those sampled fails to respond.
Response Bias: Survey design influences responses.
Experimental Design
Observational Study: No manipulation of factors; can be retrospective or prospective.
Experiment: Manipulates factor levels to create treatments, randomly assigns subjects, compares responses.
Factor: Variable whose levels are manipulated.
Response Variable: Variable whose values are compared across treatments.
Levels: Specific values chosen for a factor.
Treatment: Process or intervention applied to experimental units.
Block: Groups of similar experimental units; randomize within blocks.
Randomization: Assign units to treatment groups randomly.
Control: Control aspects not being studied.
Replicate: Apply each treatment to multiple subjects so that variability in the responses can be estimated.
Statistically Significant: Observed difference too large to be plausibly attributed to chance alone.
Types of Experiments: Completely Randomized Design (CRD), Randomized Block Design (RBD), Matched Pair Design.
Blinding: Individuals unaware of treatment allocation.
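Single/Double Blind: Single: one class of individuals is blinded (either those who could influence the results, such as subjects and treatment administrators, or those who evaluate the results); Double: both classes are blinded.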
Placebo: Treatment known to have no effect.
Placebo Effect: Response to placebo treatment.
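Confounding: Levels of one factor are associated with levels of another factor, so their effects cannot be separated.
Lurking Variable: Variable not included in the study that is associated with both explanatory and response variables and may create an apparent relationship between them.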
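Random assignment within blocks, as in a Randomized Block Design, can be sketched as follows. The subjects, the blocking variable (sex), and the treatment labels are hypothetical, and the function assumes treatment groups should be as equal in size as possible within each block.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is reproducible

def randomized_block_assignment(units, block_of, treatments):
    """Shuffle the units inside each block, then deal treatments in turn."""
    blocks = {}
    for unit in units:
        blocks.setdefault(block_of(unit), []).append(unit)
    assignment = {}
    for members in blocks.values():
        shuffled = members[:]          # copy so the input is not modified
        rng.shuffle(shuffled)          # randomization happens within the block
        for position, unit in enumerate(shuffled):
            assignment[unit] = treatments[position % len(treatments)]
    return assignment

# Hypothetical subjects: six males and six females, blocked by sex.
subjects = [("M", i) for i in range(6)] + [("F", i) for i in range(6)]
assignment = randomized_block_assignment(
    subjects, block_of=lambda s: s[0], treatments=["drug", "placebo"]
)

for subject, treatment in sorted(assignment.items()):
    print(subject, treatment)
```

Passing a `block_of` function that returns the same value for every unit reduces this to a Completely Randomized Design.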
Probability and Its Rules
Basic Probability Concepts
Probability quantifies the likelihood of events in random phenomena.
Random Phenomenon: Outcomes are possible, but which will occur is unknown.
Trial: Single attempt or realization.
Outcome: Value measured or observed in a trial.
Event: Collection of outcomes; denoted by capital letters.
Sample Space: Collection of all possible outcomes; denoted by $S$ or $\Omega$.
Law of Large Numbers (LLN): Long-run relative frequency approaches true probability as trials increase.
Independence: Occurrence of one event does not affect the probability of another.
Probability: Number between 0 and 1 indicating likelihood; written $P(A)$ for event A.
Empirical Probability: Based on observed frequencies.
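Theoretical Probability: Based on models; for equally likely outcomes, $P(A) = \frac{\text{number of outcomes in } A}{\text{number of possible outcomes}}$.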
Personal Probability: Subjective degree of belief.
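The Law of Large Numbers from the list above can be illustrated with a small simulation. This is a sketch assuming a fair coin ($p = 0.5$); the checkpoint trial counts are arbitrary.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

def running_relative_frequency(n_trials, p_success=0.5):
    """Simulate independent trials and record the relative frequency
    of success at a few checkpoints."""
    successes = 0
    checkpoints = {}
    for trial in range(1, n_trials + 1):
        if rng.random() < p_success:
            successes += 1
        if trial in (10, 100, 1_000, 10_000, 100_000):
            checkpoints[trial] = successes / trial
    return checkpoints

freqs = running_relative_frequency(100_000)
for trials, freq in freqs.items():
    print(f"{trials:>7} trials: relative frequency = {freq:.4f}")
```

The early checkpoints can stray noticeably from 0.5, while the later ones settle close to it, which is the long-run behavior the LLN describes.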
Rules of Probability
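Probability Assignment Rule: $0 \le P(A) \le 1$, $P(S) = 1$
Complement Rule: $P(A^C) = 1 - P(A)$
Addition Rule (Disjoint Events): $P(A \cup B) = P(A) + P(B)$
Multiplication Rule (Independent Events): $P(A \cap B) = P(A) \times P(B)$
General Addition Rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
Conditional Probability: $P(B \mid A) = \frac{P(A \cap B)}{P(A)}$
General Multiplication Rule: $P(A \cap B) = P(A) \times P(B \mid A)$
Independence: A and B are independent when $P(B \mid A) = P(B)$
Bayes Rule: $P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A \mid B)\,P(B) + P(A \mid B^C)\,P(B^C)}$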
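The rules above can be checked on a small concrete case. The sketch below assumes one roll of a fair six-sided die; the events `A` (even) and `B` (greater than 3) are chosen only for illustration. Note that in the code `A | B` and `A & B` are Python set union and intersection, not conditional probability.

```python
from fractions import Fraction

S = frozenset(range(1, 7))  # sample space: one roll of a fair die

def P(event):
    """Theoretical probability with equally likely outcomes."""
    return Fraction(len(event & S), len(S))

def P_given(b, a):
    """Conditional probability P(b | a) = P(a and b) / P(a)."""
    return P(a & b) / P(a)

A = frozenset({2, 4, 6})  # roll is even
B = frozenset({4, 5, 6})  # roll is greater than 3

complement_ok = P(S - A) == 1 - P(A)
addition_ok = P(A | B) == P(A) + P(B) - P(A & B)       # general addition rule
multiplication_ok = P(A & B) == P(A) * P_given(B, A)   # general multiplication rule

# Bayes Rule: recover P(B | A) from P(A | B), P(A | B^C), and P(B).
bayes = (P_given(A, B) * P(B)) / (
    P_given(A, B) * P(B) + P_given(A, S - B) * P(S - B)
)
bayes_ok = bayes == P_given(B, A)

independent = P_given(B, A) == P(B)  # False here: 2/3 != 1/2

print(complement_ok, addition_ok, multiplication_ok, bayes_ok, independent)
# True True True True False
```

Because A and B are not independent in this example, the simple multiplication rule $P(A) \times P(B)$ would give 1/4 rather than the correct $P(A \cap B) = 1/3$.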
Tree Diagrams
Tree diagrams help visualize conditional probabilities and sequences of events.
Example: Calculating a reverse conditional probability, $P(B \mid A)$, using Bayes Rule with given probabilities.
Binomial Distribution
Definition and Properties
The binomial distribution models the number of successes in a fixed number of independent trials, each with the same probability of success.
Conditions: Two possible outcomes (success/failure), constant probability of success $p$, independent trials, fixed number of trials $n$.
Success/Failure Condition: Binomial model is approximately normal if $np \ge 10$ and $nq \ge 10$, where $q = 1 - p$.
Binomial Model Formulas
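Probability of k successes: $P(X = k) = \binom{n}{k} p^k q^{n-k}$, where $q = 1 - p$
Mean: $\mu = np$
Standard Deviation: $\sigma = \sqrt{npq}$
Binomial Coefficient: $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$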
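The binomial formulas above translate directly into Python using `math.comb`. The values $n = 20$ and $p = 0.25$ below are hypothetical and serve only to exercise the functions.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """P(X = k) = C(n, k) * p^k * q^(n-k), with q = 1 - p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_mean(n, p):
    return n * p

def binomial_sd(n, p):
    return sqrt(n * p * (1 - p))

def success_failure_ok(n, p):
    """Success/Failure Condition for the normal approximation."""
    return n * p >= 10 and n * (1 - p) >= 10

n, p = 20, 0.25  # hypothetical values for illustration
pmf_total = sum(binomial_pmf(k, n, p) for k in range(n + 1))

print(f"P(X = 4)  = {binomial_pmf(4, n, p):.4f}")
print(f"mean      = {binomial_mean(n, p)}")           # 5.0
print(f"sd        = {binomial_sd(n, p):.4f}")
print(f"pmf total = {pmf_total:.6f}")                 # 1.000000
print(success_failure_ok(n, p), success_failure_ok(100, p))  # False True
```

With $n = 20$ the expected number of successes is only 5, so the Success/Failure Condition fails; raising $n$ to 100 satisfies it.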
Sampling Distributions and Central Limit Theorem
Sampling Distribution
The sampling distribution describes the distribution of a statistic (e.g., sample mean) over all possible samples from the same population.
Sampling Distribution: Distribution of statistics over all possible samples.
Sampling Distribution Model: Practical model for theoretical sampling distribution.
Sampling Error: Variation from sample to sample.
Central Limit Theorem (CLT)
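Statement: For large $n$, the sampling distribution of the sample mean is approximately normal, regardless of population distribution, if observations are independent.
Sampling Distribution Model for Mean: If independence and random sampling are met, and $n$ is large, the sample mean $\bar{y}$ follows approximately a Normal model with mean $\mu$ and standard deviation $\frac{\sigma}{\sqrt{n}}$, written $N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$.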
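A simulation can make the CLT concrete. The sketch below assumes a strongly right-skewed population (exponential with mean 1 and SD 1) and draws 2,000 samples of size 40; both the population and the sizes are illustrative choices.

```python
import random
import statistics
from math import sqrt

rng = random.Random(1)  # fixed seed so the sketch is reproducible

def sample_mean(n, draw):
    """Mean of n independent draws from the population."""
    return sum(draw() for _ in range(n)) / n

n = 40            # sample size
n_samples = 2000  # number of repeated samples

# Exponential population with rate 1: mean 1, SD 1, strongly right-skewed.
means = [sample_mean(n, lambda: rng.expovariate(1.0)) for _ in range(n_samples)]

center = sum(means) / len(means)
spread = statistics.pstdev(means)

print(f"mean of sample means: {center:.3f} (model predicts 1.000)")
print(f"SD of sample means:   {spread:.3f} (model predicts {1 / sqrt(n):.3f})")
```

A histogram of `means` would look roughly bell-shaped even though individual draws from the population are far from normal.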
Example: Battery Life
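Population Mean: $\mu$ hours
Population SD: $\sigma$ hours
Sample Size: $n$
Mean of Sample Mean: $\mu(\bar{y}) = \mu$
SD of Sample Mean: $SD(\bar{y}) = \frac{\sigma}{\sqrt{n}}$
Model: $N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$
Interval for 99.7%: $\mu \pm 3\frac{\sigma}{\sqrt{n}}$ (by the 68-95-99.7 Rule)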
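The arithmetic for a sampling distribution model of the mean can be sketched as below. The inputs (mean 100 hours, SD 15 hours, sample size 36) are hypothetical values chosen for easy arithmetic; they are not the figures from the battery example.

```python
from math import sqrt

def sample_mean_model(mu, sigma, n):
    """Normal model for the sample mean: center mu, SD sigma / sqrt(n),
    plus the interval expected to hold about 99.7% of sample means."""
    sd = sigma / sqrt(n)
    return {
        "mean": mu,
        "sd": sd,
        "interval_99_7": (mu - 3 * sd, mu + 3 * sd),
    }

# Hypothetical inputs, for illustration only.
model = sample_mean_model(mu=100, sigma=15, n=36)
print(model)  # {'mean': 100, 'sd': 2.5, 'interval_99_7': (92.5, 107.5)}
```

Quadrupling the sample size halves the standard deviation of the sample mean, which narrows the 99.7% interval by the same factor.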
Worked Examples and Applications
Probability Table Example
X | 3 | 5 | 6 | 8 | 10 |
|---|---|---|---|---|---|
P(X=x) | 0.2 | 0.1 | 0.3 | 0.3 | 0.1 |
Note: The sum of probabilities must equal 1.
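A quick check that the table above is a valid probability model, plus one event probability computed from it. The helper names are illustrative, and the event $X \ge 6$ is an arbitrary example.

```python
from math import isclose

# Probability model from the table above.
dist = {3: 0.2, 5: 0.1, 6: 0.3, 8: 0.3, 10: 0.1}

def is_valid_distribution(dist):
    """Each probability lies in [0, 1] and the probabilities sum to 1."""
    in_range = all(0 <= p <= 1 for p in dist.values())
    return in_range and isclose(sum(dist.values()), 1.0)

def prob(dist, condition):
    """Probability of the event made up of the values satisfying `condition`."""
    return sum(p for x, p in dist.items() if condition(x))

valid = is_valid_distribution(dist)
p_at_least_6 = prob(dist, lambda x: x >= 6)  # 0.3 + 0.3 + 0.1

print(valid)                   # True
print(round(p_at_least_6, 4))  # 0.7
```

`isclose` is used instead of `==` because sums of decimal fractions are subject to floating-point rounding.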
Binomial Example: Defective Reams
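Given: number of reams $n$, probability $p$ that a ream is defective ($q = 1 - p$)
Probability of exactly 4 defective reams: $P(X = 4) = \binom{n}{4} p^4 q^{n-4}$
Mean: $\mu = np$
SD: $\sigma = \sqrt{npq}$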
Conditional Probability Example: Animal Shelter
Sex | Cat | Dog | Total |
|---|---|---|---|
Male | 6 | 8 | 14 |
Female | 12 | 16 | 28 |
Total | 18 | 24 | 42 |
P(Male | Cat): 6/18 ≈ 0.333
P(Cat | Female): 12/28 ≈ 0.429
P(Female | Dog): 16/24 ≈ 0.667
Independence Example
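Definition 1: $P(A \cap B) = P(A) \times P(B)$
Definition 2: $P(B \mid A) = P(B)$
Application: For the animal shelter, being male and being a dog are independent because both definitions are satisfied: $P(\text{Male} \cap \text{Dog}) = 8/42 = 4/21 = (14/42)(24/42) = P(\text{Male}) \times P(\text{Dog})$, and $P(\text{Dog} \mid \text{Male}) = 8/14 = 4/7 = P(\text{Dog})$.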
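The animal shelter calculations, both the conditional probabilities and the independence check, can be reproduced from the table counts. This is a sketch; the helper names `P` and `P_given` are illustrative.

```python
from fractions import Fraction

# Counts from the animal shelter table: (sex, species) -> count.
counts = {
    ("Male", "Cat"): 6, ("Male", "Dog"): 8,
    ("Female", "Cat"): 12, ("Female", "Dog"): 16,
}
total = sum(counts.values())  # 42

def P(sex=None, species=None):
    """Probability of the event described; None means 'any value'."""
    matching = sum(
        c for (s, a), c in counts.items()
        if sex in (None, s) and species in (None, a)
    )
    return Fraction(matching, total)

def P_given(target, given):
    """P(target | given) = P(target and given) / P(given)."""
    return P(**{**given, **target}) / P(**given)

p_male_given_cat = P_given({"sex": "Male"}, {"species": "Cat"})      # 6/18
p_cat_given_female = P_given({"species": "Cat"}, {"sex": "Female"})  # 12/28
p_female_given_dog = P_given({"sex": "Female"}, {"species": "Dog"})  # 16/24

# Independence of "Male" and "Dog" under both definitions.
definition_1 = P(sex="Male", species="Dog") == P(sex="Male") * P(species="Dog")
definition_2 = P_given({"species": "Dog"}, {"sex": "Male"}) == P(species="Dog")

print(p_male_given_cat, p_cat_given_female, p_female_given_dog)  # 1/3 3/7 2/3
print(definition_1, definition_2)                                # True True
```

Exact fractions matter here: with floating-point arithmetic the equality tests for independence could fail because of rounding.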
Tree Diagram Example: Drunk Driving Checkpoint
P(Drink): 0.12
P(Not Drink): 0.88
P(Detain | Drink): 0.8
P(Detain | Not Drink): 0.2
P(Detain): (0.12)(0.8) + (0.88)(0.2) = 0.096 + 0.176 = 0.272
P(Drink | Detain): 0.096 / 0.272 ≈ 0.353
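The two branches of the checkpoint tree can be combined in code. This sketch uses the probabilities given above; the function names are illustrative.

```python
def total_probability(prior, p_if_true, p_if_false):
    """Sum over both branches of the tree:
    P(E) = P(H) P(E | H) + P(not H) P(E | not H)."""
    return prior * p_if_true + (1 - prior) * p_if_false

def posterior(prior, p_if_true, p_if_false):
    """Bayes Rule: P(H | E) = P(H) P(E | H) / P(E)."""
    return prior * p_if_true / total_probability(prior, p_if_true, p_if_false)

p_drink = 0.12
p_detain_given_drink = 0.8
p_detain_given_not_drink = 0.2

p_detain = total_probability(p_drink, p_detain_given_drink, p_detain_given_not_drink)
p_drink_given_detain = posterior(p_drink, p_detain_given_drink, p_detain_given_not_drink)

print(f"P(Detain)         = {p_detain:.3f}")              # 0.272
print(f"P(Drink | Detain) = {p_drink_given_detain:.3f}")  # 0.353
```

Even though a drinking driver is detained 80% of the time, only about 35% of detained drivers have been drinking, because non-drinkers make up 88% of all drivers.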
Summary Table: Sampling Methods
Sampling Method | Description |
|---|---|
Simple Random Sample | Every possible sample of size n has an equal chance of selection |
Stratified Sample | Population divided into strata, random samples from each |
Cluster Sample | Entire groups chosen at random |
Multistage Sample | Combines several methods |
Convenience Sample | Individuals chosen based on ease of access |
Summary Table: Types of Bias
Bias Type | Description |
|---|---|
Voluntary Response Bias | Individuals choose to participate |
Nonresponse Bias | Sampled individuals fail to respond |
Response Bias | Survey design influences responses |
Undercoverage | Some portion of the population is not sampled or is underrepresented |
Summary Table: Types of Experimental Designs
Design Type | Description |
|---|---|
Completely Randomized Design (CRD) | All units have equal chance of any treatment |
Randomized Block Design (RBD) | Random assignment within blocks |
Matched Pair Design | Pairs of similar subjects; treatments are randomly assigned within each pair |
Summary Table: Probability Rules
Rule | Formula |
|---|---|
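Complement | $P(A^C) = 1 - P(A)$ |
Addition (Disjoint) | $P(A \cup B) = P(A) + P(B)$ |
Multiplication (Independent) | $P(A \cap B) = P(A) \times P(B)$ |
General Addition | $P(A \cup B) = P(A) + P(B) - P(A \cap B)$ |
Conditional Probability | $P(B \mid A) = \frac{P(A \cap B)}{P(A)}$ |
General Multiplication | $P(A \cap B) = P(A) \times P(B \mid A)$ |
Bayes Rule | $P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A \mid B)\,P(B) + P(A \mid B^C)\,P(B^C)}$ |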
Summary Table: Binomial Distribution
Parameter | Formula |
|---|---|
Probability of k successes | $P(X = k) = \binom{n}{k} p^k q^{n-k}$, where $q = 1 - p$ |
Mean | $\mu = np$ |
Standard Deviation | $\sigma = \sqrt{npq}$ |
Summary Table: Sampling Distribution of the Mean
Parameter | Formula |
|---|---|
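Mean | $\mu(\bar{y}) = \mu$ |
Standard Deviation | $SD(\bar{y}) = \frac{\sigma}{\sqrt{n}}$ |
Model | $N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$ |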