Sampling Methods and Experimental Design in Statistics

Study Guide - Smart Notes

Introduction

When analyzing sample data, it is essential to use an appropriate method for collecting those data. This section covers key concepts in sampling methods, experimental design, and types of observational studies, which are foundational for statistical inference.

Placebos and Experimental Design

Definition of Placebo

Placebo: A harmless and ineffective pill, medicine, or procedure sometimes used for psychological benefit or as a control in experiments. Placebos are used by researchers for comparison to other treatments.

Example: The Salk Vaccine Experiment

In 1954, an experiment was conducted to test the effectiveness of the Salk vaccine in preventing polio.
Design: 401,974 children were randomly assigned to two groups:
- Treatment group: 200,745 children received the Salk vaccine.
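- Placebo group: 201,229 children received a placebo (an injection containing no active vaccine).
Assignment was random, which makes the two groups comparable on average, so a difference in outcomes can be attributed to the vaccine rather than to pre-existing differences between the groups.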
Results: 33 children in the vaccine group and 115 in the placebo group developed paralytic polio.

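Conclusion: The experiment demonstrated the effectiveness of the Salk vaccine. Because the two groups were nearly the same size, the counts can be compared directly: about 16 cases per 100,000 children in the vaccine group versus about 57 per 100,000 in the placebo group.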
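
The comparison of the two groups can be checked with a few lines of arithmetic. The sketch below uses only the counts given in these notes; the function name rate_per_100k and the choice to report a relative risk are illustrative assumptions, not part of the original study write-up.

```python
# Counts reported in these notes for the 1954 Salk vaccine experiment.
vaccine_n, vaccine_cases = 200_745, 33
placebo_n, placebo_cases = 201_229, 115

def rate_per_100k(cases, n):
    """Paralytic polio cases per 100,000 children (illustrative helper)."""
    return cases / n * 100_000

vaccine_rate = rate_per_100k(vaccine_cases, vaccine_n)
placebo_rate = rate_per_100k(placebo_cases, placebo_n)

# Relative risk: how the vaccine-group rate compares with the placebo-group rate.
relative_risk = vaccine_rate / placebo_rate

print(f"vaccine group: {vaccine_rate:.1f} cases per 100,000")
print(f"placebo group: {placebo_rate:.1f} cases per 100,000")
print(f"relative risk: {relative_risk:.2f}")
```

A relative risk well below 1 means the vaccine group had a much lower rate of paralytic polio than the placebo group.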

Key Terms in Experiments

Experiment: A study in which a treatment is applied to individuals (called experimental units or subjects), and the effects are observed.
Observational Study: A study where characteristics are observed and measured without assigning treatments.

Observational Studies vs. Experiments

Example: Ice Cream and Drownings

Observational Study: Past data may show a correlation between ice cream sales and drownings, but both rise with a lurking variable (temperature). Ice cream does not cause drownings.
Experiment: Randomly assigning one group to eat ice cream and another not to, then comparing drowning rates, would break the link with temperature and show no causal effect.

Key Point: Experiments can establish causation, while observational studies can only suggest associations.
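
The ice cream example can be imitated with a small simulation. This is a sketch built on invented numbers: the temperature range, the linear models, and the noise levels are all assumptions chosen only to show the pattern, not real data.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)
days = 1000
temps = [rng.uniform(0, 35) for _ in range(days)]  # lurking variable (invented)

# Drowning index depends on temperature only, never on ice cream (invented model).
drownings = [0.1 * t + rng.gauss(0, 0.5) for t in temps]

# Observational data: ice cream sales also rise with temperature.
sales_observed = [20 + 3 * t + rng.gauss(0, 10) for t in temps]

# Experiment: ice cream is assigned by coin flip, independent of temperature.
sales_assigned = [rng.choice([0, 100]) for _ in range(days)]

r_observed = pearson(sales_observed, drownings)
r_experiment = pearson(sales_assigned, drownings)

print(f"observational correlation: {r_observed:.2f}")
print(f"experimental correlation:  {r_experiment:.2f}")
```

In the observational data the correlation is strong even though ice cream has no effect in the model. Once ice cream is assigned at random, the correlation falls to roughly zero.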

Principles of Experimental Design

Key Elements

Replication: Applying each treatment to more than one individual, with enough subjects that a real treatment effect can be distinguished from chance variation between individuals.
Blinding: Keeping subjects (and sometimes experimenters) unaware of which treatment is being administered to prevent bias. Double-blind means both subjects and experimenters are unaware.
Randomization: Assigning subjects to groups through a process of random selection, so that the groups are comparable and selection bias is reduced.
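
Random assignment is simple to carry out in code. The following is a minimal sketch, assuming a hypothetical list of 20 subject IDs and two groups of equal size; the function name randomize and the fixed seed are illustrative choices.

```python
import random

def randomize(subjects, seed=None):
    """Shuffle the subjects and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)      # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical subject IDs S01 ... S20.
subjects = [f"S{i:02d}" for i in range(1, 21)]
treatment, placebo = randomize(subjects, seed=42)

print("treatment group:", sorted(treatment))
print("placebo group:  ", sorted(placebo))
```

Because chance alone decides who lands in which group, neither the researcher nor the subjects can steer healthier or more motivated people toward one treatment.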

Sampling Methods

Simple Random Sample

Every possible sample of the same size has an equal chance of being selected.
Often considered the gold standard for sampling, but can be difficult to implement in practice.
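
To make the definition concrete, the sketch below lists every possible sample of size 2 from a hypothetical five-person population and then draws one of them at random. The names and the sample size are made up for illustration.

```python
import random
from itertools import combinations
from math import comb

population = ["Ana", "Ben", "Cam", "Dee", "Eli"]   # hypothetical population
n = 2

# Every possible sample of size n; a simple random sample gives each
# of these the same chance of being the one selected.
all_samples = list(combinations(population, n))
print(f"{len(all_samples)} possible samples of size {n}")

# random.sample draws n distinct members without replacement.
rng = random.Random(1)
sample = rng.sample(population, n)
print("selected:", sample)
```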

Other Sampling Methods

Systematic Sampling: Select a starting point (ideally at random) and then every kth member (e.g., every 50th student).
Convenience Sampling: Use data that are easy to obtain, but this method is prone to bias.
Stratified Sampling: Divide the population into subgroups (strata) whose members share a characteristic, then draw a random sample from each stratum.
Cluster Sampling: Divide the population into clusters, randomly select some clusters, and include all members from those clusters.

Comparison: Stratified vs. Cluster Sampling

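Stratified Sampling: Ensures representation from each subgroup; only some members of every stratum are sampled.
Cluster Sampling: More practical when the population is large or spread out, because only the selected clusters need to be reached; every member of a selected cluster is sampled.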

Sampling Example

Suppose you want to sample six students from your statistics class:

Simple random sample: Randomly select any six students.
Systematic sample: Select every kth student from the class list, choosing k so that six students are selected.
Stratified sample: Divide by gender, then randomly select from each group.
Cluster sample: Divide class into groups, randomly select a group, and sample all its members.
Convenience sample: Choose the six students who are easiest to reach.
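
The remaining methods from the example can be sketched in code as well. Everything specific here is an assumption made for illustration: a class of 30 students, two strata called group_a and group_b (standing in for any characteristic used to stratify), and clusters made of five table groups of six.

```python
import random

rng = random.Random(7)
roster = [f"student_{i:02d}" for i in range(1, 31)]   # hypothetical class of 30

# Systematic: random start among the first k students, then every kth student.
k = len(roster) // 6                 # k = 5 yields six students
start = rng.randrange(k)
systematic = roster[start::k]

# Stratified: split into strata, then randomly select three from each stratum.
strata = {"group_a": roster[:18], "group_b": roster[18:]}   # hypothetical strata
stratified = [s for members in strata.values() for s in rng.sample(members, 3)]

# Cluster: divide the class into table groups, pick one group, take everyone in it.
clusters = [roster[i:i + 6] for i in range(0, len(roster), 6)]
cluster_sample = rng.choice(clusters)

# Convenience: the six students who are easiest to reach (here, first on the list).
convenience = roster[:6]

for name, chosen in [("systematic", systematic), ("stratified", stratified),
                     ("cluster", cluster_sample), ("convenience", convenience)]:
    print(f"{name:12s} {chosen}")
```

Note the contrast between the stratified and cluster samples: the stratified sample takes a few students from every stratum, while the cluster sample takes every student from a single cluster.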

Types of Observational Studies

Definitions

Cross-sectional study: Data are collected at one point in time.
Retrospective (case-control) study: Data are collected from the past (e.g., records, interviews).
Prospective (longitudinal or cohort) study: Data are collected in the future from groups sharing common factors (cohorts).

Examples of Observational Studies

Nurses' Health Study: Ongoing prospective study of registered nurses.
Heart Health Study: Retrospective study comparing subjects with and without heart disease.
Marijuana Study: Cross-sectional survey of marijuana use in states with legalized use.
Framingham Heart Study: Ongoing prospective study focused on heart disease.

Confounding in Experiments

Confounding: Occurs when the effect of one variable cannot be separated from the effect of another, making it difficult to determine causality.

Types of Experimental Designs

Design Type	Description	Example
Completely Randomized	Subjects are randomly assigned to treatment groups.	Randomly assign students to receive a new teaching method or the standard method.
Randomized Block	Subjects are divided into blocks (groups) based on a characteristic, then randomly assigned treatments within each block.	Block by gender, then randomly assign treatments within each gender group.
Matched Pairs	Subjects are paired based on similarities (or each subject is measured twice and serves as their own pair), and the two members of each pair receive different treatments or conditions.	Measure blood pressure before and after a treatment in the same individuals.
Rigorously Controlled	Subjects are carefully assigned to groups to match important characteristics.	Assign subjects so that age, gender, and other factors are balanced between groups.
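
A randomized block design can be sketched as "shuffle within each block, then deal out the treatments." The example below is an illustration under stated assumptions: eight hypothetical subjects, a block label stored with each subject, and two treatments. The function name block_randomize is made up for this sketch.

```python
import random
from collections import defaultdict

def block_randomize(subjects, block_of, treatments, seed=None):
    """Randomly assign treatments separately within each block."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for subject in subjects:
        blocks[block_of(subject)].append(subject)

    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)                       # random order within the block
        for i, subject in enumerate(members):
            # Deal treatments in rotation so each block is split evenly.
            assignment[subject] = treatments[i % len(treatments)]
    return assignment

# Hypothetical subjects as (id, block label) pairs.
subjects = [("S1", "F"), ("S2", "F"), ("S3", "F"), ("S4", "F"),
            ("S5", "M"), ("S6", "M"), ("S7", "M"), ("S8", "M")]
treatments = ["new method", "standard method"]

assignment = block_randomize(subjects, block_of=lambda s: s[1],
                             treatments=treatments, seed=5)
for subject, treatment in sorted(assignment.items()):
    print(subject, "->", treatment)
```

A completely randomized design is the special case in which every subject belongs to one block.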

Sampling Errors and Bias

Definitions

Sampling error (random sampling error): The difference between a sample result and the true population result due to chance fluctuations.
Nonsampling error: Errors from human mistakes, such as data entry errors, biased questions, or inappropriate statistical methods.
Nonrandom sampling error: Errors from using a nonrandom sampling method, such as convenience sampling or voluntary response samples.
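
The difference between sampling error and nonrandom sampling error shows up clearly in a simulation. The sketch below invents a population of 10,000 yes/no answers in which the easiest people to reach answer "yes" far more often than everyone else. All of the numbers are assumptions chosen for illustration.

```python
import random

rng = random.Random(3)

# Hypothetical population (1 = yes, 0 = no) with true proportion p = 0.50.
easy_to_reach = [1] * 800 + [0] * 200       # 80% yes among the easy to reach
everyone_else = [1] * 4200 + [0] * 4800
rng.shuffle(easy_to_reach)
population = easy_to_reach + everyone_else
p = sum(population) / len(population)

def proportion(sample):
    """Proportion of 1s in a sample."""
    return sum(sample) / len(sample)

# Sampling error: 200 simple random samples of size 100.
# Individual errors vary by chance but average out close to zero.
errors = [proportion(rng.sample(population, 100)) - p for _ in range(200)]
mean_random_error = sum(errors) / len(errors)

# Nonrandom sampling error: a convenience sample of the first 100
# easy-to-reach people is off in one direction.
convenience_error = proportion(population[:100]) - p

print(f"true proportion:                 {p:.2f}")
print(f"average error of random samples: {mean_random_error:+.3f}")
print(f"error of convenience sample:     {convenience_error:+.3f}")
```

Taking a larger random sample shrinks sampling error, but no sample size repairs the bias built into the convenience sample.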

Key Formulas

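Sample Mean: $\bar{x} = \frac{\sum x}{n}$, where the sum runs over the $n$ sample values.
Sample Proportion: $\hat{p} = \frac{x}{n}$, where $x$ is the number of sample members with the characteristic of interest and $n$ is the sample size.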
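Sampling Error (for proportions): $\hat{p} - p$, the difference between the sample proportion $\hat{p}$ and the population proportion $p$, following the definition of sampling error above.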

Additional info: The above formulas are fundamental for analyzing sample data and estimating population parameters.
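
A short worked example of the three quantities named above, using made-up numbers. The sampling error is computed as sample proportion minus population proportion, which is an assumption based on the definition of sampling error given in these notes. In practice the population proportion is usually unknown, so it is simply assumed here.

```python
# Sample mean: add the sample values and divide by how many there are.
values = [4, 8, 6, 5, 7]                 # hypothetical sample values
x_bar = sum(values) / len(values)

# Sample proportion: number with the characteristic divided by sample size.
successes, n = 54, 120                   # hypothetical survey counts
p_hat = successes / n

# Sampling error for a proportion: sample result minus population value.
p = 0.50                                 # assumed population proportion
sampling_error = p_hat - p

print(f"sample mean       = {x_bar}")
print(f"sample proportion = {p_hat:.3f}")
print(f"sampling error    = {sampling_error:+.3f}")
```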