Experiments and Observational Studies: Design, Comparison, and Interpretation

Chapter 10: Experiments and Observational Studies

Observational Studies vs. Experiments

In statistics, understanding the distinction between observational studies and experiments is crucial for interpreting results and establishing causality. Both study types are foundational for statistical inference, but they differ in design, control, and the strength of conclusions that can be drawn.

Observational Study: Researchers observe subjects and measure variables of interest without assigning treatments or interventions. The goal is to identify associations between variables.
Experiment: Researchers actively assign treatments to subjects to study the effect of interventions on a response variable. This design allows for stronger causal inferences.

Types of Observational Studies

Retrospective Study: Subjects are identified first, and data on their past conditions, behaviors, or outcomes are then collected from existing records or recall. Example: Identifying students who did or did not take music lessons and then retrieving their past school marks.
Prospective Study: Subjects are identified in advance and followed over time so that outcomes are recorded as they occur. Example: Following students over several years to see whether taking music lessons is associated with their later academic performance.

Key Terms and Concepts

Explanatory Variable (Factor): The variable thought to influence the response variable. In an experiment it is manipulated by the researcher; in an observational study it is only observed or used to group subjects (e.g., music learning, drug type).
Response Variable: The outcome measured in the study (e.g., school performance, blood pressure).
Control Group: The group that does not receive the treatment or receives a standard treatment, serving as a baseline for comparison.

Establishing Causality: Association vs. Causation

Observational studies can reveal associations but cannot establish causation due to potential confounding variables. Experiments, especially those with random assignment, can provide evidence for causal relationships.

Confounding Variable: An extraneous variable that is related to both the explanatory and response variables, making it difficult to separate their effects.
Lurking Variable: A variable not included in the study that influences both the explanatory and response variables, creating a spurious association.

Example Table: Confounding in Observational Studies

Characteristic	Students taking music lessons	Students not taking music lessons
Parents' income level	Higher	Lower
% having a private tutor	Higher	Lower
% coming from single-parent families	Lower	Higher

Additional info: This table illustrates how confounding variables (like socioeconomic status) can bias the observed association between music lessons and school performance.
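
The effect of a confounder can be made concrete with a small sketch. The records below are entirely made up for illustration (they are not data from any study), and grades are constructed to depend only on income level, so music lessons have no effect by design. The names build_students and mean_grade are hypothetical helpers.

```python
# Hypothetical, made-up records: (income_level, takes_music_lessons, grade).
# By construction, grade depends ONLY on income; lessons have no effect.
def build_students():
    students = []
    students += [("high", True, 80)] * 80   # high income, lessons
    students += [("high", False, 80)] * 20  # high income, no lessons
    students += [("low", True, 70)] * 20    # low income, lessons
    students += [("low", False, 70)] * 80   # low income, no lessons
    return students


def mean_grade(students, lessons, income=None):
    """Mean grade for one lessons group, optionally within one income level."""
    grades = [grade for inc, les, grade in students
              if les == lessons and (income is None or inc == income)]
    return sum(grades) / len(grades)


students = build_students()

# Naive comparison that ignores income: lessons appear to "help".
naive_gap = mean_grade(students, True) - mean_grade(students, False)

# Comparison within each income level: the gap disappears.
gap_by_income = {
    inc: mean_grade(students, True, inc) - mean_grade(students, False, inc)
    for inc in ("high", "low")
}

print(naive_gap)      # 6.0
print(gap_by_income)  # {'high': 0.0, 'low': 0.0}
```

The 6-point gap in the naive comparison comes entirely from the unequal mix of income levels in the two groups, which is the kind of bias the table describes.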

Randomized, Comparative Experiments

Randomized experiments are designed to assess the effect of one or more factors on a response variable by randomly assigning treatments to subjects. Because chance alone decides who receives which treatment, known and unknown confounding variables tend to be balanced across the treatment groups on average, which supports causal inference.

Factor: The explanatory variable manipulated in the experiment (e.g., drug type).
Level: A specific value or category of a factor (e.g., new drug, existing drug).
Subjects/Experimental Units: The individuals or items receiving treatments.

Principles of Experimental Design

Randomization: Randomly assign treatments to experimental units to average out the effects of extraneous variables.
Replication: Use enough subjects in each treatment group to ensure reliable results. Replicating the experiment in different settings increases generalizability.
Blocking: Group experimental units by a variable (block) that may affect the response, then randomize treatments within each block. Example: Blocking by gender in a drug trial to control for gender effects.

Diagram: Randomized Experiment Example

Hypertensive patients (100 patients)
Randomization
Treatment 1: New drug group (50 patients)	Treatment 2: Existing drug group (50 patients, control group)
Compare reduction in blood pressure between treatment groups
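
The randomization step in the diagram can be sketched in a few lines. This is a minimal illustration, assuming patients are identified by the numbers 1 to 100; the function name randomize_two_groups and the seed value are hypothetical choices, and the seed is used only to make the example reproducible.

```python
import random


def randomize_two_groups(subject_ids, seed=None):
    """Shuffle the subjects and split them into two equal-sized groups."""
    rng = random.Random(seed)   # local generator; seed is for reproducibility
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "new_drug": shuffled[:half],        # Treatment 1
        "existing_drug": shuffled[half:],   # Treatment 2 (control group)
    }


patients = list(range(1, 101))  # hypothetical patient IDs 1..100
groups = randomize_two_groups(patients, seed=2024)

print(len(groups["new_drug"]), len(groups["existing_drug"]))  # 50 50
```

Every patient has the same chance of landing in either group, so neither the researcher nor the patient influences the assignment.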

Types of Experimental Designs

Completely Randomized Design: Subjects are randomly assigned to treatment groups without blocking.
Randomized Block Design: Subjects are first grouped into blocks (e.g., by gender), then randomized to treatments within each block.
Matched Pairs Design: Subjects are paired on relevant characteristics (e.g., age), and the two members of each pair are randomly assigned to different treatments. This is a special case of blocking in which each block contains two subjects.
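
The difference between these designs lies in where the randomization happens. The sketch below shows one possible way to randomize within blocks; the subject records, the treatment labels, and the function name randomized_block_assignment are assumptions made for illustration. A matched pairs design falls out as the case where each block holds exactly two subjects.

```python
import random
from collections import defaultdict


def randomized_block_assignment(subjects, block_key, treatments, seed=None):
    """Shuffle subjects within each block, then deal treatments out in turn."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for subject in subjects:
        blocks[block_key(subject)].append(subject)

    assignment = {}
    for block in sorted(blocks):
        members = blocks[block]
        rng.shuffle(members)  # randomization happens inside the block
        for i, subject in enumerate(members):
            # Dealing in turn keeps treatment counts balanced per block
            # (exactly equal when the block size is a multiple of the
            # number of treatments).
            assignment[subject] = treatments[i % len(treatments)]
    return assignment


treatments = ["new_drug", "existing_drug"]

# Randomized block design: hypothetical subjects (id, gender), blocked by gender.
subjects = [(i, "F" if i % 2 else "M") for i in range(1, 21)]
by_gender = randomized_block_assignment(
    subjects, block_key=lambda s: s[1], treatments=treatments, seed=1)

# Matched pairs: hypothetical subjects (id, pair_id), two subjects per pair.
paired = [(i, i // 2) for i in range(10)]
by_pair = randomized_block_assignment(
    paired, block_key=lambda s: s[1], treatments=treatments, seed=1)
```

A completely randomized design corresponds to putting every subject in a single block, for example by passing a block_key that returns the same value for all subjects.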

Blinding and Placebo

Blinding: Concealing the treatment assignment from subjects, evaluators, or both to prevent bias.
Single-Blind: Either the subjects or the evaluators are unaware of the treatment assignments.
Double-Blind: Both subjects and evaluators are unaware of the treatment assignments.
Placebo: An inert treatment given to the control group to mimic the experience of the treatment group and control for the placebo effect.
Placebo Effect: A change in the response variable due to subjects' belief in the treatment rather than the treatment itself.

Statistical Significance and Decision Making

When analyzing experimental results, it is important to determine whether observed differences between treatment groups could be due to chance or reflect a true effect of the treatment. If the difference is larger than would plausibly arise from the random assignment alone, it is considered statistically significant.

Statistical Significance: Evidence that the observed effect is unlikely to be due to chance alone.
Boxplots: Useful for comparing the distribution of quantitative response variables across treatment groups, considering both center and spread.
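
One way to judge whether a difference is larger than chance would produce is a permutation (randomization) test: reshuffle the group labels many times and count how often the reshuffled difference in means is at least as large as the observed one. This is a sketch of that general technique, not a method prescribed by these notes; the blood pressure reductions below are made-up numbers, and the function name and default arguments are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_perm=2000, seed=0):
    """Approximate two-sided p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # re-deal the responses to the two groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            at_least_as_extreme += 1

    # Count the observed arrangement itself so the p-value is never zero.
    return (at_least_as_extreme + 1) / (n_perm + 1)


# Hypothetical reductions in blood pressure (mmHg); illustrative values only.
new_drug = [12, 15, 11, 14, 13, 16, 12, 15]
existing_drug = [8, 9, 7, 10, 6, 9, 8, 7]

p_value = permutation_test(new_drug, existing_drug)
print(p_value < 0.05)  # True
```

A small p-value means that random assignment alone would rarely produce a difference this large, which is the sense of "statistically significant" used above.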

Example: Boxplot Comparison

Suppose two boxplots compare final grades for web-based and traditional teaching methods. A large difference in medians with little or no overlap between the boxes (the interquartile ranges) suggests a real effect, while substantial overlap suggests the difference may be due to chance. This visual check is informal and does not replace a formal test of significance.
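
The numbers behind such a comparison can be computed directly. The sketch below builds the five-number summary that a boxplot displays and checks whether the two boxes overlap; the grades are hypothetical, the helper names are assumptions, and quartile conventions differ between textbooks and software, so the quartiles here follow the default method of Python's statistics.quantiles.

```python
import statistics


def five_number_summary(values):
    """Minimum, quartiles, and maximum, as displayed by a boxplot."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}


def boxes_overlap(summary_a, summary_b):
    """True if the interquartile boxes [q1, q3] share any values."""
    return (summary_a["q1"] <= summary_b["q3"]
            and summary_b["q1"] <= summary_a["q3"])


# Hypothetical final grades; illustrative values only.
web_based = [78, 82, 85, 88, 90, 91, 93]
traditional = [60, 64, 66, 70, 72, 75, 77]

web_summary = five_number_summary(web_based)
trad_summary = five_number_summary(traditional)

print(web_summary["median"] - trad_summary["median"])
print(boxes_overlap(web_summary, trad_summary))  # False
```

Non-overlapping boxes are only a rough signal; a formal test is still needed before calling the difference statistically significant.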

Confounding and Lurking Variables

Lurking Variable: A variable not included in the study that affects both the explanatory and response variables, creating a spurious association.
Confounding Variable: A variable that is related to both the explanatory and response variables, making it impossible to distinguish their separate effects on the response.

Example Table: Lurking vs. Confounding Variables

Type	Definition	Example
Lurking Variable	Not measured in the study; affects both explanatory and response variables	Size of fire affects both number of firefighters and amount of damage
Confounding Variable	Measured or unmeasured; associated with both explanatory and response variables, making effects indistinguishable	Private tutoring confounds the relationship between music lessons and school performance

Additional info: Understanding the difference between lurking and confounding variables is essential for proper study design and interpretation of results.