Sampling Methods and Experimental Design in Statistics
Study Guide - Smart Notes
Sampling Methods and Experimental Design
Introduction
When analyzing sample data, it is essential to use an appropriate method for collecting those data. This section covers key concepts in sampling methods, experimental design, and types of observational studies, which are foundational for statistical inference.
Placebos and Experimental Design
Definition of Placebo
Placebo: A harmless and ineffective pill, medicine, or procedure sometimes used for psychological benefit or as a control in experiments. Placebos are used by researchers for comparison to other treatments.
Example: The Salk Vaccine Experiment
In 1954, an experiment was conducted to test the effectiveness of the Salk vaccine in preventing polio.
Design: 401,974 children were randomly assigned to two groups:
Treatment group: 200,745 children received the Salk vaccine.
Placebo group: 201,229 children received a placebo (no active drug).
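Children were assigned to the two groups at random. Random assignment makes the groups similar on average, so a difference in outcomes can be attributed to the vaccine rather than to pre-existing differences between the groups.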
Results: 33 children in the vaccine group and 115 in the placebo group developed paralytic polio.
Conclusion: The experiment demonstrated the effectiveness of the Salk vaccine. Paralytic polio occurred at a rate of about 16 per 100,000 children in the treatment group, compared with about 57 per 100,000 in the placebo group.
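The comparison between groups can be checked with a few lines of arithmetic. The sketch below uses only the counts given above; the helper name `rate_per_100k` is an illustrative choice, not part of the original study.

```python
# Counts taken from the Salk vaccine experiment described above.
vaccine_n, vaccine_cases = 200_745, 33
placebo_n, placebo_cases = 201_229, 115

def rate_per_100k(cases, n):
    """Return the number of cases per 100,000 subjects."""
    return cases / n * 100_000

vaccine_rate = rate_per_100k(vaccine_cases, vaccine_n)
placebo_rate = rate_per_100k(placebo_cases, placebo_n)

# Ratio of the two rates: values well below 1 favor the vaccine.
relative_risk = vaccine_rate / placebo_rate

print(f"Vaccine group: {vaccine_rate:.1f} cases per 100,000")
print(f"Placebo group: {placebo_rate:.1f} cases per 100,000")
print(f"Relative risk (vaccine / placebo): {relative_risk:.2f}")
```

Because the two groups are nearly the same size, comparing raw counts (33 vs. 115) gives the same impression, but rates are the safer habit when group sizes differ.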
Key Terms in Experiments
Experiment: A study in which a treatment is applied to individuals (called experimental units or subjects), and the effects are observed.
Observational Study: A study where characteristics are observed and measured without assigning treatments.
Observational Studies vs. Experiments
Example: Ice Cream and Drownings
Observational Study: Observing past data may show a correlation between ice cream sales and drownings, but this is due to a lurking variable (temperature), not causation.
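Experiment: If one group were randomly assigned to eat ice cream and another group were not, and drowning rates were then compared, no causal effect would be expected, because random assignment balances temperature and other lurking variables across the groups.
Key Point: A well-designed experiment with random assignment can establish causation, while an observational study can only show association.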
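A small simulation can make the lurking-variable idea concrete. The sketch below uses entirely made-up numbers: temperature drives both ice cream sales and a drowning index, and neither affects the other. The coefficients, noise levels, and helper names (`pearson_r`, `residuals`) are assumptions chosen for illustration only.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

def mean(values):
    return sum(values) / len(values)

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def residuals(ys, xs):
    """What is left of ys after removing a least-squares line on xs."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]

# Made-up daily data: temperature drives BOTH variables,
# and neither variable has any effect on the other.
temps = [random.uniform(10, 35) for _ in range(365)]
ice_cream_sales = [20 * t + random.gauss(0, 60) for t in temps]
drowning_index = [0.2 * t + random.gauss(0, 1) for t in temps]

raw_r = pearson_r(ice_cream_sales, drowning_index)
adjusted_r = pearson_r(residuals(ice_cream_sales, temps),
                       residuals(drowning_index, temps))

print(f"Correlation ignoring temperature:       {raw_r:.2f}")
print(f"Correlation after removing temperature: {adjusted_r:.2f}")
```

The raw correlation is strongly positive even though the data were built with no causal link, and it shrinks toward zero once the temperature-driven part of each variable is removed.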
Principles of Experimental Design
Key Elements
Replication: Applying each treatment to more than one individual, with samples large enough that the effect of the treatment can be distinguished from chance variation.
Blinding: Keeping subjects (and sometimes experimenters) unaware of which treatment is being administered to prevent bias. Double-blind means both subjects and experimenters are unaware.
Randomization: Assigning subjects to groups through a random process, so that the groups are comparable and selection bias is reduced.
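Random assignment is straightforward to carry out in software. The sketch below is one possible approach, not a prescribed procedure: it shuffles a list of hypothetical subject IDs and splits it in half. The function name `randomly_assign` and the IDs are illustrative assumptions.

```python
import random

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into two equal-sized groups."""
    rng = random.Random(seed)      # seed is optional; fix it to reproduce a run
    shuffled = list(subjects)      # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical subject IDs S01..S20.
subjects = [f"S{i:02d}" for i in range(1, 21)]
treatment_group, placebo_group = randomly_assign(subjects, seed=42)

print("Treatment:", sorted(treatment_group))
print("Placebo:  ", sorted(placebo_group))
```

In a blinded study, the resulting group lists would be kept by someone who does not interact with the subjects or evaluate their outcomes.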
Sampling Methods
Simple Random Sample
Every possible sample of the same size has an equal chance of being selected.
Often considered the gold standard for sampling, but can be difficult to implement in practice.
Other Sampling Methods
Systematic Sampling: Select a starting point and then every kth member (e.g., every 50th student).
Convenience Sampling: Use data that are easy to obtain, but this method is prone to bias.
Stratified Sampling: Divide the population into subgroups (strata) whose members share a characteristic, then draw a random sample from each stratum.
Cluster Sampling: Divide the population into clusters, randomly select some clusters, and include all members from those clusters.
Comparison: Stratified vs. Cluster Sampling
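Stratified Sampling: Ensures representation from each subgroup, because some members are sampled from every stratum.
Cluster Sampling: Often more practical for large or widely spread populations; only some clusters are selected, but every member of a selected cluster is included.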
Sampling Example
Suppose you want to sample six students from your statistics class:
Simple random sample: Randomly select any six students.
Systematic sample: Pick a random starting point on the class list, then select every kth student (e.g., every 5th student in a class of 30).
Stratified sample: Divide by gender, then randomly select from each group.
Cluster sample: Divide class into groups, randomly select a group, and sample all its members.
Convenience sample: Choose the six students who are easiest to reach.
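The five approaches in this example can be sketched in code. The roster below is hypothetical: 30 students, a generic `stratum` label standing in for whatever grouping variable is used, and a `table` field acting as the cluster. All field names and sizes are assumptions made for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical class of 30 students: two strata ("A"/"B") and
# five tables (clusters) of six students each.
roster = [
    {"id": i, "stratum": "A" if i % 2 == 0 else "B", "table": i // 6}
    for i in range(30)
]
n = 6  # desired sample size

# Simple random sample: every set of six students is equally likely.
simple = rng.sample(roster, n)

# Systematic sample: random start, then every kth student on the list.
k = len(roster) // n
start = rng.randrange(k)
systematic = roster[start::k]

# Stratified sample: a random sample of three from each stratum.
stratified = []
for label in ("A", "B"):
    members = [s for s in roster if s["stratum"] == label]
    stratified.extend(rng.sample(members, n // 2))

# Cluster sample: pick one table at random and take everyone at it.
chosen_table = rng.choice(sorted({s["table"] for s in roster}))
cluster = [s for s in roster if s["table"] == chosen_table]

# Convenience sample: the first six on the list (easy, but prone to bias).
convenience = roster[:n]

for name, sample in [("simple", simple), ("systematic", systematic),
                     ("stratified", stratified), ("cluster", cluster),
                     ("convenience", convenience)]:
    print(f"{name:12s}", sorted(s["id"] for s in sample))
```

Note the contrast the code makes visible: the stratified sample takes a few students from every stratum, while the cluster sample takes every student from one cluster.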
Types of Observational Studies
Definitions
Cross-sectional study: Data are collected at one point in time.
Retrospective (case-control) study: Data are collected from a past time period by going back through records, interviews, and similar sources.
Prospective (longitudinal or cohort) study: Data are collected going forward in time from groups that share common factors (cohorts).
Examples of Observational Studies
Nurses' Health Study: Ongoing prospective study of registered nurses.
Heart Health Study: Retrospective study comparing subjects with and without heart disease.
Marijuana Study: Cross-sectional survey of marijuana use in states with legalized use.
Framingham Heart Study: Ongoing prospective study focused on heart disease.
Confounding in Experiments
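Confounding: Occurs when the effect of one variable cannot be separated from the effect of another, making it difficult to determine causality. For example, if one group received a treatment and was also younger than the comparison group, the effects of treatment and age could not be told apart.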
Types of Experimental Designs
| Design Type | Description | Example |
|---|---|---|
| Completely Randomized | Subjects are randomly assigned to treatment groups. | Randomly assign students to receive a new teaching method or the standard method. |
| Randomized Block | Subjects are divided into blocks (groups) based on a characteristic, then randomly assigned treatments within each block. | Block by gender, then randomly assign treatments within each gender group. |
| Matched Pairs | Subjects are paired so that the two members of each pair are similar (or are the same individual measured twice), and the two members receive different treatments. | Measure blood pressure in the same individuals before and after a treatment. |
| Rigorously Controlled | Subjects are carefully assigned to groups so that the groups are similar in the characteristics that matter for the experiment. | Assign subjects so that age, gender, and other factors are balanced between groups. |
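To illustrate how a randomized block design differs from a completely randomized one, the sketch below shuffles subjects within each block and alternates treatments, so every block ends up balanced. The function name `randomized_block`, the block labels, and the treatment names are hypothetical.

```python
import random

def randomized_block(subjects, treatments, seed=None):
    """Randomly assign treatments separately within each block.

    `subjects` is a list of (subject_id, block_label) pairs.
    Returns a dict mapping subject_id to a treatment.
    """
    rng = random.Random(seed)

    # Group subject IDs by their block label.
    blocks = {}
    for subject_id, block in subjects:
        blocks.setdefault(block, []).append(subject_id)

    # Shuffle within each block, then deal treatments out in turn.
    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)
        for i, subject_id in enumerate(members):
            assignment[subject_id] = treatments[i % len(treatments)]
    return assignment

# Hypothetical subjects in two blocks of four.
subjects = [(f"S{i}", "block 1" if i <= 4 else "block 2") for i in range(1, 9)]
assignment = randomized_block(subjects, ["new method", "standard"], seed=3)

for subject_id, block in subjects:
    print(subject_id, block, "->", assignment[subject_id])
```

A completely randomized design would shuffle all eight subjects together, which could by chance put most of one block into the same treatment group; blocking rules that out.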
Sampling Errors and Bias
Definitions
Sampling error (random sampling error): The difference between a sample result and the true population result due to chance fluctuations.
Nonsampling error: Errors from human mistakes, such as data entry errors, biased questions, or inappropriate statistical methods.
Nonrandom sampling error: Errors from using a nonrandom sampling method, such as convenience sampling or voluntary response samples.
Key Formulas
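Sample Mean: $\bar{x} = \frac{\sum x}{n}$, where $\sum x$ is the sum of the sample values and $n$ is the sample size.
Sample Proportion: $\hat{p} = \frac{x}{n}$, where $x$ is the number of sample members with the characteristic of interest.
Sampling Error (for proportions): $\hat{p} - p$, where $p$ is the true population proportion.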
Additional info: The above formulas are fundamental for analyzing sample data and estimating population parameters.
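As a worked illustration, the sketch below assumes the usual definitions: the sample mean is the sum of the values divided by the sample size, the sample proportion is the count with the characteristic divided by the sample size, and the sampling error of a proportion is the sample proportion minus the true population proportion. The population proportion of 0.30 and the sample size of 100 are made-up values.

```python
import random

def sample_mean(values):
    """Sum of the sample values divided by the sample size."""
    return sum(values) / len(values)

def sample_proportion(successes, n):
    """Number of sample members with the characteristic, divided by n."""
    return successes / n

# Sample mean of a tiny data set: (2 + 4 + 6 + 8) / 4
print("Sample mean:", sample_mean([2, 4, 6, 8]))

# Simulate one random sample from a population whose true proportion
# is known (a hypothetical value, chosen only for illustration).
rng = random.Random(7)
p = 0.30   # assumed true population proportion
n = 100    # sample size
sample = [rng.random() < p for _ in range(n)]

x = sum(sample)                    # members with the characteristic
p_hat = sample_proportion(x, n)
sampling_error = p_hat - p         # chance difference between sample and truth

print(f"Sample proportion: {p_hat:.2f}")
print(f"Sampling error:    {sampling_error:+.2f}")
```

Changing the seed gives a different sample and a different sampling error, which is exactly the chance fluctuation the definition describes. In practice the true proportion is unknown, so the sampling error can only be estimated.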