Introduction to Statistics: Data Collection and Experimental Design

Study Guide - Smart Notes

Section 1.3: Data Collection and Experimental Design

Objectives of Section 1.3

This section introduces the foundations of designing a statistical study: how observational studies differ from experiments, how data are collected, and how samples are chosen. Mastery of these topics is essential for conducting valid and reliable statistical research.

Designing a statistical study and distinguishing between observational studies and experiments
Data collection using surveys and simulations
Designing experiments with proper controls
Sampling methods: random, simple random, stratified, cluster, systematic, and identifying biased samples

Designing a Statistical Study

Effective statistical studies follow a structured process to ensure valid results and minimize errors.

Identify the variable(s) of interest and the population to be studied.
Develop a detailed plan for data collection, ensuring the sample is representative of the population.
Collect the data using appropriate methods.
Describe the data using descriptive statistics techniques.
Interpret the data and make decisions about the population using inferential statistics.
Identify any possible errors that may affect the study's validity.

Types of Data Collection

Observational Study

In an observational study, researchers observe and measure characteristics of interest without influencing the subjects.

Definition: A researcher observes and measures characteristics of interest of part of a population.
Example: Measuring the amount of time people spend on activities such as paid work, childcare, and socializing. (Source: U.S. Bureau of Labor Statistics)

Experiment

Experiments involve applying a treatment to part of a population and observing the effects.

Treatment group: Receives the treatment.
Control group: Does not receive the treatment; may receive a placebo.
Experimental units: Subjects in both groups.
Placebo: A harmless, fake treatment used to mimic the real treatment.
Example: Overweight subjects given sucralose vs. water; researchers measured glycemic and insulin responses. (Source: Diabetes Care)

Simulation

Simulations use mathematical or physical models to reproduce real-world conditions, often with computers.

Useful for studying situations that are impractical or dangerous to replicate in reality.
Can save time and money.
Example: Automobile manufacturers use crash simulations with dummies.
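
As a small illustration of a computer simulation, the sketch below estimates the probability that two fair dice sum to 7 by generating many random rolls instead of rolling physical dice. The dice scenario, the function name estimate_sum_of_seven, and the seed value are assumptions chosen for illustration; they do not come from the notes.

```python
import random

def estimate_sum_of_seven(trials, seed=0):
    """Estimate P(two fair dice sum to 7) by simulating `trials` rolls."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    hits = 0
    for _ in range(trials):
        roll = rng.randint(1, 6) + rng.randint(1, 6)  # one roll of two dice
        if roll == 7:
            hits += 1
    return hits / trials

estimate = estimate_sum_of_seven(100_000)
print(f"Simulated estimate: {estimate:.4f} (exact value is 1/6, about 0.1667)")
```

With more trials the estimate tends to settle closer to the exact value, which is why simulations can stand in for experiments that would be slow or costly to repeat.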

Survey

Surveys investigate characteristics of a population by asking questions.

Commonly conducted via interview, Internet, phone, or mail.
Question wording is crucial to avoid bias.
Example: Surveying physicians about career choice motivations.

Experimental Design

Well-designed experiments require control, randomization, and replication to ensure validity.

Control: Managing variables to isolate the effect of the treatment.
Randomization: Randomly assigning subjects to treatment groups.
Replication: Repeating the experiment under the same or similar conditions.

Confounding Variables

Confounding occurs when the effects of multiple factors on a variable cannot be distinguished.

Example: A coffee shop remodels while a nearby mall opens; increased business cannot be attributed to one factor.

Placebo Effect and Blinding

Placebo effect: Subjects respond favorably to a placebo.
Blinding: Subjects do not know if they are receiving treatment or placebo.
Double-blind: Neither subjects nor experimenters know who receives treatment or placebo.

Randomization Techniques

Completely randomized design: Subjects assigned to groups by random selection.
Randomized block design: Subjects divided into blocks by characteristics, then randomly assigned within blocks.
Matched-pairs design: Subjects paired by similarity; within each pair, one subject is randomly assigned one treatment and the other subject receives a different treatment.
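
The three designs above can be sketched in code roughly as follows. This is a minimal illustration, not a prescribed procedure: the subject list, the age-group blocking characteristic, the pairing rule, and all function names are hypothetical.

```python
import random

def completely_randomized(subjects, seed=0):
    """Shuffle all subjects, then split them into two groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

def randomized_block(subjects, block_of, seed=0):
    """Group subjects into blocks, then randomize within each block."""
    rng = random.Random(seed)
    blocks = {}
    for s in subjects:
        blocks.setdefault(block_of(s), []).append(s)
    groups = {"treatment": [], "control": []}
    for members in blocks.values():
        rng.shuffle(members)
        half = len(members) // 2
        groups["treatment"] += members[:half]
        groups["control"] += members[half:]
    return groups

def matched_pairs(pairs, seed=0):
    """For each pair, randomly choose which member gets the treatment."""
    rng = random.Random(seed)
    groups = {"treatment": [], "control": []}
    for a, b in pairs:
        if rng.random() < 0.5:
            a, b = b, a
        groups["treatment"].append(a)
        groups["control"].append(b)
    return groups

# Hypothetical subjects: (id, age group)
subjects = [(i, "under 40" if i % 2 == 0 else "40 and over") for i in range(1, 13)]
# Hypothetical pairing: partners share the same age group
pairs = [(subjects[i], subjects[i + 2]) for i in (0, 1, 4, 5, 8, 9)]

crd = completely_randomized(subjects)
rbd = randomized_block(subjects, block_of=lambda s: s[1])
mp = matched_pairs(pairs)
print(crd["treatment"])
print(rbd["treatment"])
print(mp["treatment"])
```

In the block design, each age group contributes equally to both groups, so age cannot be confused with the treatment effect.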

Sample Size and Replication

Sample size: The number of subjects in the study; larger samples generally give more reliable results.
Replication: Repeating the experiment under the same or similar conditions to confirm that the results hold.

Sampling Techniques

Sampling methods are used to select a subset of a population for study.

Census: Measures the entire population.
Sampling: Measures part of the population.
Sampling error: Difference between sample results and population results.
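
To make sampling error concrete, the sketch below builds a made-up population, draws a sample, and compares the sample mean with the population mean. The commute-time data, the sample size of 50, and the seed are assumptions for illustration only.

```python
import random
import statistics

rng = random.Random(42)  # seeded so the run is reproducible

# Hypothetical population: 1,000 commute times in minutes
population = [rng.randint(5, 90) for _ in range(1000)]
population_mean = statistics.mean(population)  # known only if a census is taken

# Measure only part of the population
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

# Sampling error: sample result minus population result
sampling_error = sample_mean - population_mean
print(f"Population mean: {population_mean:.2f}")
print(f"Sample mean:     {sample_mean:.2f}")
print(f"Sampling error:  {sampling_error:.2f}")
```

In practice the population mean is usually unknown, so the sampling error cannot be computed directly; the example only shows what the term refers to.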

Types of Sampling

Random Sample: Every member has an equal chance of selection.
Simple Random Sample: Every possible sample of the same size has an equal chance of selection.
Stratified Sample: Population divided into strata; random sample taken from each stratum.
Cluster Sample: Population divided into clusters; all members of selected clusters are included.
Systematic Sample: Select every kth member after a random start.
Convenience Sample: Select members who are easiest to reach; often leads to bias.
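
A simple random sample and a systematic sample can be drawn with Python's standard random module, as in the sketch below. The student list, the sample size of 10, the interval k = 10, and the function names are hypothetical.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Draw n members so that every sample of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)  # sampling without replacement

def systematic_sample(population, k, seed=0):
    """Pick a random start among the first k members, then every kth member."""
    rng = random.Random(seed)
    start = rng.randrange(k)
    return population[start::k]

# Hypothetical list of 100 students
students = [f"student_{i:03d}" for i in range(1, 101)]

srs = simple_random_sample(students, 10)
sys_sample = systematic_sample(students, 10)
print(srs)
print(sys_sample)
```

The systematic sample is easy to carry out from an ordered list, but it can be biased if the list has a repeating pattern that lines up with k.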

Example Table: Sampling Techniques Comparison

Sampling Technique	Description	Example
Simple Random	Every possible sample of the same size is equally likely	Randomly select students from a list
Stratified	Divide into strata, sample from each	Sample students by major
Cluster	Divide into clusters, select all in some clusters	Sample all students in selected zip codes
Systematic	Select every kth member	Every 10th student on a list
Convenience	Easy-to-reach members	Sample students in your class
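
The stratified and cluster rows of the table can be sketched as follows, using majors as strata and zip codes as clusters. The student records, the three majors, the four zip codes, and the function names are made up for illustration.

```python
import random
from collections import defaultdict

def group_by(records, key):
    """Collect records into lists keyed by the value of records[key]."""
    groups = defaultdict(list)
    for r in records:
        groups[r[key]].append(r)
    return groups

def stratified_sample(records, key, n_per_stratum, seed=0):
    """Take a random sample of n_per_stratum members from every stratum."""
    rng = random.Random(seed)
    sample = []
    for members in group_by(records, key).values():
        sample += rng.sample(members, n_per_stratum)
    return sample

def cluster_sample(records, key, n_clusters, seed=0):
    """Randomly choose whole clusters and keep every member of each one."""
    rng = random.Random(seed)
    clusters = group_by(records, key)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [r for c in chosen for r in clusters[c]]

# Hypothetical student records
majors = ["biology", "history", "math"]
zips = ["10001", "10002", "10003", "10004"]
students = [{"id": i, "major": majors[i % 3], "zip": zips[i % 4]} for i in range(120)]

by_major = stratified_sample(students, "major", n_per_stratum=5)
by_zip = cluster_sample(students, "zip", n_clusters=2)
print(len(by_major), len(by_zip))
```

The key difference shows up in the results: the stratified sample contains some members of every major, while the cluster sample contains all members of only the selected zip codes.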

Key Formulas and Concepts

Sampling Error: sample result - population result (for a mean, the sample mean minus the population mean).
Random Selection: Use random number tables or generators to ensure unbiased selection.

Examples and Applications

Observational Study: Surveying economic confidence without influencing responses.
Experiment: Testing effects of vitamin D3 supplementation with a control group.
Simulation: Crash tests using dummies to model real accidents.
Survey: Questioning physicians about career motivations.
