Designing Observational Studies and Experiments: Simple Random Sampling and Bias
Designing Observational Studies and Experiments (2.1)
Introduction
This chapter introduces foundational concepts in statistics related to designing studies, collecting data, and understanding the role of randomness and bias in sampling. These principles are essential for conducting valid statistical investigations and making reliable inferences about populations.
Variables, Individuals, and Observations
Definitions
Individual: A person, animal, or object about which data are collected in a study.
Variable: A characteristic of individuals to be measured or observed in a study.
Observation: The data value recorded for a variable from an individual.
Example: Identifying Individuals, Variables, and Observations
Consider a table of the five movies with the largest worldwide gross receipts:
| Movie | Studio | Worldwide Gross Receipts ($ millions) | U.S. Gross Receipts ($ millions) |
|---|---|---|---|
| Avatar | Fox | 2788 | 761 |
| Titanic | Paramount Pictures | 2188 | 659 |
| Star Wars: The Force Awakens | Buena Vista | 2068 | 937 |
| Avengers: Infinity War | Buena Vista | 2049 | 679 |
| Jurassic World | Universal | 1672 | 652 |
Individuals: The movies listed (e.g., Avatar, Titanic, etc.).
Variables: Studio, worldwide gross receipts, U.S. gross receipts.
Observations: For each variable, the corresponding data values (e.g., studios: Fox, Paramount Pictures, etc.; worldwide gross receipts: 2788, 2188, etc.).
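As a minimal sketch, the movie table can be stored so that each record is one individual and each field is one variable. The field names (`movie`, `studio`, `ww_gross`, `us_gross`) are illustrative choices, not part of the original notes.

```python
# Each dictionary is one individual (a movie); each key other than
# "movie" is a variable; each stored value is an observation.
movies = [
    {"movie": "Avatar", "studio": "Fox", "ww_gross": 2788, "us_gross": 761},
    {"movie": "Titanic", "studio": "Paramount Pictures", "ww_gross": 2188, "us_gross": 659},
    {"movie": "Star Wars: The Force Awakens", "studio": "Buena Vista", "ww_gross": 2068, "us_gross": 937},
    {"movie": "Avengers: Infinity War", "studio": "Buena Vista", "ww_gross": 2049, "us_gross": 679},
    {"movie": "Jurassic World", "studio": "Universal", "ww_gross": 1672, "us_gross": 652},
]

# Individuals: the movies themselves.
individuals = [row["movie"] for row in movies]

# Variables: every field except the label that identifies the individual.
variables = [key for key in movies[0] if key != "movie"]

# Observations: for each variable, the values recorded across individuals.
observations = {var: [row[var] for row in movies] for var in variables}

print(individuals)
print(variables)
print(observations["studio"])
```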
The Statistical Process
Five Steps in Statistics
Raise a precise question about one or more variables.
Create a plan to answer the question, ensuring meaningful results.
Collect the data through observation, measurement, or surveys.
Analyze the data using tables, graphs, and calculations to identify patterns.
Draw a conclusion about the question, often leading to further research.
Populations, Samples, and Sampling
Key Definitions
Population: The entire group of individuals about which we want to learn.
Sample: The subset of the population from which data are collected.
Sampling: The process of selecting a sample from the population.
Example: Identifying Variable, Sample, and Population
Variable: Opinion on whether same-sex marriage should be legal (each person's response to the legality question).
Sample: 1024 surveyed American adults.
Population: All American adults.
Statistics, Parameters, and Types of Statistical Practice
Definitions
Statistic: A numerical summary of a sample.
Parameter: A numerical summary of a population.
Descriptive Statistics: Using tables, graphs, and calculations to describe a sample.
Inferential Statistics: Using sample information to draw conclusions about a population (inferences).
Example: Labrador Retriever Diet Study
Research Question: How does a restricted diet affect a Labrador Retriever's life-span?
Population: All Labrador Retrievers.
Sample: 48 Labrador Retrievers (24 on normal diet, 24 on restricted diet).
Conclusion: Restricted diet tends to increase life-span (an inference about the population based on the sample).
Simple Random Sampling
Definition and Methods
Simple Random Sampling: Every sample of size $n$ has the same chance of being chosen.
Sampling with Replacement: Individuals can be selected more than once.
Sampling without Replacement: Individuals cannot be selected more than once.
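The two sampling methods can be sketched with Python's standard `random` module. The population of 20 numbered individuals, the sample size of 5, and the seed are assumptions made only for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: individuals labeled 1 through 20.
population = list(range(1, 21))
n = 5  # sample size

# Without replacement: each individual can appear at most once, and
# every possible set of n individuals is equally likely (a simple random sample).
without_replacement = rng.sample(population, n)

# With replacement: each draw is made from the full population,
# so the same individual may be selected more than once.
with_replacement = rng.choices(population, k=n)

print(without_replacement)
print(with_replacement)
```

Repeats are possible only in the second list; with a small population they show up often if the code is rerun with different seeds.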
Example: Estimating a Population Proportion
Population Proportion: $p$ (parameter).
Sample Proportion: $\hat{p}$ (statistic).
Inferential Statistics: Using $\hat{p}$ to estimate $p$.
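A small simulation can make the distinction between the parameter and the statistic concrete. The population below (10,000 people, 60% answering "yes") is entirely hypothetical; in a real study the population proportion is unknown, and only the sample proportion can be computed.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1 means "yes", 0 means "no".
population = [1] * 6000 + [0] * 4000

# Parameter: the proportion of "yes" in the whole population.
p = sum(population) / len(population)

# Draw a simple random sample of 1,000 individuals without replacement.
sample = rng.sample(population, 1000)

# Statistic: the proportion of "yes" in the sample, used to estimate p.
p_hat = sum(sample) / len(sample)

print("population proportion p:", p)
print("sample proportion p_hat:", p_hat)
```

The two values are usually close but rarely identical; the gap between them is the sampling error.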
Sampling Error and Bias
Definitions
Sampling Error: The error from using a sample to estimate a population parameter due to random variation.
Bias: Systematic error from a sampling method that consistently under- or overemphasizes certain characteristics.
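The difference between sampling error and bias can be illustrated by repeating a survey many times. This sketch assumes a made-up campus of 10,000 students in which 40% study a lot overall, but 75% of the 2,000 students found in the library do; all of these numbers are hypothetical.

```python
import random

rng = random.Random(2)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1 = studies a lot, 0 = does not.
library = [1] * 1500 + [0] * 500    # students found in the library
others = [1] * 2500 + [0] * 5500    # everyone else
population = library + others
p = sum(population) / len(population)  # true proportion, 0.4

def sample_proportion(group, n=100):
    """Proportion of 1s in a random sample of size n drawn from group."""
    sample = rng.sample(group, n)
    return sum(sample) / n

# Simple random samples from the whole population: estimates vary
# around p by chance alone (sampling error).
srs_estimates = [sample_proportion(population) for _ in range(500)]

# Samples drawn only from the library: estimates are pushed in the
# same direction every time (bias).
biased_estimates = [sample_proportion(library) for _ in range(500)]

srs_average = sum(srs_estimates) / len(srs_estimates)
biased_average = sum(biased_estimates) / len(biased_estimates)

print("true p:", p)
print("average estimate, simple random samples:", round(srs_average, 3))
print("average estimate, library-only samples:", round(biased_average, 3))
```

Individual simple random samples miss in both directions and average out near the true value, while the library-only samples overshoot consistently no matter how many are taken.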
Types of Bias
Sampling Bias: Some members of the population are more likely to be included than others.
Nonresponse Bias: Individuals selected for the sample do not respond, and those who do respond may differ from those who do not.
Response Bias: Survey responses are inaccurate due to question wording or respondent behavior.
Guidelines for Constructing Survey Questions
Avoid judgmental words.
Avoid yes/no questions.
Switch the order of choices for different respondents.
Address only one issue per question.
Examples of Bias
Sampling Bias: Surveying only students in the library favors those who study more.
Nonresponse Bias: Students busy studying may refuse to participate.
Response Bias: Students may exaggerate their study habits.
Compound Bias: A conservative news station's call-in survey on SNAP funding is biased by audience and question wording.
Nonsampling Error
Definition
Nonsampling Error: Errors from biased sampling, incorrect data recording, or incorrect data analysis.
Examples of Sampling and Nonsampling Errors
Sampling Error Only: Random sample, accurate data, correct analysis; difference between sample and population is due to chance.
Nonsampling Error: Response bias (e.g., employees not admitting to calling in sick when healthy) leads to inaccurate results.
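The response-bias example can be sketched as a simulation showing that a nonsampling error does not shrink as the sample grows. The workforce size, the 30% true rate, and the 50% chance of admitting the behavior are all assumed values chosen for illustration.

```python
import random

rng = random.Random(3)  # fixed seed so the sketch is repeatable

# Hypothetical workforce: 1 = has called in sick while healthy, 0 = has not.
truth = [1] * 3000 + [0] * 7000   # true proportion is 0.30
ADMIT_PROB = 0.5                  # assumed chance of admitting the behavior

def reported(answer):
    """Answer given on the survey: some employees deny what they did."""
    if answer == 1 and rng.random() > ADMIT_PROB:
        return 0
    return answer

# Estimate the proportion from random samples of increasing size.
estimates = {}
for n in (100, 1000, 10000):
    sample = rng.sample(truth, n)
    estimates[n] = sum(reported(a) for a in sample) / n
    print(n, round(estimates[n], 3))
```

Even when every employee is surveyed, the estimate stays near half the true rate, because the error comes from the answers rather than from who was selected.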
Summary Table: Key Terms and Examples
| Term | Definition | Example |
|---|---|---|
| Population | Entire group of interest | All American adults |
| Sample | Subset of the population | 1024 surveyed adults |
| Statistic | Numerical summary of a sample | Sample proportion $\hat{p}$ |
| Parameter | Numerical summary of a population | Population proportion $p$ |
| Sampling Error | Random error from sampling | Sample proportion differs from population proportion |
| Bias | Systematic error in sampling | Surveying only library students |
| Nonsampling Error | Error from data collection/analysis | Incorrectly recorded responses |
Additional info: This summary covers the essential concepts of designing studies, sampling, and recognizing errors and bias, as presented in the provided slides. These are foundational for understanding how to collect and interpret data in statistics.