Sample Surveys and Bias in Statistics: Study Guide
Sample Surveys
Introduction to Sample Surveys
Sample surveys are a fundamental method in statistics for gathering information about a population by examining a subset, or sample, of individuals. This approach is essential when it is impractical or impossible to collect data from every member of the population.
Population: The entire group of individuals of interest.
Sample: A smaller group selected from the population for analysis.
Sample Survey: A study that collects data from a sample to infer information about the population.
Example: Opinion polls, health surveys, and environmental studies.
The Three Big Ideas of Sampling
Effective sampling relies on three core principles to ensure representativeness and minimize bias.
Examine a Part of the Whole: Sampling allows us to make inferences about the population without studying every individual.
Randomize: Random selection protects against both known and unknown sources of bias, so that on average the sample resembles the population.
Sample Size: The precision of statistical estimates depends on the number of individuals sampled, not on the fraction of the population sampled (provided the population is much larger than the sample).
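The sample-size idea can be checked with a short calculation. The sketch below uses the standard error of a sample proportion with the finite population correction; the proportion 0.5 and the population sizes are illustrative assumptions, not values from these notes.

```python
import math

def se_proportion(p, n, N):
    """Standard error of a sample proportion for an SRS drawn without
    replacement, including the finite population correction."""
    fpc = math.sqrt((N - n) / (N - 1))
    return math.sqrt(p * (1 - p) / n) * fpc

p = 0.5  # assumed population proportion (illustrative only)

# Same sample size, very different population sizes
small_pop = se_proportion(p, n=1000, N=100_000)       # roughly 0.0157
large_pop = se_proportion(p, n=1000, N=100_000_000)   # roughly 0.0158

# Same large population, four times the sample size
bigger_sample = se_proportion(p, n=4000, N=100_000_000)  # roughly 0.0079

print(round(small_pop, 4), round(large_pop, 4), round(bigger_sample, 4))
```

A sample of 1000 is almost equally precise for a population of one hundred thousand and one of one hundred million, while quadrupling the sample size halves the standard error.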

Bias in Sampling
Understanding Bias
Bias occurs when a sampling method systematically over- or under-represents certain characteristics of the population. Avoiding bias is crucial, as biased samples cannot yield valid conclusions.
Types of Bias: Selection bias, measurement bias, response bias, nonresponse bias, voluntary response bias.
Prevention: Random selection is the best defense against selection bias; response bias must instead be reduced through careful survey design.
Sampling Strategies
Simple Random Sampling (SRS)
Simple random sampling ensures every possible sample of the desired size has an equal chance of being selected. It is the gold standard for representativeness.
Sampling Frame: The list of individuals from which the sample is drawn.
Procedure: Assign numbers to individuals and use random numbers to select the sample.
Sampling Variability: Differences between samples due to random selection.
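The SRS procedure can be sketched in a few lines of Python. This is a minimal illustration, assuming a hypothetical sampling frame of 200 student IDs; `simple_random_sample` is a helper name invented for this example.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n individuals from the sampling frame without replacement,
    so that every possible sample of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: a numbered list of 200 students
frame = [f"student_{i:03d}" for i in range(1, 201)]

sample = simple_random_sample(frame, 10, seed=42)
print(sample)
```

Running the function again with a different seed gives a different sample, which is the sampling variability described above.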
Stratified Sampling
Stratified sampling divides the population into homogeneous groups (strata) and selects a random sample from each stratum. This method increases precision and allows for subgroup analysis.
Benefits: Reduced sampling variability, more accurate estimates, and the option to use a different sampling method within each stratum.
Example: National surveys stratified by province.
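A stratified sample can be sketched as an SRS taken separately within each stratum. The example below is an illustration under assumed data: the province codes, stratum sizes, and the 10% sampling fraction are hypothetical, and `stratified_sample` is a helper written for this sketch.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Group units into strata, then take an SRS of the same
    fraction (at least one unit) from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for name in sorted(strata):  # fixed order keeps results reproducible
        members = strata[name]
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: (province, person id) pairs
population = (
    [("ON", i) for i in range(600)]
    + [("QC", i) for i in range(300)]
    + [("PE", i) for i in range(100)]
)

sample = stratified_sample(population, lambda unit: unit[0], 0.10, seed=1)
print(len(sample))
```

Because every stratum contributes in proportion to its size, a small province cannot be missed by chance, as it could be in a simple random sample.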
Cluster and Multistage Sampling
Cluster sampling splits the population into clusters, selects clusters at random, and samples all or some individuals within selected clusters. Multistage sampling combines several methods, often used in large-scale surveys.
Cluster Sampling: Useful when stratification or simple random sampling is impractical; each cluster should resemble the population as a whole.
Multistage Sampling: Involves multiple stages of random selection, increasing efficiency.
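The difference between sampling everyone in the selected clusters and sampling again within them can be sketched as follows. The schools and students here are hypothetical, and `cluster_sample` is a helper invented for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """Select clusters at random. If per_cluster is None, keep every
    member of each chosen cluster (one-stage cluster sample); otherwise
    take an SRS of per_cluster members within each chosen cluster
    (a simple two-stage, i.e. multistage, sample)."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        members = clusters[name]
        if per_cluster is None:
            sample.extend(members)
        else:
            sample.extend(rng.sample(members, min(per_cluster, len(members))))
    return chosen, sample

# Hypothetical population: 20 schools (clusters) of 30 students each
clusters = {
    f"school_{s}": [f"school_{s}_student_{i}" for i in range(30)]
    for s in range(20)
}

chosen, one_stage = cluster_sample(clusters, 4, seed=7)
_, two_stage = cluster_sample(clusters, 4, per_cluster=5, seed=7)
print(chosen, len(one_stage), len(two_stage))
```

Only the chosen schools need to be visited, which is where the efficiency of cluster and multistage designs comes from.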

Systematic Sampling
Systematic sampling selects individuals at regular intervals from a list, starting from a randomly chosen point. It is efficient but requires assurance that the list order does not introduce bias.
Example: Surveying every 10th person on a list.
Justification: You must be able to argue that the order of the list is not associated with the variable of interest.
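Systematic sampling is simple to express in code. The sketch below assumes a hypothetical list of 100 numbered individuals and an interval of 10; `systematic_sample` is a name chosen for this example.

```python
import random

def systematic_sample(frame, k, seed=None):
    """Pick a random starting position among the first k entries,
    then take every k-th individual from the list."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start in 0..k-1
    return frame[start::k]

# Hypothetical list of 100 individuals, numbered 1 to 100
frame = list(range(1, 101))

sample = systematic_sample(frame, 10, seed=3)
print(sample)
```

Only the starting point is random, so if the list has a pattern that repeats every 10 entries, every selected individual would share that pattern.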
Populations, Parameters, and Statistics
Definitions and Notation
Statistical models use parameters to represent population characteristics. Sample statistics estimate these parameters.
Population Parameter: Key number describing the population (e.g., mean, proportion).
Sample Statistic: Summary measure from the sample used to estimate the parameter.
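Notation: Parameters are usually written with Greek letters and statistics with Latin letters; the proportion is an exception (p for the parameter, p̂ for the statistic).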
| Name | Statistic | Parameter |
|---|---|---|
| Mean | ȳ | μ |
| Standard deviation | s | σ |
| Correlation | r | ρ |
| Regression coefficient | b | β |
| Proportion | p̂ | p |
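To make the distinction between parameters and statistics concrete, the sketch below computes both for a simulated population. The population (10,000 heights with mean 170 and standard deviation 10) is invented for illustration; in a real survey the parameters are unknown.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population of 10,000 heights in cm (simulated)
population = [rng.gauss(170, 10) for _ in range(10_000)]

# Parameters: computed from the whole population
mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population standard deviation

# Statistics: computed from an SRS of 100 individuals
sample = rng.sample(population, 100)
y_bar = statistics.fmean(sample)       # sample mean, estimates mu
s = statistics.stdev(sample)           # sample standard deviation, estimates sigma

print(f"mu = {mu:.2f}, y_bar = {y_bar:.2f}")
print(f"sigma = {sigma:.2f}, s = {s:.2f}")
```

Note that `statistics.pstdev` divides by the population size while `statistics.stdev` divides by n - 1, matching the parameter and statistic versions of the standard deviation.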
Common Sampling Mistakes and Biases
Types of Sampling Mistakes
Several common errors can invalidate survey results by introducing bias.
Voluntary Response Sample: Individuals choose to participate, leading to voluntary response bias.
Convenience Sample: Sampling individuals who are easy to reach, often unrepresentative.
Bad Sampling Frame: Incomplete or inaccurate list of the population.
Undercoverage: Some population segments are not sampled or are underrepresented.
Nonresponse Bias: Selected individuals do not respond, and their characteristics differ from those of respondents.
Response Bias: Survey design or respondent behavior influences answers.
Comparison of Bias Types
The following table summarizes key differences between response bias, nonresponse bias, and voluntary response bias.
| Dimension | Response Bias | Nonresponse Bias | Voluntary Response Bias |
|---|---|---|---|
| Basic definition | Inaccurate or misleading answers | Selected individuals do not respond | Individuals choose whether to participate |
| Who is involved | People who respond | People who do not respond | Only people who choose to respond |
| Main cause | Survey design or respondent behavior | Missing responses | Self-selection |
| Sampling frame | Usually well-defined | Well-defined, but incomplete | Poorly defined or unclear |
| Key statistical problem | Responses do not reflect true values | Respondents differ from nonrespondents | Respondents differ from population |
| Representativeness | Sample may be representative, answers biased | Sample becomes unrepresentative | Sample is unrepresentative |
| Typical reasons | Social desirability, sensitive questions, leading wording | Refusal, inaccessibility, lack of interest | Strong opinions, high motivation, personal stake |
| Example | Underreporting illegal behavior | Busy students don’t respond | Online poll with strong opinions |
| Effect on results | Measurement is biased | Estimates are biased | Results invalid for generalization |
| Can occur with other biases? | Yes | Yes | Yes |
| How to reduce | Anonymous surveys, neutral wording | Follow-ups, incentives | Very difficult; requires random sampling |
| Type of bias | Measurement bias | Selection bias | Selection bias |
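A small simulation can show why self-selection distorts results. Everything in this sketch is an assumption made for illustration: 20% of a hypothetical population holds a strong opinion, and people with that opinion are taken to be six times as likely to answer an open online poll.

```python
import random

rng = random.Random(11)

# Hypothetical population: True means the person strongly opposes a policy
population = [rng.random() < 0.20 for _ in range(50_000)]
true_p = sum(population) / len(population)

# Simple random sample of 1000 individuals
srs = rng.sample(population, 1000)
srs_p = sum(srs) / len(srs)

# Voluntary response: assumed response rates of 30% for strong opponents
# and 5% for everyone else
volunteers = [
    opinion for opinion in population
    if rng.random() < (0.30 if opinion else 0.05)
]
vol_p = sum(volunteers) / len(volunteers)

print(f"true proportion:    {true_p:.3f}")
print(f"SRS estimate:       {srs_p:.3f}")
print(f"voluntary estimate: {vol_p:.3f}")
```

The voluntary sample is several times larger than the SRS, yet its estimate is far from the true proportion, because respondents differ systematically from the population.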

Best Practices for Valid Surveys
Designing a Valid Survey
To ensure survey results are valid and useful, follow these best practices:
Define Objectives: Clearly state what you want to know.
Use the Right Sampling Frame: Ensure the list covers the population of interest.
Tune Your Instrument: Ask specific, quantitative questions; avoid vague or leading wording.
Pilot Test: Test the survey with a small group to identify issues.
Report Methods: Always describe sampling methods in detail.
Summary of Key Concepts
Sampling allows inference about populations without studying every individual.
Randomization and sample size are critical for representativeness and precision.
Multiple sampling methods exist: SRS, stratified, cluster, systematic, multistage.
Biases can invalidate results; recognize and avoid voluntary response, convenience, bad frames, undercoverage, nonresponse, and response bias.
Use best practices in survey design for valid, reliable results.