Sample Surveys and Bias in Statistics: Study Guide

Sample Surveys

Introduction to Sample Surveys

Sample surveys are a fundamental method in statistics for gathering information about a population by examining a subset, or sample, of individuals. This approach is essential when it is impractical or impossible to collect data from every member of the population.

  • Population: The entire group of individuals of interest.

  • Sample: A smaller group selected from the population for analysis.

  • Sample Survey: A study that collects data from a sample to infer information about the population.

  • Examples: Opinion polls, health surveys, and environmental studies.

The Three Big Ideas of Sampling

Effective sampling relies on three core principles to ensure representativeness and minimize bias.

  • Examine a Part of the Whole: Sampling allows us to make inferences about the population without studying every individual.

  • Randomize: Random selection protects against both known and unknown sources of bias, so that on average the sample resembles the population. Any single random sample can still differ from the population by chance.

  • Sample Size: The precision of statistical estimates depends on the number of individuals sampled, not on the fraction of the population sampled, provided the population is much larger than the sample.
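The sample-size idea can be checked with a small simulation. The sketch below is illustrative only: the population sizes, the 40% "yes" share, and the function name are assumptions made for the example, not values from the course material. It draws repeated simple random samples and measures how much the sample proportion varies from sample to sample.

```python
import random
import statistics

def spread_of_sample_proportions(population_size, sample_size,
                                 true_share=0.4, trials=300, seed=1):
    """Standard deviation of the sample proportion across repeated random samples."""
    rng = random.Random(seed)
    n_yes = int(population_size * true_share)
    # 1 = "yes", 0 = "no"; the order does not matter because we sample at random
    population = [1] * n_yes + [0] * (population_size - n_yes)
    proportions = []
    for _ in range(trials):
        sample = rng.sample(population, sample_size)  # without replacement
        proportions.append(sum(sample) / sample_size)
    return statistics.stdev(proportions)

# Same sample size, very different population sizes (4% vs 0.2% of the population)
small_pop = spread_of_sample_proportions(population_size=10_000, sample_size=400)
large_pop = spread_of_sample_proportions(population_size=200_000, sample_size=400)
# Same large population, four times the sample size
bigger_n = spread_of_sample_proportions(population_size=200_000, sample_size=1_600)

print(f"n=400,  N=10,000:  spread = {small_pop:.4f}")
print(f"n=400,  N=200,000: spread = {large_pop:.4f}")
print(f"n=1600, N=200,000: spread = {bigger_n:.4f}")
```

Under these assumptions, the first two spreads should come out close to each other even though the sampled fraction differs by a factor of 20, while quadrupling the sample size should roughly halve the spread.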

[Figure: Soup tasting analogy for sampling]

Bias in Sampling

Understanding Bias

Bias occurs when a sampling method systematically over- or under-represents certain characteristics of the population. Because the error is systematic, taking a larger sample does not remove it, and conclusions drawn from a biased sample are not valid for the population.

  • Types of Bias: Selection bias, measurement bias, response bias, nonresponse bias, voluntary response bias.

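  • Prevention: Random selection is the best defense against selection bias. Response bias and nonresponse bias must be addressed separately, through careful question design and follow-up.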

Sampling Strategies

Simple Random Sampling (SRS)

Simple random sampling ensures every possible sample of the desired size has an equal chance of being selected. It is the gold standard for representativeness.

  • Sampling Frame: The list of individuals from which the sample is drawn.

  • Procedure: Assign numbers to individuals and use random numbers to select the sample.

  • Sampling Variability: Differences between samples due to random selection.
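The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the sampling frame of 200 numbered students and the helper name `simple_random_sample` are made up for the example. The standard library's `random.sample` draws without replacement, so every set of `n` individuals is equally likely.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct individuals from the sampling frame, each subset equally likely."""
    rng = random.Random(seed)   # a seed makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical sampling frame: a numbered list of 200 students
frame = [f"student_{i:03d}" for i in range(1, 201)]

sample = simple_random_sample(frame, n=10, seed=42)
print(sample)

# Sampling variability: a different seed gives a different, equally valid sample
print(simple_random_sample(frame, n=10, seed=43))
```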

Stratified Sampling

Stratified sampling divides the population into homogeneous groups (strata) and selects a random sample from each stratum. This method increases precision and allows for subgroup analysis.

  • Benefits: Reduced sampling variability, more accurate estimates, flexibility in sampling methods.

  • Example: National surveys stratified by province.
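One way to picture stratified sampling is as a separate simple random sample inside each stratum. The sketch below assumes proportional allocation (the same sampling fraction in every stratum), which is only one of several possible designs; the province names, population sizes, and the function `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Take a simple random sample of the given fraction within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)   # group units by stratum
    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))  # at least one unit per stratum
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population of (person_id, province) pairs
people = (
    [(i, "Ontario") for i in range(600)]
    + [(i, "Quebec") for i in range(600, 900)]
    + [(i, "Manitoba") for i in range(900, 1000)]
)

by_province = stratified_sample(people, stratum_of=lambda p: p[1],
                                fraction=0.05, seed=7)
for province, chosen in by_province.items():
    print(province, len(chosen))
```

Because every stratum is guaranteed to appear in the sample in proportion to its size, estimates no longer vary with how many people from each province happen to be drawn.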

Cluster and Multistage Sampling

Cluster sampling splits the population into clusters, selects clusters at random, and samples all or some individuals within selected clusters. Multistage sampling combines several methods, often used in large-scale surveys.

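  • Cluster Sampling: Useful when a complete sampling frame is unavailable or when reaching scattered individuals is too costly. Each cluster should resemble the population as a whole, unlike strata, which are internally homogeneous and differ from one another.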

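  • Multistage Sampling: Involves multiple stages of random selection, for example choosing clusters at random and then taking a random sample within each chosen cluster. This reduces the cost of large surveys.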
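The difference between one-stage cluster sampling and a simple two-stage design can be sketched as follows. The schools, their sizes, and both function names are hypothetical, chosen only to illustrate the idea.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """One-stage cluster sample: pick clusters at random, keep everyone in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sorted keys for reproducibility
    return {name: list(clusters[name]) for name in chosen}

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Two-stage sample: pick clusters at random, then an SRS inside each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical population: 8 schools (clusters) of 30 students each
schools = {
    f"school_{c}": [f"school_{c}_student_{i}" for i in range(30)]
    for c in "ABCDEFGH"
}

one_stage = cluster_sample(schools, n_clusters=3, seed=3)
two_stage = two_stage_sample(schools, n_clusters=3, n_per_cluster=5, seed=3)

print({name: len(members) for name, members in one_stage.items()})
print({name: len(members) for name, members in two_stage.items()})
```

Only the list of schools is needed up front; student lists are required just for the schools that are selected, which is where the practical saving comes from.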

[Figure: Stratified and cluster sampling diagram]

Systematic Sampling

Systematic sampling selects individuals at regular intervals from a list, starting from a randomly chosen point. It is efficient but requires assurance that the list order does not introduce bias.

  • Example: Surveying every 10th person on a list.

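  • Justification: The order of the list must not be associated with the variable of interest. If the list has a pattern that repeats at the sampling interval, the sample will be biased.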
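A minimal sketch of systematic sampling, assuming a list of 100 numbered people and an interval of 10 (both made up for illustration): choose a random starting position within the first interval, then take every 10th entry from there.

```python
import random

def systematic_sample(frame, interval, seed=None):
    """Select every `interval`-th individual, starting from a random position."""
    rng = random.Random(seed)
    start = rng.randrange(interval)   # random start among the first `interval` entries
    return frame[start::interval]

# Hypothetical list of 100 people, identified by number
frame = list(range(1, 101))

sample = systematic_sample(frame, interval=10, seed=0)
print(sample)
```

The random start is the only source of randomness, so there are just 10 possible samples here. That is why the order of the list matters so much for this method.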

Populations, Parameters, and Statistics

Definitions and Notation

Statistical models use parameters to represent population characteristics. Sample statistics estimate these parameters.

  • Population Parameter: Key number describing the population (e.g., mean, proportion).

  • Sample Statistic: Summary measure from the sample used to estimate the parameter.

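  • Notation: Parameters are usually written with Greek letters and statistics with Latin letters. The proportion is an exception: the parameter is written p and the statistic p̂.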

| Name | Statistic | Parameter |
|---|---|---|
| Mean | ȳ | μ |
| Standard deviation | s | σ |
| Correlation | r | ρ |
| Regression coefficient | b | β |
| Proportion | p̂ | p |
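To make the distinction concrete, the sketch below builds a made-up population of heights (an assumption for illustration, not data from the course), computes the parameters μ, σ, and p from the whole population, and then computes the corresponding statistics ȳ, s, and p̂ from a random sample of 500. In practice the parameters are unknown, and only the statistics can be computed.

```python
import random
import statistics

rng = random.Random(11)

# Hypothetical population: 50,000 heights in cm
population = [rng.gauss(170, 10) for _ in range(50_000)]

# Parameters: computed from the entire population
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)               # population SD (divides by N)
p = sum(h > 180 for h in population) / len(population)

# Statistics: computed from a simple random sample
sample = rng.sample(population, 500)
y_bar = statistics.fmean(sample)
s = statistics.stdev(sample)                        # sample SD (divides by n - 1)
p_hat = sum(h > 180 for h in sample) / len(sample)

print(f"mean:       y_bar = {y_bar:.2f}   mu    = {mu:.2f}")
print(f"std dev:    s     = {s:.2f}   sigma = {sigma:.2f}")
print(f"proportion: p_hat = {p_hat:.3f}   p     = {p:.3f}")
```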

Common Sampling Mistakes and Biases

Types of Sampling Mistakes

Several common errors can invalidate survey results by introducing bias.

  • Voluntary Response Sample: Individuals choose to participate, leading to voluntary response bias.

  • Convenience Sample: Sampling individuals who are easy to reach, often unrepresentative.

  • Bad Sampling Frame: Incomplete or inaccurate list of the population.

  • Undercoverage: Some population segments are not sampled or are underrepresented.

  • Nonresponse Bias: Selected individuals do not respond, and their characteristics differ from respondents.

  • Response Bias: Survey design or respondent behavior influences answers.
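A short simulation can show why self-selection is a problem. The response rates below are invented purely for illustration: in a population where true support is about 50%, opponents are assumed to be three times as likely as supporters to volunteer an answer. The function name `simulate_poll` is likewise hypothetical.

```python
import random

def simulate_poll(n_population=100_000, true_support=0.5, seed=5):
    """Compare a voluntary response sample with an SRS of the same size."""
    rng = random.Random(seed)
    # True = supports the proposal, False = opposes it
    population = [rng.random() < true_support for _ in range(n_population)]

    # Assumed (made-up) response rates: 2% of supporters, 6% of opponents volunteer
    volunteers = [
        supports for supports in population
        if rng.random() < (0.02 if supports else 0.06)
    ]

    # A simple random sample with the same number of respondents
    srs = rng.sample(population, len(volunteers))

    truth = sum(population) / len(population)
    volunteer_estimate = sum(volunteers) / len(volunteers)
    srs_estimate = sum(srs) / len(srs)
    return truth, volunteer_estimate, srs_estimate

truth, volunteer_estimate, srs_estimate = simulate_poll()
print(f"true support:       {truth:.3f}")
print(f"voluntary response: {volunteer_estimate:.3f}")
print(f"simple random:      {srs_estimate:.3f}")
```

With these assumed rates, the voluntary response estimate is expected to land near 25% rather than 50%, and collecting more volunteers would not move it closer. The random sample of the same size should stay close to the true value.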

Comparison of Bias Types

The following table summarizes key differences between response bias, nonresponse bias, and voluntary response bias.

| Dimension | Response Bias | Nonresponse Bias | Voluntary Response Bias |
|---|---|---|---|
| Basic definition | Inaccurate or misleading answers | Selected individuals do not respond | Individuals choose whether to participate |
| Who is involved | People who respond | People who do not respond | Only people who choose to respond |
| Main cause | Survey design or respondent behavior | Missing responses | Self-selection |
| Sampling frame | Usually well-defined | Well-defined, but responses are incomplete | Poorly defined or unclear |
| Key statistical problem | Responses do not reflect true values | Respondents differ from nonrespondents | Respondents differ from population |
| Representativeness | Sample may be representative, answers biased | Sample becomes unrepresentative | Sample is unrepresentative |
| Typical reasons | Social desirability, sensitive questions, leading wording | Refusal, inaccessibility, lack of interest | Strong opinions, high motivation, personal stake |
| Example | Underreporting illegal behavior | Busy students don't respond | Online poll with strong opinions |
| Effect on results | Measurement is biased | Estimates are biased | Results invalid for generalization |
| Can occur with other biases? | Yes | Yes | Yes |
| How to reduce | Anonymous surveys, neutral wording | Follow-ups, incentives | Very difficult; requires random sampling |
| Type of bias | Measurement bias | Selection bias | Selection bias |

Best Practices for Valid Surveys

Designing a Valid Survey

To ensure survey results are valid and useful, follow these best practices:

  • Define Objectives: Clearly state what you want to know.

  • Use the Right Sampling Frame: Ensure the list covers the population of interest.

  • Tune Your Instrument: Ask specific, quantitative questions; avoid vague or leading wording.

  • Pilot Test: Test the survey with a small group to identify issues.

  • Report Methods: Always describe sampling methods in detail.

Summary of Key Concepts

  • Sampling allows inference about populations without studying every individual.

  • Randomization and sample size are critical for representativeness and precision.

  • Multiple sampling methods exist: SRS, stratified, cluster, systematic, multistage.

  • Biases can invalidate results; recognize and avoid voluntary response, convenience, bad frames, undercoverage, nonresponse, and response bias.

  • Use best practices in survey design for valid, reliable results.
