Chapter 9: Sample Surveys – Principles, Methods, and Biases

Sample Surveys

Why Do We Need Sampling?

In statistics, sampling is essential for studying large populations efficiently. A population is the complete collection of individuals under study, while a census attempts to gather data from every member of the population. However, censuses are often costly, time-consuming, or impractical. Instead, we study a sample, a subset of individuals selected from the population, to make inferences about the whole.

  • Population: The entire group of interest.

  • Sample: A subset of the population, ideally representative.

  • Parameter: A numerical summary describing a population (e.g., mean tuition of all students).

  • Statistic: A numerical summary describing a sample (e.g., mean tuition of sampled students).

  • Population parameter: A parameter that is part of a model for the population.

  • Statistics as estimates: We use statistics to estimate population parameters.

Example: To estimate how much UBC students pay for tuition, we might interview 500 students. Here, the population is all UBC students, the sample is the 500 interviewed, the parameter is the true mean tuition for all students, and the statistic is the mean tuition from the sample.
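
The distinction between a parameter and a statistic in the example above can be sketched with a small simulation. This is only an illustration: the tuition figures are simulated, and the population size of 40,000, the mean of 6000, and the spread of 1500 are assumptions, not real UBC data.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: simulated tuition amounts (in dollars) for 40,000 students.
population = [rng.gauss(6000, 1500) for _ in range(40_000)]

# Parameter: the mean over the whole population (unknown in a real survey).
parameter = statistics.mean(population)

# Statistic: the mean over a simple random sample of 500 students.
sample = rng.sample(population, 500)
statistic = statistics.mean(sample)

print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
```

In practice only the statistic is observable; the simulation can compute the parameter only because the whole population was generated.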

Randomization – The Key to Obtaining a Representative Sample

Randomization is crucial in sampling and inference. It helps ensure that the sample reflects the population's characteristics, minimizing bias.

  • Randomization: Assigns each individual an equal chance of selection, reducing systematic differences.

  • Representative sample: A sample whose characteristics closely match those of the population.

  • Bias: Systematic deviation from the true population parameter due to non-representative sampling.

It Is the Sample Size That Matters

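The precision of an estimate depends mainly on the sample size, not on the size of the population or the fraction of the population sampled (as long as the population is much larger than the sample). Larger samples tend to yield more precise estimates, but they are only trustworthy if the sample is representative.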

  • Sample size: The number of individuals in the sample. Larger sizes reduce sampling variability.

  • Sampling variability: The natural variation in sample statistics from sample to sample.

  • Key point: A large but biased sample is still unreliable.
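
One way to check the claim that sample size, rather than population size, drives sampling variability is a quick simulation. The sketch below is a rough illustration under assumed values: the two population sizes, the normal distribution with mean 50 and standard deviation 10, and the helper name `spread_of_sample_means` are all hypothetical.

```python
import random
import statistics

def spread_of_sample_means(population, n, reps, rng):
    """Standard deviation of the sample mean across `reps` simple random samples."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(reps)]
    return statistics.stdev(means)

rng = random.Random(0)

# Two hypothetical populations with the same distribution but very different sizes.
small_pop = [rng.gauss(50, 10) for _ in range(10_000)]
large_pop = [rng.gauss(50, 10) for _ in range(200_000)]

results = {}
for label, pop in [("N=10,000", small_pop), ("N=200,000", large_pop)]:
    for n in (25, 400):
        results[(label, n)] = spread_of_sample_means(pop, n, 300, rng)
        print(f"{label:>10}  n={n:<4} spread of sample means = {results[(label, n)]:.2f}")
```

Under these assumptions, the spread for a given sample size should come out roughly the same for both populations, and it should be much smaller for n = 400 than for n = 25.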

How to Sample?

Key Definitions

  • Sampling frame: The list of individuals from which the sample is drawn. Must accurately reflect the population.

  • Sampling variability: Differences in sample statistics due to random selection. Larger samples reduce this variability.

Example: Drawing two different samples from the same population will likely yield different results due to sampling variability.

Sampling Methods

  • Simple Random Sampling (SRS): Each individual and each possible sample of size $n$ has an equal chance of being selected.

  • Stratified Sampling: The population is divided into strata (groups sharing a characteristic), and SRS is performed within each stratum. Results are combined for analysis.

  • Proportional Allocation: The size of each SRS is proportional to the size of the stratum in the population.

  • Cluster Sampling: The population is divided into clusters (natural groupings). A random sample of clusters is selected, and all individuals in chosen clusters are sampled (one-stage), or a further SRS is performed within clusters (two-stage).

  • Multistage Sampling: Combines multiple sampling methods or stages, such as two-stage cluster sampling.

  • Systematic Sampling: Selects every $k$th individual from the sampling frame. Effective if the list has no hidden order.
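
As a minimal sketch of the two simplest methods in the list above, the following code draws a simple random sample and a systematic sample from a made-up sampling frame. The frame of 100 student labels, the function names, and the choice of a random starting position for the systematic sample are illustrative assumptions.

```python
import random

def simple_random_sample(frame, n, rng):
    """SRS without replacement: every subset of size n is equally likely."""
    return rng.sample(frame, n)

def systematic_sample(frame, k, rng):
    """Pick a random start among the first k positions, then take every k-th individual."""
    start = rng.randrange(k)
    return frame[start::k]

rng = random.Random(1)

# Hypothetical sampling frame of 100 students.
frame = [f"student_{i:03d}" for i in range(100)]

srs = simple_random_sample(frame, 10, rng)
sys_sample = systematic_sample(frame, 10, rng)

print("SRS:       ", srs)
print("Systematic:", sys_sample)
```

The systematic sample is evenly spaced through the list, which is why an ordering or periodic pattern in the frame can bias it, while the SRS has no such structure.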

Sampling Methods Comparison Table

| Method | Description | Advantages | Disadvantages |
| --- | --- | --- | --- |
| Simple Random Sampling (SRS) | Randomly select individuals; each has an equal chance | Unbiased, easy to analyze | May be impractical for large populations |
| Stratified Sampling | Divide into strata, sample within each | Reduces variability, ensures representation | Requires knowledge of strata |
| Cluster Sampling | Divide into clusters, sample clusters | Cost-efficient, practical | May increase variability if clusters differ markedly from one another |
| Systematic Sampling | Select every $k$th individual | Simple, quick | Risk of bias if list is ordered |
| Multistage Sampling | Combine multiple methods/stages | Flexible, practical for large populations | Complex to design and analyze |
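
Stratified sampling with proportional allocation and one-stage cluster sampling can be sketched as follows. This is an illustration only: the programs, residence halls, group sizes, and function names are hypothetical, and the rounding used for proportional allocation can make the total differ slightly from the requested sample size when the proportions are not whole numbers.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(frame, stratum_of, n, rng):
    """SRS within each stratum, with sizes proportional to the stratum sizes."""
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        n_h = round(n * len(members) / len(frame))  # proportional allocation
        sample.extend(rng.sample(members, n_h))
    return sample

def one_stage_cluster_sample(clusters, m, rng):
    """Randomly choose m clusters and include every individual in them."""
    chosen = rng.sample(sorted(clusters), m)
    return [unit for name in chosen for unit in clusters[name]]

rng = random.Random(3)

# Hypothetical frame: (student id, program) pairs with 600 / 300 / 100 students.
frame = [(i, "arts") for i in range(600)]
frame += [(i, "science") for i in range(600, 900)]
frame += [(i, "engineering") for i in range(900, 1000)]

strat = stratified_sample(frame, lambda unit: unit[1], 50, rng)
strat_counts = Counter(program for _, program in strat)
print(strat_counts)

# Hypothetical clusters: 10 residence halls of 20 students each.
clusters = {f"hall_{h}": [f"hall_{h}_student_{s}" for s in range(20)] for h in range(10)}
cluster_sample = one_stage_cluster_sample(clusters, 3, rng)
print(len(cluster_sample))
```

With these made-up sizes, the stratified sample of 50 contains 30, 15, and 5 students from the three programs, mirroring their 60% / 30% / 10% shares of the frame, and the cluster sample contains all 60 students from the 3 chosen halls.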

Bad Sampling Procedures, Biases, and More

Sampling must be carefully designed to avoid bias. Common sources of bias include:

  • Undercoverage: Some groups are excluded or underrepresented in the sampling frame.

  • Convenience Sampling: Individuals are selected based on ease of access, not randomness.

  • Voluntary Response Bias: Individuals with strong opinions are more likely to participate, skewing results.

  • Nonresponse Bias: Those who do not respond may differ systematically from respondents.

  • Response Bias: Survey responses are influenced by question wording, misunderstanding, or reluctance to answer truthfully.
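
To see how one of these biases can overwhelm sample size, the sketch below simulates voluntary response bias. Everything here is assumed for illustration: a population of 100,000 in which about 60% support a policy, and response probabilities of 0.20 for opponents and 0.02 for supporters.

```python
import random

rng = random.Random(7)

# Hypothetical population of 100,000 people; True means "supports the policy".
population = [rng.random() < 0.60 for _ in range(100_000)]
true_share = sum(population) / len(population)

# Voluntary response: assume opponents are far more motivated to respond
# (probability 0.20) than supporters (probability 0.02).
volunteers = [x for x in population if rng.random() < (0.02 if x else 0.20)]
voluntary_share = sum(volunteers) / len(volunteers)

# A much smaller simple random sample for comparison.
srs = rng.sample(population, 400)
srs_share = sum(srs) / len(srs)

print(f"true share of supporters:      {true_share:.3f}")
print(f"voluntary response (n={len(volunteers)}): {voluntary_share:.3f}")
print(f"simple random sample (n=400):  {srs_share:.3f}")
```

Under these assumptions, the voluntary-response sample is many times larger than the SRS yet lands far from the true share, while the small random sample lands close to it.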

Types of Bias Table

| Type of Bias | Description | Example |
| --- | --- | --- |
| Undercoverage | Excludes certain groups from sampling frame | Surveying only library visitors to estimate student library use |
| Convenience Sampling | Samples based on accessibility | Surveying neighbors for housing prices |
| Voluntary Response Bias | Participants self-select, often with strong opinions | Call-in polls |
| Nonresponse Bias | Non-respondents differ from respondents | Mail-in questionnaires |
| Response Bias | Responses influenced by question phrasing or reluctance | Surveying about sensitive behaviors (e.g., impaired driving) |

Key Formulas and Concepts

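  • Sample Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where $n$ is the sample size.

  • Population Mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$, where $N$ is the population size.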

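  • Sampling Variability: For an SRS of size $n$ from a large population with standard deviation $\sigma$, the standard deviation of the sample mean $\bar{x}$ is $SD(\bar{x}) = \sigma / \sqrt{n}$.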
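
As a small worked illustration of these quantities, consider a made-up sample of four tuition amounts (in thousands of dollars). The data values and the population standard deviation $\sigma = 2$ are assumptions chosen to keep the arithmetic simple.

```latex
% Hypothetical sample of n = 4 tuition amounts (in thousands of dollars): 5, 6, 7, 6
\bar{x} = \frac{1}{4}(5 + 6 + 7 + 6) = \frac{24}{4} = 6

% Assuming an SRS from a large population with \sigma = 2:
SD(\bar{x}) = \frac{\sigma}{\sqrt{n}} = \frac{2}{\sqrt{4}} = 1

% Quadrupling the sample size to n = 16 halves the variability:
SD(\bar{x}) = \frac{2}{\sqrt{16}} = 0.5
```

The last two lines show why larger samples give more stable estimates: the spread of the sample mean shrinks in proportion to $1/\sqrt{n}$.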

Additional info: These notes expand on the definitions and examples provided in the original slides, adding context for bias types and sampling methods, and including key formulas for statistical estimation.
