Chapter 10: Sample Surveys – The Three Big Ideas of Sampling

Sample Surveys

Introduction to Sample Surveys

Sample surveys are a fundamental tool in statistics for learning about large populations by examining a smaller, manageable group. This chapter introduces the three essential ideas behind effective sampling: examining a part of the whole, randomization, and the importance of sample size.

The Three Big Ideas of Sampling

Examining a Part of the Whole

To understand a population, we often collect data from a smaller group called a sample. The goal is to use the sample to make inferences about the entire population.

Population: The entire group of individuals or instances about whom we hope to learn.
Sample: The subset of the population from whom we actually collect data.
It is usually impractical or impossible to collect data from the entire population.
Examples of samples include telephone surveys, internet surveys, medical studies, and experiments such as crash dummy tests or weather studies.
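The relationship between a population and a sample can be sketched in a few lines of Python. This is only an illustration: the commute-time population below is invented, and in practice the population mean would be unknown.

```python
import random
import statistics

rng = random.Random(10)  # fixed seed so the run is repeatable

# Hypothetical population: commute times in minutes for 100,000 people (made-up values).
population = [rng.uniform(5, 65) for _ in range(100_000)]

# The sample: 500 people chosen at random, i.e. 0.5% of the population.
sample = rng.sample(population, 500)

population_mean = statistics.fmean(population)  # normally unknown
sample_mean = statistics.fmean(sample)          # what we actually observe

print(f"population mean: {population_mean:.1f} min")
print(f"sample mean:     {sample_mean:.1f} min (from {len(sample)} of {len(population)})")
```

The sample mean is used as an estimate of the population mean, even though only a small fraction of the population was measured.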

Bias in Sampling

A major challenge in sampling is obtaining a sample that is representative of the population. Bias occurs when some characteristics of the population are over- or under-emphasized in the sample, leading to flawed results.

Bias: Systematic error that results in an unrepresentative sample.
Examples: Overlooking subgroups (e.g., the homeless in surveys), or favoring others (e.g., internet users in online surveys).
Conclusions from a biased sample cannot be trusted to generalize to the population.
There is usually no way to correct bias after the sample is drawn, and collecting a larger sample with the same flawed method does not help.

Sampling Bias

Sampling bias arises from the method of selecting the sample, causing it to differ from the population in relevant ways. This undermines the validity of any generalizations.

Bias vs. Variability

Bias refers to systematic error that pushes results in the same direction sample after sample, while variability refers to the natural differences that occur from sample to sample. The ideal sampling method has low bias and low variability.

Variability: The extent to which sample results differ from one another.
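The distinction can be made concrete with a small simulation. The sketch below assumes an invented population of numeric values and an invented "reachable" subgroup standing in for a flawed sampling method; it draws many samples with each method and reports how far the estimates are off on average (bias) and how much they differ from one another (variability).

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population of 20,000 values (arbitrary units, made up for illustration).
population = [rng.gauss(50, 15) for _ in range(20_000)]
true_mean = statistics.fmean(population)

# A flawed sampling frame: only individuals with values above 45 can be reached.
reachable = [x for x in population if x > 45]

def repeated_estimates(frame, n, repeats=500):
    """Sample means from `repeats` independent simple random samples of size n."""
    return [statistics.fmean(rng.sample(frame, n)) for _ in range(repeats)]

def summarize(label, estimates):
    """Print and return (bias, variability) for a list of sample means."""
    bias = statistics.fmean(estimates) - true_mean
    spread = statistics.stdev(estimates)
    print(f"{label:22s} bias={bias:6.2f}  variability={spread:5.2f}")
    return bias, spread

bias_small, spread_small = summarize("random, n=25", repeated_estimates(population, 25))
bias_large, spread_large = summarize("random, n=400", repeated_estimates(population, 400))
bias_flawed, spread_flawed = summarize("flawed frame, n=400", repeated_estimates(reachable, 400))
```

Under these assumptions, the random samples should center on the true mean, with the larger ones clustering more tightly, while the flawed frame gives tightly clustered estimates that are consistently too high.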

Targets illustrating bias and variability

Historical Example: The Literary Digest Poll

In the 1936 U.S. presidential campaign, the Literary Digest conducted a massive mail-in survey, sending ballots to addresses drawn largely from telephone directories and automobile registration lists. The survey predicted that Landon would win, but Roosevelt won in a landslide. The survey was biased because its mailing list underrepresented low-income voters, who were less likely to own a phone or a car in 1936.

Lesson: Even large samples can be misleading if they are not representative.
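The lesson can be illustrated with a toy electorate. All numbers below are invented for illustration and are not historical data: the sketch simply assumes that phone owners support candidate A at a lower rate than everyone else, then compares a very large sample of phone owners with a small random sample of all voters.

```python
import random

rng = random.Random(1936)  # fixed seed so the run is repeatable

# Hypothetical electorate; each voter is a pair (has_phone, supports_A).
voters = (
    [(True, True)] * 32_000 + [(True, False)] * 48_000      # phone owners: 40% support A
    + [(False, True)] * 90_000 + [(False, False)] * 30_000  # no phone: 75% support A
)
true_support = sum(s for _, s in voters) / len(voters)

def support(sample):
    """Proportion of voters in `sample` who support candidate A."""
    return sum(s for _, s in sample) / len(sample)

phone_owners = [v for v in voters if v[0]]
big_biased = rng.sample(phone_owners, 50_000)  # huge sample, flawed frame
small_random = rng.sample(voters, 1_000)       # small sample, random selection

print(f"true support for A:        {true_support:.3f}")
print(f"50,000 phone owners:       {support(big_biased):.3f}")
print(f"1,000 randomly chosen:     {support(small_random):.3f}")
```

In this setup the large sample lands near 40% and calls the election for the wrong candidate, while the much smaller random sample lands close to the true figure.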

Literary Digest poll predicting Landon over Roosevelt

Randomization

Randomization is a key principle in sampling. By selecting individuals at random, we protect against both known and unknown sources of bias, increasing the likelihood that the sample represents the population.

Randomization: The use of chance to select a sample. In the simplest case, a simple random sample, every individual has an equal chance of being chosen.
Random selection does not systematically favor any group, so on average the sample resembles the population.
It is impossible to match the sample to the population on every characteristic, so randomization is essential.
Pollsters often use random digit dialing and other random selection methods to choose participants.
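A short sketch shows why random selection works without matching the sample to the population by hand. The three characteristics and their rates below are hypothetical; the point is that the sample is drawn with no reference to any of them, yet tends to come out close to the population on all of them.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical yes/no characteristics and their population rates (invented values).
traits = {"female": 0.5, "left_handed": 0.2, "owns_car": 0.65}
population = [
    {name: rng.random() < rate for name, rate in traits.items()}
    for _ in range(50_000)
]

def share(group, name):
    """Proportion of people in `group` who have the characteristic `name`."""
    return sum(person[name] for person in group) / len(group)

# Simple random sample: every person has the same chance of selection,
# and no characteristic is used to decide who gets in.
sample = rng.sample(population, 2_000)

for name in traits:
    print(f"{name:12s} population={share(population, name):.3f}  sample={share(sample, name):.3f}")
```

The same would hold for characteristics that were never recorded, which is what protects against unknown sources of bias.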

Soup ladle as analogy for sampling

Sample Size

The size of the sample, not the size of the population, determines how precise the results are. Larger samples vary less from one sample to the next, and as long as the population is much larger than the sample, the population size has almost no effect on the sample size needed for reliable estimates.

Sample Size (n): The number of individuals in the sample.
For a given level of precision, the required sample size is essentially the same regardless of the population size: a random sample of 100 students describes a large university about as precisely as a random sample of 100 Americans describes the nation.
Larger samples reduce variability but do not eliminate bias.
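This claim can be checked by simulation. The sketch below is a minimal illustration under assumed values (a 50% "yes" rate, population sizes of 10,000 and 1,000,000); the function name and parameters are hypothetical. It measures how much the sample proportion varies across repeated random samples.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

def spread_of_estimates(pop_size, n, repeats=1_000, p=0.5):
    """Standard deviation of the sample proportion over repeated simple random samples.

    Individuals are the integers 0..pop_size-1, and the first p*pop_size of them
    count as 'yes'. Sampling from range() avoids building a large list.
    """
    cutoff = int(p * pop_size)
    estimates = [
        sum(i < cutoff for i in rng.sample(range(pop_size), n)) / n
        for _ in range(repeats)
    ]
    return statistics.stdev(estimates)

small_pop = spread_of_estimates(10_000, 100)     # n=100 from a population of 10,000
large_pop = spread_of_estimates(1_000_000, 100)  # n=100 from a population of 1,000,000
bigger_n = spread_of_estimates(1_000_000, 400)   # n=400 from a population of 1,000,000

print(f"n=100, N=10,000:     spread={small_pop:.4f}")
print(f"n=100, N=1,000,000:  spread={large_pop:.4f}")
print(f"n=400, N=1,000,000:  spread={bigger_n:.4f}")
```

The first two spreads should be nearly identical even though one population is 100 times larger, while quadrupling the sample size should cut the spread roughly in half.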

Census vs. Sample

A census attempts to include every individual in the population. While this may seem ideal, it is often impractical due to cost, time, and the dynamic nature of populations.

Census: A survey that attempts to include the entire population.
Problems: Difficult to complete, expensive, populations change over time, and some groups are hard to reach (e.g., the homeless).
Sampling is usually more practical and efficient.

Summary Table: Key Concepts in Sampling

Concept	Definition	Importance
Population	Entire group of interest	Target for inference
Sample	Subset of the population	Used to estimate population characteristics
Bias	Systematic error in sampling	Leads to untrustworthy results
Randomization	Selection by chance	Reduces bias, increases representativeness
Sample Size (n)	Number of individuals in sample	Sets the precision of estimates; population size does not
Census	Survey of entire population	Often impractical

Key Takeaways

Sampling allows us to learn about populations efficiently and effectively.
Avoiding bias and using randomization are essential for trustworthy results.
Sample size, not population size, determines the reliability of estimates.
Censuses are rarely practical; well-designed samples are preferred.