Chapter 1: Collecting Data – Sampling Methods in Statistics

Chapter 1: Collecting Data

Part A: Sampling Methods

This section introduces foundational concepts in statistics related to collecting data, focusing on sampling methods and the variables involved in statistical studies. Understanding these concepts is essential for designing valid studies and interpreting results accurately.

Main Concepts and Definitions

  • Population: The entire set of individuals or items of interest in a study.

  • Parameter: A numerical characteristic of a population, such as the population mean ($\mu$) or population proportion ($p$).

  • Sample: A subset of the population selected for analysis.

  • Statistic: A numerical characteristic calculated from a sample, such as the sample mean ($\bar{x}$) or sample proportion ($\hat{p}$).

  • Sampling Frame (Sample Frame): The list of individuals from which the sample is actually drawn. Ideally it covers the entire population; anyone missing from the frame has no chance of being included in the sample.

  • Sampling Method: The procedure used to select the sample from the population.

  • Explanatory Variable: The independent variable (often denoted as $x$) that is hypothesized to influence the response variable.

  • Response Variable: The dependent variable (often denoted as $y$) that is measured as the outcome of interest.

  • Confounding Variable: A variable, other than the explanatory variable, that is related to both the explanatory variable and the response variable. Because its effect is mixed with that of the explanatory variable, it can hide or distort the relationship between the two.

  • Research Question: The central question the study aims to answer, typically involving the relationship between variables or the estimation of a parameter.

Example: Smoking and Lung Capacity

Consider the research question: "Does smoking affect lung capacity?" or "Is smoking associated with lower lung capacity?" In this context:

  • Explanatory Variable: Smoking status

  • Response Variable: Lung capacity

  • Confounding Variables: Age, gender, lifestyle

Confounding variables such as age, gender, and lifestyle may influence lung capacity independently of smoking status, making it important to account for them in the study design.
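One common way to account for a confounder is to compare groups within levels of that confounder. The sketch below uses a small set of made-up records (the age groups and lung capacity values are hypothetical, chosen only for illustration) in which smokers happen to be older on average. The names `records` and `mean_gap` are assumptions for this example, not part of the course material.

```python
from statistics import mean

# Hypothetical toy records: (age_group, smoker, lung_capacity_in_liters).
# In this made-up data, most smokers are in the "old" group.
records = [
    ("young", False, 5.0), ("young", False, 5.2),
    ("young", False, 4.8), ("young", False, 5.0),
    ("young", True, 4.7),
    ("old", False, 4.0),
    ("old", True, 3.6), ("old", True, 3.8),
    ("old", True, 3.7), ("old", True, 3.7),
]

def mean_gap(rows):
    """Mean lung capacity of non-smokers minus that of smokers."""
    non_smokers = [cap for _, smoker, cap in rows if not smoker]
    smokers = [cap for _, smoker, cap in rows if smoker]
    return mean(non_smokers) - mean(smokers)

# Crude comparison: ignores age entirely.
crude_gap = mean_gap(records)

# Stratified comparison: compare smokers and non-smokers of similar age.
gap_by_age = {
    group: mean_gap([row for row in records if row[0] == group])
    for group in ("young", "old")
}

print(round(crude_gap, 2))
print({group: round(gap, 2) for group, gap in gap_by_age.items()})
```

In this toy data the crude gap is about 0.9 liters, while the gap within each age group is about 0.3 liters. Most of the crude difference reflects age rather than smoking, which is exactly the distortion a confounding variable can cause.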

Exercise 1: Sampling Plan Analysis

This exercise applies sampling concepts in a social justice context. Students analyze a survey plan that recruits mall visitors to investigate health differences between wealthy and low-income individuals in the USA.

  • a) Research Question: "Are wealthy people in the USA healthier than lower income people in the USA?"

  • b) Explanatory and Response Variables:

    • Explanatory Variable: Income level (wealthy vs. low-income)

    • Response Variable: Health status (as measured by the survey)

  • c) Possible Confounding Variables: Age, gender, access to healthcare, education, geographic location, lifestyle choices

  • d) Sampling Method and Bias:

    • This plan uses a convenience sample, where participants are selected based on ease of access (mall visitors).

    • Potential Bias: Convenience sampling may not represent the broader population accurately. For example, people who visit malls may differ systematically from those who do not (e.g., in age, mobility, socioeconomic status).

    • Over/Under Representation: Certain groups (such as those who do not frequent malls, or those from rural areas) may be underrepresented, while urban or mobile individuals may be overrepresented.
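The over- and underrepresentation described above can be illustrated with a small simulation. This is a sketch under invented assumptions: the population split (700 mall visitors, 300 non-visitors), the health scores, and the names `population` and `mean_health` are all hypothetical and do not come from the exercise.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: 700 people who visit malls and 300 who do not.
# The health scores are made-up values, assumed to differ between groups.
population = (
    [{"visits_mall": True, "health": rng.gauss(70, 5)} for _ in range(700)]
    + [{"visits_mall": False, "health": rng.gauss(60, 5)} for _ in range(300)]
)

def mean_health(people):
    """Average health score of a group of people."""
    return mean(person["health"] for person in people)

population_mean = mean_health(population)  # the parameter

# Convenience sample: the first 200 mall visitors we can reach.
convenience = [p for p in population if p["visits_mall"]][:200]

# Simple random sample of the same size from the whole population.
srs = rng.sample(population, 200)

# How far each sample statistic lands from the parameter.
convenience_error = abs(mean_health(convenience) - population_mean)
srs_error = abs(mean_health(srs) - population_mean)

print(round(population_mean, 1))
print(round(mean_health(convenience), 1), round(mean_health(srs), 1))
```

Because the convenience sample contains no one from the non-visiting group, its mean is pulled toward the mall visitors' scores no matter how large the sample is. The simple random sample can still miss by chance, but it has no such systematic tilt.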

Types of Sampling Methods

The choice of sampling method determines how well a sample represents the population. Common types include:

  • Simple Random Sampling: Every member of the population has an equal chance of being selected, and every possible sample of a given size is equally likely.

  • Stratified Sampling: The population is divided into subgroups (strata) that share a characteristic, and a random sample is taken from each stratum.

  • Cluster Sampling: The population is divided into clusters, some clusters are randomly selected, and all individuals within chosen clusters are sampled.

  • Systematic Sampling: Every $k$th individual is selected from a list of the population, usually starting from a randomly chosen position among the first $k$.

  • Convenience Sampling: Individuals are selected based on ease of access, which may introduce bias.
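The four probability-based methods above can be sketched in a few lines of Python using the standard `random` module. This is a minimal illustration, not a survey-grade implementation: the function names, the population of 100 ID numbers, and the rules assigning IDs to strata and clusters are all assumptions made for the example.

```python
import random

def simple_random_sample(population, n, rng):
    """Draw n distinct units; every subset of size n is equally likely."""
    return rng.sample(population, n)

def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Group units into strata, then draw a simple random sample in each."""
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample

def cluster_sample(population, cluster_of, n_clusters, rng):
    """Randomly pick whole clusters and keep every unit inside them."""
    clusters = {}
    for unit in population:
        clusters.setdefault(cluster_of(unit), []).append(unit)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for label in chosen for unit in clusters[label]]

def systematic_sample(population, k, rng):
    """Take every k-th unit after a random start among the first k."""
    start = rng.randrange(k)
    return population[start::k]

# Hypothetical population: 100 ID numbers.
population = list(range(100))
rng = random.Random(0)  # fixed seed so the sketch is reproducible

srs = simple_random_sample(population, 10, rng)
# Assumed strata: four groups defined by ID modulo 4.
strat = stratified_sample(population, lambda u: u % 4, 3, rng)
# Assumed clusters: ten blocks of ten consecutive IDs.
clus = cluster_sample(population, lambda u: u // 10, 2, rng)
syst = systematic_sample(population, 10, rng)

print(len(srs), len(strat), len(clus), len(syst))
```

Note the difference between the stratified and cluster functions: stratified sampling takes some units from every group, while cluster sampling takes every unit from some groups.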

Comparison of Sampling Methods

| Sampling Method | Description | Potential Bias |
| --- | --- | --- |
| Simple Random | Each individual has equal chance of selection | Low (if properly implemented) |
| Stratified | Population divided into strata, samples from each | Low (if strata are well-defined) |
| Cluster | Population divided into clusters, clusters sampled | Moderate (depends on cluster similarity) |
| Systematic | Every $k$th individual selected | Low to moderate (if list is random) |
| Convenience | Sample taken from easily accessible individuals | High (may not represent population) |

Key Formulas

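  • Sample Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where $n$ is the sample size.

  • Population Mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$, where $N$ is the population size.

  • Sample Proportion: $\hat{p} = \frac{x}{n}$, where $x$ is the number of individuals in the sample with the characteristic of interest.

  • Population Proportion: $p = \frac{X}{N}$, where $X$ is the number of individuals in the population with the characteristic of interest.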
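A mean is the sum of the values divided by how many there are, and a proportion is the count of individuals with a characteristic divided by the group size. The same arithmetic gives a parameter when applied to the whole population and a statistic when applied to a sample. The short sketch below shows this on made-up numbers; the data and the names `mean_of` and `proportion_of` are hypothetical.

```python
# Hypothetical population of N = 8 individuals: one numeric value each,
# plus a flag for whether the individual has some characteristic.
population_values = [2, 4, 4, 5, 7, 8, 9, 9]
population_trait = [True, False, True, True, False, False, True, False]

# Hypothetical sample of n = 3 individuals, identified by list position.
sample_indices = [1, 4, 6]
sample_values = [population_values[i] for i in sample_indices]
sample_trait = [population_trait[i] for i in sample_indices]

def mean_of(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)

def proportion_of(flags):
    """Number of True flags divided by the number of flags."""
    return sum(flags) / len(flags)

mu = mean_of(population_values)        # parameter: population mean
p = proportion_of(population_trait)    # parameter: population proportion
x_bar = mean_of(sample_values)         # statistic: sample mean
p_hat = proportion_of(sample_trait)    # statistic: sample proportion

print(mu, p, round(x_bar, 3), round(p_hat, 3))
```

Here the parameters are 6.0 and 0.5, while the statistics from this particular sample are about 6.667 and 0.333. A different sample would give different statistics, but the parameters stay fixed.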

Summary

  • Proper sampling methods are essential for valid statistical inference.

  • Convenience samples are easy to collect but often introduce bias.

  • Identifying explanatory, response, and confounding variables is crucial for study design.

  • Research questions should be clearly defined and guide the selection of variables and sampling methods.

