Chapter 1: Data Collection – Foundations of Statistical Practice

Introduction to Statistics and Data Collection

Define Statistics and Statistical Thinking

Statistics is the science of gathering, organizing, analyzing, and interpreting information to draw conclusions or answer questions. Statistical thinking involves understanding the process of data collection and recognizing the presence of variability in data.

  • Statistics: The science of collecting, organizing, analyzing, and interpreting data to make decisions.

  • Statistical Thinking: Involves considering the context, source, and variability of data.

  • Data: A fact or proposition used to draw a conclusion or make a decision.

Example: Recording the number of hours each student sleeps per night, along with their grades, to investigate whether sleep is related to academic performance.

Explain the Process of Statistics

The process of statistics involves several key steps to ensure valid and reliable results:

  1. Identify the research objective.

  2. Collect the data needed to answer the question.

  3. Describe the data.

  4. Draw conclusions from the data.

  • Population: The entire group of individuals to be studied.

  • Sample: A subset of the population selected for study.

  • Parameter: A numerical summary of a population.

  • Statistic: A numerical summary of a sample.

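Example: To estimate the proportion of all students at a university who have a job, a researcher surveys a sample of students. The proportion of all students with a job is a parameter; the proportion in the sample is a statistic.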
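
To make the difference between a parameter and a statistic concrete, here is a minimal Python sketch. The population of 10,000 students and its 40% employment rate are invented for illustration; only the definitions come from the notes.

```python
import random

# Hypothetical population: 1 = has a job, 0 = does not.
# The 40% employment rate is an assumption made up for this sketch.
population = [1] * 4000 + [0] * 6000

# Parameter: a numerical summary of the whole population.
parameter_p = sum(population) / len(population)

# Statistic: a numerical summary of a sample drawn from that population.
rng = random.Random(42)  # fixed seed so the run is repeatable
sample = rng.sample(population, 200)
statistic_p_hat = sum(sample) / len(sample)

print(f"parameter (population proportion): {parameter_p:.3f}")
print(f"statistic (sample proportion):     {statistic_p_hat:.3f}")
```

The parameter is fixed, while the statistic changes from sample to sample; rerunning with a different seed shows that variability.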

Types of Variables and Data

Distinguish between Qualitative and Quantitative Variables

Variables are characteristics of individuals within the population. They can be classified as:

  • Qualitative (Categorical) Variables: Describe an individual by placing them into a category or group (e.g., gender, eye color).

  • Quantitative Variables: Provide numerical measures of individuals (e.g., height, age).

Example: Classifying 'number of siblings' as quantitative and 'type of car owned' as qualitative.

Distinguish between Discrete and Continuous Variables

  • Discrete Variable: Has a finite or countable number of possible values (e.g., number of students in a class).

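  • Continuous Variable: Has infinitely many possible values that are not countable; it can take any value within an interval and is typically measured rather than counted (e.g., height, weight).

Example: Income recorded in dollars and cents is discrete, because its possible values (0.00, 0.01, 0.02, and so on) can be counted. Time spent studying is continuous, because it can take any value in an interval, limited only by the precision of the measurement.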

Determine the Level of Measurement of a Variable

  • Nominal Level: Values are names, labels, or categories (no order).

  • Ordinal Level: Values can be ranked or ordered, but differences are not meaningful.

  • Interval Level: Differences between values are meaningful, but there is no true zero (e.g., temperature in Celsius).

  • Ratio Level: Differences and ratios are meaningful, and there is a true zero (e.g., height, weight).

Example: Grade earned in Algebra (as a percentage) is at the ratio level of measurement.
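
The difference between the interval and ratio levels can be checked with a little arithmetic. The sketch below is only an illustration with made-up values: it converts Celsius temperatures to Fahrenheit and heights from centimetres to inches, then compares which comparisons survive the change of unit.

```python
import math

def celsius_to_fahrenheit(c):
    """Convert a Celsius temperature to Fahrenheit."""
    return c * 9 / 5 + 32

# Interval level: differences keep their meaning when the unit changes,
# but ratios do not, because zero degrees is an arbitrary point on the scale.
temps_c = [10, 20, 30]
temps_f = [celsius_to_fahrenheit(t) for t in temps_c]

diff_ratio_c = (temps_c[1] - temps_c[0]) / (temps_c[2] - temps_c[1])
diff_ratio_f = (temps_f[1] - temps_f[0]) / (temps_f[2] - temps_f[1])
value_ratio_c = temps_c[1] / temps_c[0]  # 20 / 10
value_ratio_f = temps_f[1] / temps_f[0]  # 68 / 50

# Ratio level: height has a true zero, so "twice as tall" survives
# a change of unit (centimetres to inches).
heights_cm = [90, 180]
heights_in = [h / 2.54 for h in heights_cm]
height_ratio_cm = heights_cm[1] / heights_cm[0]
height_ratio_in = heights_in[1] / heights_in[0]

print("equal gaps in both scales:", math.isclose(diff_ratio_c, diff_ratio_f))
print("temperature ratio in C:", value_ratio_c, "in F:", value_ratio_f)
print("height ratio preserved:", math.isclose(height_ratio_cm, height_ratio_in))
```

Because the temperature ratio depends on the unit, a statement such as "20 degrees is twice as warm as 10 degrees" is not meaningful, whereas "180 cm is twice as tall as 90 cm" is.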

Observational Studies and Experiments

Distinguish between an Observational Study and an Experiment

  • Observational Study: Measures the value of the response variable without attempting to influence the value of either the response or explanatory variables.

  • Experiment: Researcher assigns individuals to groups, intentionally manipulates an explanatory variable, and observes the effect on the response variable.

Example: Studying the effect of diet on energy by observing eating habits (observational) versus assigning diets (experiment).

Explain the Various Types of Observational Studies

  • Cross-sectional Studies: Collect information at a specific point in time.

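  • Case-control Studies: Retrospective; individuals with a certain characteristic (cases) are matched with individuals without it (controls), and researchers look back at existing records or ask subjects to recall past behavior.

  • Cohort Studies: Prospective; a group of individuals (a cohort) is identified and then followed over a long period, with characteristics recorded as they occur.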

Sampling Methods

Obtain a Simple Random Sample

Simple random sampling ensures that every possible sample of a given size n has an equal chance of being chosen.

  1. Number the individuals in the population.

  2. Use a random number generator or table to select the sample.
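
The two steps above can be sketched in Python using the standard library's random module. The function name simple_random_sample and the list of 30 student labels are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw a simple random sample of size n without replacement."""
    rng = random.Random(seed)
    # Step 1: number the individuals 1..N.
    numbered = dict(enumerate(population, start=1))
    # Step 2: let a random number generator pick n distinct numbers.
    chosen_numbers = rng.sample(range(1, len(population) + 1), n)
    return [numbered[i] for i in chosen_numbers]

# Hypothetical class list of 30 students.
students = [f"student_{i}" for i in range(1, 31)]
srs = simple_random_sample(students, 5, seed=1)
print(srs)
```

Passing a seed makes the draw reproducible, which is useful for checking work; leaving it as None gives a fresh sample on each run.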

Other Effective Sampling Methods

  • Stratified Sampling: Population divided into non-overlapping groups (strata), then a simple random sample is taken from each stratum.

  • Systematic Sampling: Select every kth individual from a population list, starting from an individual chosen at random among the first k.

  • Cluster Sampling: Select all individuals within a randomly selected collection or group (cluster).

  • Convenience Sampling: Individuals are easily obtained and not based on randomness (often leads to bias).
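
As a rough sketch of how the three random methods above differ, the Python below applies each one to the same invented population of 40 ID numbers. The function names, the two strata, and the four "dorm" clusters are all assumptions made for illustration.

```python
import random

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample of n_per_stratum from every stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def systematic_sample(population, k, rng):
    """Pick a random start among the first k individuals, then every kth."""
    start = rng.randrange(k)
    return population[start::k]

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep everyone inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(7)
population = list(range(1, 41))  # hypothetical ID numbers 1..40

# Hypothetical groupings of the same 40 individuals.
strata = {"first_year": population[:20], "senior": population[20:]}
clusters = {f"dorm_{d}": population[d * 10:(d + 1) * 10] for d in range(4)}

by_stratum = stratified_sample(strata, 3, rng)
every_fifth = systematic_sample(population, 5, rng)
whole_dorms = cluster_sample(clusters, 2, rng)

print("stratified:", by_stratum)
print("systematic:", every_fifth)
print("cluster:   ", whole_dorms)
```

Stratified sampling takes some individuals from every group, while cluster sampling takes every individual from some groups.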

Bias in Sampling

Explain the Sources of Bias in Sampling

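  • Sampling Bias: The technique used to obtain the sample tends to favor one part of the population over another (e.g., convenience sampling).

  • Nonresponse Bias: Individuals selected for the sample do not respond, and their views may differ from those of the individuals who do respond.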

  • Response Bias: Answers do not reflect the true feelings of the respondent (e.g., due to wording of questions).
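
A short calculation shows how nonresponse can distort a result. All numbers below are invented for illustration: a population in which 70% are satisfied, where satisfied people respond far less often than unsatisfied people.

```python
# Hypothetical survey; every number here is an assumption for illustration.
p_satisfied = 0.70             # true proportion satisfied in the population
respond_if_satisfied = 0.20    # response rate among satisfied individuals
respond_if_unsatisfied = 0.60  # response rate among unsatisfied individuals

# Share of the population that is satisfied and responds, and likewise
# the share that is unsatisfied and responds.
satisfied_responders = p_satisfied * respond_if_satisfied
unsatisfied_responders = (1 - p_satisfied) * respond_if_unsatisfied

# Proportion satisfied among the people who actually answer the survey.
observed = satisfied_responders / (satisfied_responders + unsatisfied_responders)

print(f"true proportion satisfied:    {p_satisfied:.4f}")
print(f"proportion among respondents: {observed:.4f}")
```

Under these assumed response rates the survey would report roughly 44% satisfied even though the true figure is 70%, and collecting more responses in the same way would not fix the gap.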

Design of Experiments

Describe the Characteristics of an Experiment

  • Experimental Unit (Subject): The person, object, or other well-defined item to which a treatment is applied.

  • Treatment: Any combination of values of the factors (explanatory variables).

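  • Control Group: A group that receives no treatment, a placebo, or the standard treatment, and serves as a baseline for comparison.

  • Placebo: A harmless, inactive treatment, made to look like the real one, that is given to the control group.

  • Blinding: Withholding knowledge of which treatment a subject receives. Single-blind: the subjects do not know which treatment they receive. Double-blind: neither the subjects nor the researchers in contact with them know.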

Explain the Steps in Designing an Experiment

  1. Identify the problem to be solved.

  2. Determine the factors that affect the response variable.

  3. Determine the number of experimental units.

  4. Determine the level of each factor.

  5. Conduct the experiment.

  6. Test the claim.

Explain the Completely Randomized Design

Each experimental unit is randomly assigned to a treatment group.
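
A completely randomized design can be sketched as a shuffle followed by dealing units into groups. The function name, the 12 subject labels, and the two treatment names below are hypothetical.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Randomly assign every unit to one treatment, keeping groups balanced."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # random order removes any pattern in the list
    groups = {t: [] for t in treatments}
    # Deal the shuffled units out to the treatments in turn.
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

subjects = [f"subject_{i}" for i in range(1, 13)]
crd_groups = completely_randomized(subjects, ["new_drug", "placebo"], seed=5)
for treatment, members in crd_groups.items():
    print(treatment, members)
```

Dealing in turn keeps the group sizes equal; assigning each unit by an independent coin flip would also be random but could leave the groups unbalanced.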

Explain the Matched-Pairs Design

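Experimental units are paired so that the two units in each pair are related or as similar as possible (for example, the same person measured before and after, or twins). There are two treatments: either one member of each pair is randomly assigned to each treatment, or the same unit receives both treatments in a random order.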
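
One way to sketch the random step in a matched-pairs design is a coin flip within each pair. The function name, the twin labels, and the treatment labels "A" and "B" are assumptions made for illustration.

```python
import random

def matched_pairs(pairs, treatments=("A", "B"), seed=None):
    """Within each pair, randomly decide which member gets which treatment."""
    rng = random.Random(seed)
    assignment = {}
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)  # acts as a coin flip for this pair
        assignment[first] = order[0]
        assignment[second] = order[1]
    return assignment

# Hypothetical pairs of twins.
twin_pairs = [(f"twin_{i}a", f"twin_{i}b") for i in range(1, 5)]
pair_assignment = matched_pairs(twin_pairs, seed=3)
for a, b in twin_pairs:
    print(a, pair_assignment[a], "|", b, pair_assignment[b])
```

Both treatments appear exactly once in every pair, so differences between pairs cannot be confused with the treatment effect.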

Explain the Randomized Block Design

Experimental units are divided into homogeneous groups (blocks), and within each block, units are randomly assigned to treatments.
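
A randomized block design repeats the random assignment separately inside each block. In this sketch the function name, the two blocks, and the treatment names are hypothetical.

```python
import random

def randomized_block(blocks, treatments, seed=None):
    """Randomly assign treatments to units separately within each block."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, units in blocks.items():
        shuffled = list(units)
        rng.shuffle(shuffled)  # randomize only among units in this block
        for i, unit in enumerate(shuffled):
            assignment[unit] = (block_name, treatments[i % len(treatments)])
    return assignment

# Hypothetical blocks: subjects grouped by age before randomization.
age_blocks = {
    "under_40": [f"u40_{i}" for i in range(1, 7)],
    "over_40": [f"o40_{i}" for i in range(1, 7)],
}
block_assignment = randomized_block(age_blocks, ["new_drug", "placebo"], seed=11)
for unit, (block, treatment) in sorted(block_assignment.items()):
    print(unit, block, treatment)
```

Because every block contains each treatment equally often, differences between blocks (here, age) are not mixed up with the treatment effect.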

Summary Table: Sampling Methods

Sampling Method    Description
Simple Random      Every possible sample of size n has an equal chance of selection
Stratified         Population divided into strata, random sample from each stratum
Systematic         Every kth individual selected
Cluster            All individuals from randomly selected groups
Convenience        Sample easily obtained, not random

Additional info: These notes cover the foundational concepts of statistics, including definitions, types of variables, sampling methods, sources of bias, and experimental design. These are essential for understanding how to collect and analyze data in a statistically valid manner.
