Chapter 1: Data Collection – Structured Study Notes for Statistics

Chapter 1 – Data Collection

1.1 Introduction to the Practice of Statistics

Statistics is the science of collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions. It also involves providing a measure of confidence in any conclusions.

  • Definition of Statistics: The discipline concerned with data collection and analysis for decision-making.

  • Data: Facts or propositions used to draw conclusions or make decisions.

  • Population: The entire group to be studied.

  • Individual: A single member of the population.

  • Sample: A subset of the population selected for study.

  • Statistic: A numerical summary based on a sample.

  • Parameter: A numerical summary based on a population.

Example: Suppose the percentage of all students on campus who have a job is 84%. This is a parameter. If a sample of 250 students is taken and 86% have a job, this is a statistic.
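
The distinction between a parameter and a statistic can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the campus size of 10,000 and the fixed seed are assumptions chosen only so that the population proportion matches the 84% in the example. The sample proportion will generally land near 84% without equaling it.

```python
import random

# Hypothetical campus of 10,000 students; True means the student has a job.
# The population is built so that exactly 84% have a job, as in the example.
N = 10_000
population = [True] * 8_400 + [False] * 1_600

# Parameter: computed from every individual in the population.
parameter = sum(population) / N

# Statistic: computed from a sample of 250 students drawn without replacement.
rng = random.Random(1)  # fixed seed only so that the run is repeatable
sample = rng.sample(population, 250)
statistic = sum(sample) / len(sample)

print(f"parameter = {parameter:.2%}")
print(f"statistic = {statistic:.2%}")
```

Changing the seed changes the statistic but never the parameter, which is why a statistic is reported with a measure of confidence.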

1.2 The Process of Statistics

The process of statistics involves several key steps:

  1. Identify the research objective: Define the question to be answered and the population to be studied.

  2. Collect the data: Gather information relevant to the research objective.

  3. Describe the data: Organize and summarize the information.

  4. Draw conclusions: Make inferences and decisions based on the data.

Variables: Characteristics of individuals within the population.

  • Qualitative (Categorical) Variables: Classify individuals based on attributes or characteristics.

  • Quantitative Variables: Provide numerical measures; values can be added or subtracted for meaningful results.

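Example: Gender and zip code are qualitative; temperature and number of days studied are quantitative. A zip code is written with digits, but adding or subtracting zip codes gives no meaningful result, so it is qualitative.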

1.3 Types of Quantitative Variables

  • Discrete Variable: Has a finite or countable number of possible values (e.g., number of heads after flipping a coin five times).

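  • Continuous Variable: Has an infinite number of possible values that are not countable; typically the result of a measurement (e.g., the time at which an event occurs between 12:00 pm and 1:00 pm).

All observations of a variable are called data. Data can be qualitative or quantitative, and quantitative data can be discrete or continuous.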

1.4 Levels of Measurement

Variables can be assigned a level of measurement:

  • Nominal: Values are names, labels, or categories; no ranking.

  • Ordinal: Values can be ranked; order matters.

  • Interval: Differences between values are meaningful; no true zero.

  • Ratio: Differences and ratios are meaningful; true zero exists.

Example: Gender (nominal), letter grade (ordinal), temperature in degrees Fahrenheit or Celsius (interval), days studied (ratio).

1.5 Observational Studies vs. Designed Experiments

There are two main ways to collect data: observational studies and designed experiments.

  • Observational Study: Measures the value of the response variable without attempting to influence the value of either the response or the explanatory variables; the researcher only observes.

  • Designed Experiment: Researcher intentionally changes the explanatory variable and records the response.

Confounding: Occurs when the effects of two or more explanatory variables are not separated.

  • Lurking Variable: An explanatory variable that was not considered in the study but that affects the value of the response variable.

Types of Observational Studies:

  • Cross-sectional: Collects information at a specific point in time.

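  • Case-control: Retrospective; individuals look back in time or the researcher examines existing records. Individuals with a certain characteristic are matched with individuals who do not have it.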

  • Cohort: Prospective; follows a group over time.

Census: A list of all individuals in a population and their characteristics.

1.6 Sampling Methods

Sampling is the process of selecting individuals from the population to estimate characteristics of the whole group.

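  • Simple Random Sampling: Every possible sample of size n has an equal chance of being selected.

  • Systematic Sampling: Select every k-th individual from a list, starting from a randomly chosen individual among the first k.

  • Stratified Sampling: Divide the population into non-overlapping groups (strata) of similar individuals and take a simple random sample from each group.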

  • Cluster Sampling: Select groups (clusters) at random, then sample all individuals within those clusters.

  • Convenience Sampling: Individuals are easily obtained; may introduce bias.
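
The systematic, stratified, and cluster methods listed above can be sketched as follows. This is an illustrative sketch under stated assumptions: the frame of 100 numbered individuals, the stratum names, the dorm clusters, and the helper function names are all hypothetical, and the stratified helper assumes a simple random sample is drawn within each stratum.

```python
import random

rng = random.Random(0)  # fixed seed only so that the run is repeatable

def systematic_sample(frame, k):
    """Pick a random start among the first k individuals, then every k-th."""
    start = rng.randrange(k)
    return frame[start::k]

def stratified_sample(strata, n_per_stratum):
    """Draw a simple random sample of n_per_stratum from every stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters):
    """Pick whole clusters at random and keep every individual in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]

frame = list(range(1, 101))  # hypothetical frame: individuals numbered 1..100
strata = {"freshman": frame[:40], "senior": frame[40:]}
clusters = {f"dorm{i}": frame[i * 10:(i + 1) * 10] for i in range(10)}

sys_s = systematic_sample(frame, k=10)
strat_s = stratified_sample(strata, n_per_stratum=5)
clus_s = cluster_sample(clusters, n_clusters=2)
```

Note the contrast: stratified sampling takes some individuals from every group, while cluster sampling takes every individual from some groups.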

Example: To select a simple random sample of size n from a population of size N:

  • Obtain a frame listing all individuals.

  • Assign numbers 1 to N.

  • Use a random number generator to select n distinct numbers; the individuals assigned those numbers form the sample.
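
The three steps above can be sketched with Python's standard `random` module. This is a minimal sketch, not a prescribed procedure: the frame of eight names, the function name `simple_random_sample`, and the seed are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Return n individuals chosen without replacement from the frame."""
    N = len(frame)
    rng = random.Random(seed)
    # Number the individuals 1..N, then draw n distinct numbers.
    numbers = rng.sample(range(1, N + 1), n)
    # Map each selected number back to the individual it labels.
    return [frame[i - 1] for i in numbers]

# Hypothetical frame listing every individual in a small population.
frame = ["Ana", "Ben", "Cho", "Dev", "Eli", "Fay", "Gus", "Hal"]
sample = simple_random_sample(frame, n=3, seed=42)
print(sample)
```

Because `random.sample` draws without replacement, no individual can appear in the sample twice.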

1.7 Bias in Sampling

Bias occurs when the sample is not representative of the population.

  • Sampling Bias: Technique favors one part of the population.

  • Undercoverage: Some segments of the population are underrepresented.

  • Nonresponse Bias: Individuals selected for the sample do not respond, and their opinions differ from those of the individuals who do respond.

  • Response Bias: Answers do not reflect true feelings due to misinterpretation, interviewer error, or question wording.
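
A short calculation shows how nonresponse can distort an estimate. All numbers below are hypothetical and chosen only for illustration: two groups of different sizes, with different opinions and different response rates. The sketch assumes every individual is asked, so any gap between the two results comes from nonresponse alone.

```python
# Hypothetical population split into two groups with different opinions.
groups = {
    # name: (size, share who support a proposal, share who answer the survey)
    "A": (600, 0.80, 0.50),
    "B": (400, 0.20, 0.10),
}

# True population proportion (a parameter): every individual is counted.
total = sum(size for size, _, _ in groups.values())
supporters = sum(size * p for size, p, _ in groups.values())
true_share = supporters / total

# Proportion among respondents only: group B is largely missing.
respondents = sum(size * r for size, _, r in groups.values())
responding_supporters = sum(size * p * r for size, p, r in groups.values())
observed_share = responding_supporters / respondents

print(f"true share:     {true_share:.3f}")
print(f"observed share: {observed_share:.3f}")
```

Here the respondents overstate support because the group that mostly opposes the proposal rarely answers.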

1.8 The Design of Experiments

Experiments are designed to determine the effect of changing one or more explanatory variables.

  • Treatment: Any combination of the values of the explanatory variables (factors).

  • Experimental Unit: The person, object, or item to which the treatment is applied.

  • Control Group: Baseline group for comparison.

  • Placebo: An innocuous medication or treatment, such as a sugar tablet, that looks, tastes, and smells like the experimental treatment.

  • Blinding: Non-disclosure of treatment to experimental units.

  • Single-blind: Subject does not know which treatment is received.

  • Double-blind: Neither the subject nor the researcher in contact with the subject knows which treatment is received.

Example: A cholesterol-lowering drug study with a placebo-controlled, double-blind design.

1.9 Experimental Designs

  • Completely Randomized Design: Experimental units are randomly assigned to treatments.

  • Matched-Pairs Design: Units are paired based on similarity, and the two units in each pair receive different treatments, so there are only two treatments.

Example: An educational psychologist studies the effect of music on learning by matching students according to IQ and gender.
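
The two designs differ in where the randomization happens, which a short sketch can make concrete. The student labels, the treatment names "music" and "no music", and the helper functions are hypothetical; the pairs are assumed to have been matched on IQ and gender beforehand.

```python
import random

rng = random.Random(7)  # fixed seed only so that the run is repeatable

def completely_randomized(units, treatments):
    """Shuffle the units, then deal them out to the treatments in turn."""
    shuffled = units[:]
    rng.shuffle(shuffled)
    return {t: shuffled[i::len(treatments)] for i, t in enumerate(treatments)}

def matched_pairs(pairs, treatments=("music", "no music")):
    """Within each pair, randomly decide which unit gets which treatment."""
    assignment = {}
    for a, b in pairs:
        first, second = rng.sample(treatments, 2)
        assignment[a], assignment[b] = first, second
    return assignment

units = [f"student{i}" for i in range(1, 9)]  # hypothetical experimental units
crd = completely_randomized(units, ["music", "no music"])

# Hypothetical pairs, assumed already matched on IQ and gender.
pairs = [(units[0], units[1]), (units[2], units[3])]
mp = matched_pairs(pairs)
```

In the completely randomized design any unit can land in any group; in the matched-pairs design chance only decides the order of treatments within a pair.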

Summary Table: Types of Variables

Type                      | Description                             | Examples
Qualitative               | Describes attributes or categories      | Gender, zip code
Quantitative (Discrete)   | Countable values                        | Number of students, number of cars
Quantitative (Continuous) | Measured values, infinite possibilities | Height, temperature

Key Formulas

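  • Sample Mean: $\bar{x} = \frac{\sum x_i}{n}$, where $n$ is the sample size.

  • Population Mean: $\mu = \frac{\sum x_i}{N}$, where $N$ is the population size.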
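
As a worked illustration, assuming the usual definition of a mean as the sum of the observations divided by their count, the two means can be computed as below. The exam scores are hypothetical values invented for this sketch.

```python
# Hypothetical population of N = 6 exam scores.
population = [70, 82, 91, 65, 88, 84]
mu = sum(population) / len(population)  # population mean (a parameter)

# Hypothetical sample of n = 3 of those scores.
sample = [82, 65, 88]
x_bar = sum(sample) / len(sample)  # sample mean (a statistic)

print(f"population mean = {mu}")
print(f"sample mean     = {x_bar:.2f}")
```

The arithmetic is the same in both cases; only the set of observations differs, which is what separates a parameter from a statistic.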
