Introduction to Statistics: Data, Variables, and Sampling Methods

Study Guide - Smart Notes

What is Statistics?

Definition and Scope

Statistics is the science of learning from data: information collected from a sample is used to describe and draw conclusions about the larger group it came from. The work involves collecting, organizing, analyzing, and interpreting data.

  • Data: Collections of observations, such as measurements, genders, or survey responses.

  • Statistics (the field): The science of planning studies and experiments, obtaining data, and drawing conclusions based on the data.

Process Involved in a Statistical Study

Prepare, Analyze, Conclude

The statistical study process consists of three main steps: prepare, analyze, and conclude.

  • Prepare: Define the context, source, and sampling method. Ask: What is the goal? What data is needed?

  • Analyze: Organize and summarize the data, explore relationships, and apply statistical methods.

  • Conclude: Assess whether the results are statistically significant and whether they matter in practice.

Population vs. Sample

Definitions and Differences

Understanding the distinction between a population and a sample is fundamental in statistics.

  • Population: The entire group of interest.

  • Sample: A subset of the population selected for measurement or analysis.

  • Parameter: A numerical measure describing a characteristic of a population.

  • Statistic: A numerical measure describing a characteristic of a sample.

Example: The average height of all adult female Canadians is a population parameter; the average height from a sample of adult female Canadians is a sample statistic.

| Term | Definition |
| --- | --- |
| Population | All members of a group of interest |
| Sample | Subset of the population |
| Parameter | Numerical measure of a population |
| Statistic | Numerical measure of a sample |
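
The parameter/statistic distinction can be made concrete with a small simulation. The sketch below is illustrative only: the population is synthetic (10,000 made-up heights in cm, not real Canadian data), and the names `population`, `sample`, and the chosen mean and spread are assumptions introduced for the example.

```python
import random
import statistics

# Hypothetical population: 10,000 synthetic heights in cm (not real data).
rng = random.Random(42)
population = [rng.gauss(163.0, 7.0) for _ in range(10_000)]

# Parameter: computed from every member of the population.
population_mean = statistics.fmean(population)

# Statistic: computed from a sample drawn from that population.
sample = rng.sample(population, 100)
sample_mean = statistics.fmean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two printed values will usually be close but not identical. In practice the parameter is unknown, and the statistic is used to estimate it.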

Data Basics and Variables

Data Matrices and Observation Units

Data is often organized in a data matrix, where columns represent variables (characteristics measured) and rows represent observation units (individual cases or subjects).

  • Variables: Characteristics such as gender, pulse, body mass index, etc.

  • Observation Unit: Each row in the data matrix; an individual case.

  • Sample Statistics: Calculations (e.g., mean, median) performed on the sample data.
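
As a rough illustration, a data matrix can be represented as a list of rows, each holding one value per variable. The values below are made up, and the variable names (`gender`, `pulse`, `bmi`) simply echo the examples above.

```python
import statistics

# Hypothetical data matrix: each dict is one observation unit (row),
# each key is a variable (column). Values are made up for illustration.
data_matrix = [
    {"gender": "F", "pulse": 72, "bmi": 22.5},
    {"gender": "M", "pulse": 65, "bmi": 27.1},
    {"gender": "F", "pulse": 80, "bmi": 24.3},
    {"gender": "M", "pulse": 70, "bmi": 25.0},
]

# The variables are the column names shared by every row.
variables = list(data_matrix[0].keys())

# Pull out one column, then compute sample statistics on it.
pulse_column = [row["pulse"] for row in data_matrix]
mean_pulse = statistics.mean(pulse_column)      # 71.75
median_pulse = statistics.median(pulse_column)  # 71.0

print(variables, mean_pulse, median_pulse)
```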

Types of Variables

Quantitative vs. Qualitative Data

Variables can be classified as quantitative (numerical) or qualitative (categorical).

  • Quantitative Data: Numbers representing counts or measurements (e.g., height, weight).

  • Categorical Data: Names or labels representing categories (e.g., gender, ethnicity).

Subtypes of Data

  • Discrete Data: Quantitative data whose possible values are finite or countable, usually counts (e.g., number of students).

  • Continuous Data: Quantitative data that can take any value within a range, with no gaps between possible values (e.g., height, weight).

  • Nominal Data: Categorical data without a natural order (e.g., blood type).

  • Ordinal Data: Categorical data with a natural order (e.g., education level).

| Type | Subtype | Example |
| --- | --- | --- |
| Quantitative | Discrete | Number of books |
| Quantitative | Continuous | Height |
| Categorical | Nominal | Blood type |
| Categorical | Ordinal | Education level |
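
One practical consequence of the nominal/ordinal distinction is which operations make sense. The sketch below assumes a hypothetical ordering of education levels (`EDUCATION_ORDER`) and made-up responses: nominal values can only be counted, while ordinal values can also be sorted.

```python
from collections import Counter

# Nominal: categories have no natural order, so counting is the
# main meaningful summary. Values are made up for illustration.
blood_types = ["A", "O", "B", "O", "AB", "O", "A"]
blood_counts = Counter(blood_types)
most_common_type = blood_counts.most_common(1)[0]  # ("O", 3)

# Ordinal: categories have a natural order, which must be stated
# explicitly. This particular ordering is an assumption of the example.
EDUCATION_ORDER = ["high school", "bachelor's", "master's", "doctorate"]

def education_rank(level):
    """Return the position of a level in the assumed ordering."""
    return EDUCATION_ORDER.index(level)

responses = ["master's", "high school", "doctorate", "bachelor's"]
sorted_responses = sorted(responses, key=education_rank)

print(most_common_type)
print(sorted_responses)
```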

Sampling Methods

Types of Sampling

Sampling methods determine how samples are selected from populations.

  • Simple Random Sampling (SRS): Every member has an equal chance of being selected, and every possible sample of the same size is equally likely.

  • Stratified Sampling: Population divided into strata; samples drawn from each stratum.

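  • Cluster Sampling: Population divided into clusters; clusters are selected at random and every member of each selected cluster is included.

  • Multistage Sampling: Sampling carried out in stages, often combining methods (for example, randomly selecting clusters and then drawing a random sample within each selected cluster).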

  • Convenience Sampling: Samples are chosen based on ease of access.

| Sampling Method | Description |
| --- | --- |
| Simple Random | Equal chance for all cases |
| Stratified | Sample from each subgroup |
| Cluster | Sample entire clusters |
| Multistage | Sample in multiple stages |
| Convenience | Easy to get samples |
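
The probability-based methods above can be sketched with Python's standard `random` module. This is a minimal illustration rather than a production sampler: the sampling frame of 12 students, the `year` strata, the `room` clusters, and all function names are hypothetical.

```python
import random
from collections import defaultdict

rng = random.Random(0)

# Hypothetical sampling frame: 12 students tagged with a year (stratum)
# and a classroom (cluster). All values are made up.
frame = [
    {"id": i, "year": 1 + i % 3, "room": "ABCD"[i % 4]}
    for i in range(12)
]

def simple_random_sample(units, n):
    """Draw n units without replacement; every size-n subset is equally likely."""
    return rng.sample(units, n)

def stratified_sample(units, key, n_per_stratum):
    """Group units into strata by `key`, then take an SRS within each stratum."""
    strata = defaultdict(list)
    for unit in units:
        strata[unit[key]].append(unit)
    chosen = []
    for members in strata.values():
        chosen.extend(rng.sample(members, n_per_stratum))
    return chosen

def cluster_sample(units, key, n_clusters):
    """Randomly pick whole clusters and keep every unit inside them."""
    clusters = sorted({unit[key] for unit in units})
    picked = set(rng.sample(clusters, n_clusters))
    return [unit for unit in units if unit[key] in picked]

def multistage_sample(units, key, n_clusters, n_per_cluster):
    """Stage 1: pick clusters at random. Stage 2: SRS within each picked cluster."""
    first_stage = cluster_sample(units, key, n_clusters)
    return stratified_sample(first_stage, key, n_per_cluster)

srs = simple_random_sample(frame, 4)
strat = stratified_sample(frame, "year", 2)
clus = cluster_sample(frame, "room", 2)
multi = multistage_sample(frame, "room", 2, 2)
```

Here the stratified sample contains two students from every year, the cluster sample contains all students from two rooms, and the multistage sample contains two students from each of two rooms. Convenience sampling is omitted because it involves no random selection.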

Other Sampling Terms

  • Available Data: Data collected in the past for other purposes.

  • Sample Survey: Data collected from a sample.

  • Census: Data collected from all cases in a population.

Observational Studies and Experiments

Observational Studies

In an observational study, researchers observe and measure characteristics without influencing the subjects.

  • Example: Surveys are common observational studies.

Experiments

In an experiment, a treatment is applied to subjects, and the effect is measured.

  • Treatment Group: Receives the treatment.

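  • Control Group: Does not receive the treatment (it may receive a placebo instead) and serves as the baseline for comparison.

  • Factor (Explanatory Variable): The variable the researchers control, such as which treatment a subject receives, and whose effect is studied.

  • Response Variable: The outcome measured on each subject and compared across groups.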

Example: Vaccine trials where one group receives the vaccine and another receives a placebo.
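
A common way to form the two groups is random assignment. The notes above do not say how groups are chosen, so treat this as an assumption. The sketch uses hypothetical subject IDs and an even split; no real trial data is involved.

```python
import random

rng = random.Random(1)

# Hypothetical subject IDs; no real trial data is used here.
subjects = [f"S{i:02d}" for i in range(1, 21)]

# Shuffle a copy, then split it in half so chance alone decides
# who receives the treatment.
shuffled = subjects[:]
rng.shuffle(shuffled)
treatment_group = shuffled[:10]
control_group = shuffled[10:]

# The factor (explanatory variable) is the group each subject is in.
assignment = {s: "treatment" for s in treatment_group}
assignment.update({s: "control" for s in control_group})

print(assignment["S01"])
```

The response variable would then be recorded for every subject and compared between the two groups.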

Blind Tests

  • Single-blind: Subjects do not know which group they are in.

  • Double-blind: Neither the subjects nor the researchers who interact with them or evaluate the outcomes know the group assignments.

Key Formulas

Population and Sample Statistics

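  • Population Mean: $\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$, where $N$ is the number of cases in the population.

  • Sample Mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$, where $n$ is the number of cases in the sample.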
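
The population mean and the sample mean use the same arithmetic: add up the values and divide by how many there are. They differ only in whether the values cover the whole population or just a sample. A short worked example with made-up values:

```python
import statistics

# Hypothetical values; treat them as a sample of size n = 6.
values = [4, 8, 15, 16, 23, 42]

total = sum(values)  # 4 + 8 + 15 + 16 + 23 + 42 = 108
n = len(values)      # 6
mean = total / n     # 108 / 6 = 18.0

# The standard library gives the same result.
assert mean == statistics.mean(values)
print(mean)
```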

Summary Table: Types of Data

| Type | Subtype | Example |
| --- | --- | --- |
| Quantitative | Discrete | Number of students |
| Quantitative | Continuous | Height, weight |
| Categorical | Nominal | Blood type |
| Categorical | Ordinal | Education level |

Summary Table: Sampling Methods

| Method | Description |
| --- | --- |
| Simple Random | Equal chance for all cases |
| Stratified | Sample from each subgroup |
| Cluster | Sample entire clusters |
| Multistage | Sample in multiple stages |
| Convenience | Easy to get samples |

Additional info: These notes provide foundational concepts for introductory statistics, including definitions, examples, and key distinctions necessary for further study in the field.
