Foundations of Statistics: Variables, Sampling, and Experimental Design
Study Guide - Smart Notes
Variables and Data Types
Parameters and Statistics
In statistics, it is essential to distinguish between values that describe populations and those that describe samples.
Parameter: A numerical value that summarizes a characteristic of an entire population (e.g., the true mean height of all students in a school).
Statistic: A numerical value that summarizes a characteristic of a sample drawn from the population (e.g., the mean height of a sample of students).
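Example: If 51.6% of all residents in a city are female, this value is a parameter because it describes the entire population. The same percentage computed from a sample of residents would be a statistic.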
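A minimal sketch of the distinction, assuming a made-up population of 1,000 student heights (the data, the seed, and the sample size of 50 are illustrative assumptions, not taken from the notes). The parameter is computed from every member; the statistic comes from a sample and will usually differ slightly.

```python
import random
import statistics

# Hypothetical population: heights (cm) of 1,000 students, generated for illustration only.
rng = random.Random(0)
population = [rng.gauss(165, 8) for _ in range(1000)]

# Parameter: summarizes the entire population.
population_mean = statistics.mean(population)

# Statistic: summarizes a sample drawn from that population.
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

Drawing a different sample would give a different statistic, while the parameter stays fixed.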
Types of Variables
Variables are characteristics or properties that can take on different values. They are classified as follows:
Qualitative (Categorical) Variables: Variables that describe qualities or categories (e.g., gender, color).
Quantitative Variables: Variables that are measured numerically (e.g., age, weight).
Discrete Variable: A quantitative variable that takes on a countable number of values (e.g., number of items purchased).
Continuous Variable: A quantitative variable that can take on any value within a range (e.g., total time spent shopping).
Example: The number of items bought is discrete; the total time spent is continuous.
Levels of Measurement
Variables can be measured at different levels:
Nominal: Categories with no inherent order (e.g., types of fruit).
Ordinal: Categories with a meaningful order but not equal intervals (e.g., rankings).
Interval: Ordered numerical values with equal intervals but no true zero, so differences are meaningful but ratios are not (e.g., temperature in Celsius).
Ratio: Like interval, but with a true zero point, so ratios are also meaningful (e.g., weight, height).
Example: The weight capacity of a backpack is measured at the ratio level.
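One way to see the interval versus ratio distinction is to compare Celsius with Kelvin, which has a true zero. The sketch below uses two arbitrary temperatures chosen for illustration: the difference between them is the same on both scales, but the ratio changes, so "twice as hot" is not a meaningful statement in Celsius.

```python
import math

def celsius_to_kelvin(c):
    """Convert Celsius (interval scale) to Kelvin (ratio scale)."""
    return c + 273.15

# Two illustrative temperatures in Celsius.
a_c, b_c = 10.0, 20.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences agree across the two scales.
diff_c = b_c - a_c
diff_k = b_k - a_k
print(math.isclose(diff_c, diff_k))

# Ratios do not agree: the Celsius ratio depends on an arbitrary zero point.
ratio_c = b_c / a_c
ratio_k = b_k / a_k
print(ratio_c, round(ratio_k, 3))
```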
Sampling and Data Collection
Populations and Samples
Understanding the difference between populations and samples is fundamental in statistics.
Population: The entire group of individuals or items of interest (e.g., all American households).
Sample: A subset of the population selected for study (e.g., 1,242 American households surveyed).
Individuals: The objects or people described by the data (e.g., each household in the sample).
Sampling Methods
There are several methods for selecting samples from a population:
Census: Data collected from every member of the population.
Random Sampling: Every member of the population has an equal chance of being selected.
Simple Random Sample: A sample chosen in such a way that every possible sample of the same size has an equal chance of being selected.
Stratified Sample: The population is divided into subgroups (strata), and random samples are taken from each stratum.
Systematic Sample: Every nth member of the population is selected, usually after a randomly chosen starting point.
Cluster Sample: The population is divided into clusters, some clusters are randomly selected, and all members of chosen clusters are surveyed.
Convenience Sample: Samples are chosen based on ease of access.
Example: Interviewing all passengers on five randomly selected cruises is a cluster sample.
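The four probability-based methods above can be sketched with Python's standard `random` module. The population of 100 students, the `grade` and `room` fields, and the helper function names are all hypothetical, invented only to make the mechanics concrete.

```python
import random

rng = random.Random(42)

# Hypothetical population: 100 students, each with a grade (9-12) and a homeroom (0-9).
population = [{"id": i, "grade": 9 + i % 4, "room": i // 10} for i in range(100)]

def simple_random_sample(units, n):
    # Every possible sample of size n is equally likely.
    return rng.sample(units, n)

def stratified_sample(units, key, n_per_stratum):
    # Split into strata, then draw a random sample from each stratum.
    strata = {}
    for u in units:
        strata.setdefault(u[key], []).append(u)
    picked = []
    for members in strata.values():
        picked.extend(rng.sample(members, n_per_stratum))
    return picked

def systematic_sample(units, step):
    # Random starting point, then every step-th unit.
    start = rng.randrange(step)
    return units[start::step]

def cluster_sample(units, key, n_clusters):
    # Randomly choose whole clusters, then keep every member of the chosen ones.
    clusters = sorted({u[key] for u in units})
    chosen = set(rng.sample(clusters, n_clusters))
    return [u for u in units if u[key] in chosen]

srs = simple_random_sample(population, 10)
strat = stratified_sample(population, "grade", 3)
syst = systematic_sample(population, 10)
clus = cluster_sample(population, "room", 2)
print(len(srs), len(strat), len(syst), len(clus))
```

Note the structural difference: stratified sampling takes some members from every group, while cluster sampling takes every member from some groups.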
Sources of Bias in Sampling
Bias can occur when certain members of the population are more likely to be included in the sample than others.
Sampling Bias: Occurs when the sample is not representative of the population, often due to the sampling method.
Example: Surveying the first 300 people listed in a town's telephone directory may introduce sampling bias if not all residents are equally likely to be listed.
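A small simulation can show how this kind of bias distorts an estimate. Everything here is an assumption made for illustration: the town of 5,000 residents, the uniform ages, and especially the rule that older residents are more likely to be listed. It is a sketch of the mechanism, not a model of any real directory.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical town: 5,000 residents with ages between 18 and 90.
residents = [rng.randint(18, 90) for _ in range(5000)]

# Assumption for illustration: the chance of being listed grows with age.
listed = [age for age in residents if rng.random() < age / 100]

true_mean = statistics.mean(residents)

# Biased approach: take the first 300 listed residents.
directory_sample = listed[:300]
biased_mean = statistics.mean(directory_sample)

# Comparison: a simple random sample of 300 from all residents.
random_sample_mean = statistics.mean(rng.sample(residents, 300))

print(f"true mean age:         {true_mean:.1f}")
print(f"directory sample mean: {biased_mean:.1f}")
print(f"random sample mean:    {random_sample_mean:.1f}")
```

Under this assumption the directory sample overestimates the mean age, and taking a larger directory sample would not fix it, because the bias comes from who can be selected rather than from how many are selected.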
Observational Studies and Experiments
Observational Studies
In observational studies, researchers observe subjects without manipulating variables.
Definition: A study in which the researcher does not assign treatments but observes existing conditions or behaviors.
Example: A poll on musicians' ages is an observational study.
Experiments
Experiments involve the deliberate application of treatments to study their effects.
Definition: A controlled study in which treatments are applied to experimental units, and the effect of varying these treatments on a response variable is observed.
Treatments: The conditions applied to experimental units.
Factors: The variables that are manipulated in an experiment.
Blinding: Concealing the treatment assignment from subjects (single-blind) or both subjects and experimenters (double-blind) to reduce bias.
Single-Blind Experiment: The subject does not know which treatment they are receiving.
Double-Blind Experiment: Neither the subject nor the experimenter knows the treatment assignment.
Example: If the subject does not know which treatment they are receiving, it is a single-blind experiment.
Experimental Design
Proper experimental design helps produce valid and reliable results by controlling sources of bias and variability.
Completely Randomized Design: Subjects are randomly assigned to treatments.
Matched-Pairs Design: Subjects are paired based on similarities, and the two treatments are randomly assigned within each pair so that each member receives a different treatment.
Randomized Block Design: Subjects are divided into blocks based on a variable, and treatments are randomly assigned within each block.
Blocking: Grouping experimental units with similar characteristics to control for those variables.
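The random assignment step in these designs can be sketched as follows. The subject labels, the age-based blocks, and the function names are hypothetical, and the balanced allocation (equal group sizes) is a design choice made for this sketch rather than a requirement stated in the notes.

```python
import random

rng = random.Random(7)

def completely_randomized(subjects, treatments):
    # Shuffle all subjects, then deal treatments out in turn for balanced groups.
    shuffled = subjects[:]
    rng.shuffle(shuffled)
    return {s: treatments[i % len(treatments)] for i, s in enumerate(shuffled)}

def randomized_block(subjects, block_of, treatments):
    # Group subjects by block, then randomize separately inside each block.
    blocks = {}
    for s in subjects:
        blocks.setdefault(block_of[s], []).append(s)
    assignment = {}
    for members in blocks.values():
        assignment.update(completely_randomized(members, treatments))
    return assignment

# Hypothetical study: 12 subjects, blocked by age group, two treatments.
subjects = [f"S{i}" for i in range(12)]
block_of = {s: ("under 40" if i < 6 else "40 and over") for i, s in enumerate(subjects)}
treatments = ["drug", "placebo"]

crd = completely_randomized(subjects, treatments)
rbd = randomized_block(subjects, block_of, treatments)
print(crd)
print(rbd)
```

A matched-pairs design can be viewed as the special case in which each block holds exactly two similar subjects.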
Key Statistical Vocabulary and Skills
To excel in statistics, students should be able to:
Identify and understand scenarios using statistical vocabulary.
Provide or justify responses using statistical vocabulary.
Judge statements based on statistical vocabulary.
Summary Table: Sampling Methods
| Sampling Method | Description | Example |
|---|---|---|
| Simple Random Sample | Every possible sample of the same size has an equal chance of selection | Drawing names from a hat |
| Stratified Sample | Population divided into strata, random samples taken from each | Sampling students from each grade level |
| Systematic Sample | Selecting every nth member | Surveying every 10th person on a list |
| Cluster Sample | Randomly select clusters, survey all members within them | Interviewing all passengers on selected cruises |
| Convenience Sample | Sample chosen for ease of access | Surveying people at a mall entrance |
Key Formulas
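Sample Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where $n$ is the sample size.
Population Mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$, where $N$ is the population size.
Sample Proportion: $\hat{p} = \frac{x}{n}$, where $x$ is the number of sample members with the characteristic of interest.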
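As a rough worked example of these quantities, the sketch below computes a sample mean and a sample proportion by hand. The five data values and the yes/no responses are made up for illustration. The population mean uses the same arithmetic as the sample mean, applied to all N members of the population instead of the n sampled ones.

```python
# Hypothetical sample of five measurements.
sample = [4, 7, 5, 8, 6]
n = len(sample)

# Sample mean: sum of the observations divided by the sample size.
x_bar = sum(sample) / n
print(x_bar)  # 6.0

# Hypothetical yes/no responses from five sampled individuals.
responses = ["yes", "no", "yes", "yes", "no"]

# Sample proportion: count with the characteristic divided by the sample size.
x = responses.count("yes")
p_hat = x / len(responses)
print(p_hat)  # 0.6
```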