Introduction to Statistics: Populations, Samples, and Variables
Study Guide - Smart Notes
Introduction to Statistics
What is Statistics?
Statistics is the science of collecting, analyzing, interpreting, and presenting data. It provides a collection of tools to answer questions about data, especially for making inferences about a larger group based on a smaller, representative sample.
Definition: Statistics is the art or science of collecting and analyzing data, especially for the purpose of inferring properties about a population from a representative sample.
Purpose: To make conclusions about a large group (population) by studying a smaller group (sample).
Data: Information gathered from experiments and surveys.
Populations and Samples
Definitions
Population: The set of all subjects or items of interest in a study. Keywords include "all" or "every" (e.g., "all SC residents").
Sample: A subset of the population for which data is actually collected, often selected randomly. Keyphrases include "a sample of..." or "a survey of..." (e.g., "a sample of 1,000 SC residents").
Parameter: A numerical summary describing a characteristic of a population (e.g., the mean age of all residents).
Statistic: A numerical summary describing a characteristic of a sample (e.g., the mean age of sampled residents).
Example
If a company's database holds a record for every employee, you are working with a population. The median income computed from it is a parameter.
If the database holds records for only a subset of employees, you are working with a sample. The median income computed from it is a statistic, and you use inference to estimate the parameter for all employees.
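The distinction above can be sketched in a few lines of Python. The income figures, the sample size of 4, and the fixed random seed are illustrative assumptions, not data from these notes.

```python
import random
import statistics

# Hypothetical incomes for every employee at a small company (the population).
all_incomes = [38_000, 42_000, 45_000, 51_000, 58_000, 62_000, 75_000, 120_000]

# Median over the full database: a parameter.
population_median = statistics.median(all_incomes)

# Median over a random subset of records: a statistic.
rng = random.Random(0)  # fixed seed so the run is repeatable
sample_incomes = rng.sample(all_incomes, k=4)
sample_median = statistics.median(sample_incomes)

print("parameter (population median):", population_median)
print("statistic (sample median):", sample_median)
```

The sample median will generally differ from the population median, which is why inference is needed to say anything about all employees.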
Identifying Population, Sample, Statistic, and Parameter
Given a scenario, you should be able to identify these four elements. For example, for a survey of state residents about a proposed change:
Population: All state residents
Sample: A random sample of 1,000 state residents
Statistic: 20% of those sampled are in favor of the change
Parameter: 15% of all residents are in favor of the change
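As a rough illustration of how the statistic in this scenario arises, the sketch below computes the sample proportion from counts. The count of 200 is simply 20% of the 1,000 sampled residents; the variable names are hypothetical.

```python
# 200 of the 1,000 sampled residents favor the change (20% of the sample).
sample_size = 1_000
in_favor = 200

# The sample proportion is a statistic: it uses only the sampled residents.
sample_proportion = in_favor / sample_size

# The parameter is the proportion among ALL residents (15% in this scenario).
# In practice it is usually unknown; it is written here only for comparison.
population_proportion = 0.15

difference = sample_proportion - population_proportion
print(f"statistic: {sample_proportion:.0%}")
print(f"parameter: {population_proportion:.0%}")
print(f"difference: {difference:+.1%}")
```

The gap between the two values shows that a statistic estimates the parameter but need not equal it.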
Descriptive vs. Inferential Statistics
Descriptive Statistics
Descriptive statistics provide useful summaries and help you understand the data you collected. They include methods for summarizing data from a sample or population.
Examples: Mean, median, standard deviation, frequency tables
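A minimal sketch of these summaries using Python's standard library. The absence counts are made-up values for illustration.

```python
import statistics
from collections import Counter

# Hypothetical number of absences for 10 students.
absences = [0, 1, 1, 2, 2, 2, 3, 4, 4, 6]

mean_absences = statistics.mean(absences)
median_absences = statistics.median(absences)
# statistics.stdev is the sample standard deviation (divides by n - 1).
stdev_absences = statistics.stdev(absences)

# Frequency table: how many students had each number of absences.
frequency_table = Counter(absences)

print("mean:", mean_absences)
print("median:", median_absences)
print("standard deviation:", round(stdev_absences, 2))
for value, count in sorted(frequency_table.items()):
    print(value, "absences:", count, "student(s)")
```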
Inferential Statistics
Inferential statistics help you make predictions or decisions about a population based on data from a sample. This includes hypothesis testing and estimation.
Examples: Confidence intervals, hypothesis tests
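One common form of estimation is a confidence interval for a population proportion. The sketch below assumes the normal-approximation (Wald) interval and a 95% level (z = 1.96); these notes do not specify a method, so treat both as assumptions. The function name and the counts are hypothetical.

```python
import math

def proportion_confidence_interval(successes, n, z=1.96):
    """Normal-approximation (Wald) interval for a population proportion.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    p_hat = successes / n  # sample proportion (a statistic)
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin

# Hypothetical survey: 200 of 1,000 sampled residents answered "yes".
low, high = proportion_confidence_interval(successes=200, n=1_000)
print(f"95% CI for the population proportion: {low:.3f} to {high:.3f}")
```

The interval is a statement about the unknown parameter, built entirely from sample data, which is what separates inferential from descriptive statistics.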
Sampling
Why Sample?
It is often impractical or impossible to collect data from every item in a population due to constraints such as time, cost, and feasibility. Sampling allows us to study a manageable subset and make inferences about the whole population.
Time: studying every item takes longer.
Cost: collecting more data costs more money.
Feasibility: data collection becomes harder to manage.
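Drawing a simple random sample can be sketched as follows. The population of 50,000 resident ID numbers, the sample size, and the seed are assumptions chosen for illustration.

```python
import random

# Hypothetical population: ID numbers for 50,000 residents.
population_ids = range(1, 50_001)

# Draw a simple random sample of 1,000 IDs without replacement.
rng = random.Random(42)  # fixed seed so the run is repeatable
sample_ids = rng.sample(population_ids, k=1_000)

print("population size:", len(population_ids))
print("sample size:", len(sample_ids))
print("fraction of population contacted:", len(sample_ids) / len(population_ids))
```

Here only 2% of the population has to be contacted, which is the practical payoff of sampling.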
Variables in Statistics
Definition of Variables
A variable is a characteristic or property that can take on different values for different individuals or items in a study.
Examples: Number of absences, commute time, age
Types of Variables
Quantitative Variable: Represents a counted or measured quantity and is expressed as numbers. It can be further classified as:
Discrete Quantitative Variable: Represents counts (e.g., number of absences).
Continuous Quantitative Variable: Represents measurements that can take any value within a range (e.g., commute time in hours).
Categorical Variable (Qualitative): Consists of values that represent categories or groups (e.g., gender, type of car).
Identifying Variable Types
Ask: Are the results numbers?
If yes, are they counts (discrete) or measurements (continuous)?
If not, the variable is categorical.
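The two questions above can be turned into a rough decision rule. This sketch assumes whole numbers are counts and non-integer numbers are measurements, which is a simplification; `classify_variable` is a hypothetical helper, not a standard function.

```python
def classify_variable(values):
    """Rough classification following the two questions above.

    Assumption for this sketch: whole numbers are counts and
    non-integer numbers are measurements. Real data needs judgment
    (e.g., ZIP codes are numbers but categorical).
    """
    # bool is excluded because Python treats True/False as integers.
    are_numbers = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    if not are_numbers:
        return "categorical"
    if all(isinstance(v, int) for v in values):
        return "discrete quantitative"
    return "continuous quantitative"

print(classify_variable([0, 2, 5, 1]))        # number of absences
print(classify_variable([0.5, 1.25, 0.75]))   # commute time in hours
print(classify_variable(["sedan", "truck"]))  # type of car
```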
Examples of Variable Types
| Variable | Type |
|---|---|
| Number of Absences | Discrete Quantitative |
| Commute Time (in hours) | Continuous Quantitative |
| Age (in years) | Discrete Quantitative (if treated as integers) |
| Gender | Categorical |
Summary Table: Key Terms
| Term | Definition | Example |
|---|---|---|
| Population | All subjects of interest | All SC residents |
| Sample | Subset of the population | 1,000 SC residents surveyed |
| Parameter | Numerical summary of a population | Mean age of all SC residents |
| Statistic | Numerical summary of a sample | Mean age of sampled residents |
Key Formulas
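Sample Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where $n$ is the number of observations in the sample.
Population Mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$, where $N$ is the number of items in the population.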
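Both means are the sum of the values divided by how many values there are; they differ only in which group is averaged. A minimal numeric check, using made-up ages for a tiny population of six residents and an assumed sample of three of them:

```python
# Hypothetical ages for a tiny population of 6 residents.
population_ages = [22, 35, 41, 47, 58, 67]

# Suppose only these three residents were sampled.
sample_ages = [22, 41, 58]

# Population mean: sum over all N members divided by N.
population_mean = sum(population_ages) / len(population_ages)

# Sample mean: sum over the n sampled members divided by n.
sample_mean = sum(sample_ages) / len(sample_ages)

print("population mean:", population_mean)
print("sample mean:", round(sample_mean, 2))
```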