Comprehensive Study Notes for Introductory Statistics

Introduction to Statistics and Collecting Data

What Is Statistics?

Statistics is the science of collecting, organizing, analyzing, and interpreting data to make decisions. It provides tools for understanding and drawing conclusions from data.

  • Data Set: A collection of all outcomes, responses, measurements, or counts that are of interest.

  • Population: The entire group of individuals or items under study.

  • Sample: A subset of the population, selected for analysis.

Parameter and Statistic

  • Parameter: A numerical description of a population characteristic. Example: Average age of all people in the United States.

  • Statistic: A numerical description of a sample characteristic. Example: Average age of people from a sample of three states.
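The difference between a parameter and a statistic can be seen in a small simulation. This is only a sketch: the population of ages below is made up, and the names `population`, `sample`, `mu`, and `x_bar` are illustrative choices, not part of the notes.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: ages of 10,000 people (made-up data).
population = [random.randint(0, 90) for _ in range(10_000)]

# Parameter: computed from every member of the population.
mu = statistics.mean(population)

# Statistic: computed from a sample drawn from that population.
sample = random.sample(population, 100)
x_bar = statistics.mean(sample)

print(f"parameter (population mean): {mu:.2f}")
print(f"statistic (sample mean):     {x_bar:.2f}")
```

The two numbers are usually close but not equal. A different sample gives a different statistic, while the parameter stays fixed.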

Branches of Statistics

  • Descriptive Statistics: Involves the organization, summarization, and display of data. Examples: Tables, charts, averages.

  • Inferential Statistics: Involves using sample data to draw conclusions about a population.

Types of Data and Variables

Types of Data

  • Qualitative Data: Consists of attributes, labels, or nonnumerical entries. Example: Colors, names, eye color.

  • Quantitative Data: Numerical measurements or counts. Example: Age, temperature, height.

Levels of Measurement

  • Nominal: Qualitative data only, categorized using names, labels, or qualities. No mathematical computations can be made.

  • Ordinal: Qualitative or quantitative data, can be ordered or ranked, but differences are not meaningful.

  • Interval: Quantitative data, can be ordered, and meaningful differences can be calculated. Zero is not an inherent zero (does not mean "none"). Example: Temperature in degrees Celsius.

  • Ratio: Similar to interval, but zero is inherent (means "none"), so ratios of data values can be formed. Example: Height, age.

Designing a Statistical Study

Steps in Designing a Study

  1. Identify the variables of interest and the population.

  2. Develop a detailed plan for data collection.

  3. Collect the data.

  4. Describe the data using descriptive statistics.

  5. Interpret the data using inferential statistics.

  6. Identify any possible errors.

Data Collection Methods

  • Observational Study: Researcher observes and measures characteristics without influencing the population.

  • Experiment: Researcher applies a treatment and observes responses.

  • Simulation: Uses a model to reproduce conditions of a situation or process.

  • Survey: Collects data from people by asking questions.

Sampling Methods

Types of Sampling

  • Census: Data collected from every member of the population.

  • Sampling: Data collected from a subset of the population.

  • Random Sample: Every member has an equal chance of being selected.

  • Stratified Sample: Population divided into groups (strata), and a random sample is taken from each group.

  • Cluster Sample: Population divided into clusters, some clusters are randomly selected, and all members of selected clusters are surveyed.

  • Systematic Sample: Starting from a randomly chosen member, every nth member of the population is selected.

  • Convenience Sample: Only members that are easy to reach are selected.
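The sampling methods above can be sketched in a few lines of Python. This is an illustration under simple assumptions: the population is a hypothetical list of 20 numbered members, and the mapping `group_of` (used both as strata and as clusters) is invented for the example.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: members 1..20, five members in each of groups 0..3.
population = list(range(1, 21))
group_of = {member: (member - 1) // 5 for member in population}

def simple_random_sample(pop, k):
    """Every member has an equal chance of being selected."""
    return random.sample(pop, k)

def systematic_sample(pop, step):
    """Pick a random starting point, then take every step-th member."""
    start = random.randrange(step)
    return pop[start::step]

def stratified_sample(pop, groups, k_per_group):
    """Take a random sample of k_per_group members from every stratum."""
    strata = {}
    for member in pop:
        strata.setdefault(groups[member], []).append(member)
    return [m for members in strata.values()
            for m in random.sample(members, k_per_group)]

def cluster_sample(pop, groups, n_clusters):
    """Randomly select whole clusters and keep all of their members."""
    clusters = sorted(set(groups.values()))
    chosen = set(random.sample(clusters, n_clusters))
    return [m for m in pop if groups[m] in chosen]

print(simple_random_sample(population, 5))
print(systematic_sample(population, 4))
print(stratified_sample(population, group_of, 2))
print(cluster_sample(population, group_of, 2))
```

Note the contrast between the last two functions: a stratified sample takes some members from every group, while a cluster sample takes every member from some groups.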

Organizing and Summarizing Data

Describing Distributions with Graphs

  • Histograms: Visualize the distribution of quantitative data.

  • Skewed Right: Tail on the right side is longer; the mean is typically greater than the median.

  • Skewed Left: Tail on the left side is longer; the mean is typically less than the median.

  • Symmetric: Both sides are approximately mirror images.
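A quick check of how skew pulls the mean away from the median, using two small made-up data sets. This illustrates the usual pattern and is not a proof that it always holds.

```python
import statistics

# Made-up data: one large value stretches the right tail.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
# Made-up data: one small value stretches the left tail.
left_skewed = [1, 17, 18, 18, 18, 19, 19, 20]

for name, data in [("right", right_skewed), ("left", left_skewed)]:
    print(name, statistics.mean(data), statistics.median(data))
```

In the first data set the mean (4.75) sits above the median (3). In the second the mean (16.25) sits below the median (18).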

Numerically Summarizing Data

Measures of Center and Spread

  • Mean: The average of all values.

  • Median: The middle value when data are ordered.

  • Mode: The value that occurs most frequently.

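  • Standard Deviation (s for a sample, σ for a population): Measures the typical distance of data values from the mean; it is the square root of the variance.

  • Interquartile Range (IQR): The difference between the third quartile (Q3) and the first quartile (Q1), $IQR = Q_3 - Q_1$; it covers the middle 50% of the data.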
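These measures can be computed with Python's standard `statistics` module. The data set below is made up. Quartile conventions differ between textbooks and software, so the `method="inclusive"` choice here is an assumption and may not match a given course's rule exactly.

```python
import statistics

data = [2, 3, 3, 5, 7, 8, 9, 11]  # made-up sample

mean = statistics.mean(data)        # sum of values divided by the count
median = statistics.median(data)    # average of the two middle values here
mode = statistics.mode(data)        # most frequent value
s = statistics.stdev(data)          # sample standard deviation (divides by n - 1)
sigma = statistics.pstdev(data)     # population standard deviation (divides by n)

# Quartiles; the interpolation method is a convention, not the only option.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print(mean, median, mode)
print(round(s, 4), round(sigma, 4))
print(q1, q3, iqr)
```

The sample standard deviation `s` is always at least as large as the population version `sigma` for the same data, because it divides by n - 1 instead of n.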

Probability and Discrete Probability Distributions

Basic Probability Concepts

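  • Probability of an Event (equally likely outcomes): $P(E) = \frac{\text{number of outcomes in } E}{\text{total number of outcomes in the sample space}}$

  • Mean of a Discrete Random Variable: $\mu = \sum x \, P(x)$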
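As a worked sketch, assume the classical definition of probability (favourable outcomes divided by total outcomes, all equally likely) and the usual mean of a discrete random variable (the sum of each value times its probability). The two-dice experiment is chosen only for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
sample_space = list(product(range(1, 7), repeat=2))

def probability(event):
    """Classical probability: favourable outcomes / total outcomes."""
    favourable = [o for o in sample_space if event(o)]
    return Fraction(len(favourable), len(sample_space))

p_sum_7 = probability(lambda o: sum(o) == 7)
print(p_sum_7)  # 1/6

# Distribution of X = sum of the two dice, then mean = sum of x * P(x).
counts = Counter(sum(o) for o in sample_space)
pmf = {x: Fraction(c, len(sample_space)) for x, c in counts.items()}
mu = sum(x * p for x, p in pmf.items())
print(mu)  # 7
```

Using `Fraction` keeps the probabilities exact, which makes it easy to confirm that they sum to 1.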

Empirical Rule (68-95-99.7 Rule)

  • For data with an approximately bell-shaped (normal) distribution, about 68% of values lie within 1 standard deviation of the mean.

  • About 95% within 2 standard deviations.

  • About 99.7% within 3 standard deviations.
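The three percentages can be checked against the standard normal distribution using `statistics.NormalDist` from the Python standard library. The helper name `within_k_sd` is an illustrative choice.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def within_k_sd(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} sd: {within_k_sd(k):.4f}")
```

The computed shares are about 0.6827, 0.9545, and 0.9973, which the rule rounds to 68%, 95%, and 99.7%.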

The Normal Probability Distribution

Standard Normal Distribution and Z-Scores

  • The normal distribution is symmetric and bell-shaped, characterized by mean ($\mu$) and standard deviation ($\sigma$).

  • Z-Score Formula: $z = \frac{x - \mu}{\sigma}$
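A short sketch of a z-score calculation, assuming the standard definition: the value minus the mean, divided by the standard deviation. The exam numbers are made up.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (or below) the mean."""
    return (x - mu) / sigma

# Hypothetical exam: mean 70, standard deviation 10, observed score 85.
z = z_score(85, mu=70, sigma=10)
print(z)  # 1.5

# Share of a normal distribution that falls below this z-score.
print(round(NormalDist().cdf(z), 4))  # 0.9332
```

A positive z-score means the value is above the mean; a negative one means it is below.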

Sampling Distributions and Estimation

Sampling Distribution of the Sample Mean

  • Standard Error of the Mean: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
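A minimal sketch of the standard error, assuming the usual formula: the population standard deviation divided by the square root of the sample size. The values of σ and n are made up.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard deviation of the sampling distribution of the sample mean."""
    return sigma / sqrt(n)

print(standard_error(12, 36))   # 2.0
print(standard_error(12, 144))  # 1.0
```

Quadrupling the sample size halves the standard error, which is why larger samples give more precise estimates of the mean.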

Confidence Intervals

  • Estimate a population parameter using sample data, providing a range of plausible values.

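  • Confidence Interval for Mean (σ known): $\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$

  • Confidence Interval for Proportion: $\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$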
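Both intervals can be sketched as follows, assuming the usual large-sample forms: point estimate plus or minus a critical z-value times the standard error. The function names and the example numbers are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value, e.g. about 1.96 for 95% confidence."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def ci_mean_sigma_known(x_bar, sigma, n, confidence=0.95):
    """Interval for a population mean when sigma is known."""
    margin = z_critical(confidence) * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

def ci_proportion(successes, n, confidence=0.95):
    """Large-sample interval for a population proportion."""
    p_hat = successes / n
    margin = z_critical(confidence) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Made-up inputs: sample mean 50, sigma 10, n = 100.
print(ci_mean_sigma_known(50, 10, 100))
# Made-up inputs: 60 successes in 100 trials.
print(ci_proportion(60, 100))
```

With these inputs the mean interval is roughly (48.04, 51.96) and the proportion interval is roughly (0.504, 0.696).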

Hypothesis Testing

Formulating and Testing Hypotheses

  • Null Hypothesis (H₀): Statement being tested, usually a statement of no effect or no difference.

  • Alternative Hypothesis (H₁): The statement we are seeking evidence for.

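  • Test Statistic for One Mean: $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$ with $n - 1$ degrees of freedom (when σ is known, use $z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$ instead)

  • Test Statistic for Two Means: $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$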
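A sketch of both test statistics, under stated assumptions. For one mean it uses the t form: the sample mean minus the hypothesized mean, divided by s/√n. For two independent samples it uses the unpooled form with the null hypothesis that the two population means are equal, so the hypothesized difference is 0. The data are made up.

```python
from math import sqrt
from statistics import mean, stdev, variance

def t_one_mean(sample, mu_0):
    """t statistic for H0: population mean equals mu_0 (sigma unknown)."""
    n = len(sample)
    return (mean(sample) - mu_0) / (stdev(sample) / sqrt(n))

def t_two_means(sample_1, sample_2):
    """Unpooled t statistic for H0: mu_1 = mu_2 (independent samples)."""
    se = sqrt(variance(sample_1) / len(sample_1)
              + variance(sample_2) / len(sample_2))
    return (mean(sample_1) - mean(sample_2)) / se

print(round(t_one_mean([5.1, 4.9, 5.3, 5.2, 5.0], mu_0=5.0), 3))
print(round(t_two_means([10, 12, 14], [7, 9, 11]), 3))
```

These functions return only the test statistic. Turning it into a p-value requires the t distribution with the appropriate degrees of freedom, which the standard library does not provide.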

Inference on Two Population Parameters

Comparing Two Means

  • When comparing means from two independent samples, use a two-sample t-test.

  • Test statistic for two means (see above).

Inference on Categorical Data

Estimating Population Proportions

  • Sample proportions can be used to estimate population proportions and construct confidence intervals.

  • See confidence interval for proportions above.

Probability Tables and Expected Value

Using Probability Tables

  • Probability tables summarize the likelihood of different outcomes in a random experiment or process.

  • Expected Value: $E(X) = \sum x \, P(x)$
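A sketch of an expected-value calculation from a probability table, assuming the usual definition: the sum of each outcome times its probability. The raffle payouts and probabilities are invented for illustration.

```python
from fractions import Fraction

# Hypothetical probability table: net gain in dollars -> probability.
table = {
    -1: Fraction(90, 100),  # lose the ticket price
    4: Fraction(9, 100),    # small prize
    49: Fraction(1, 100),   # large prize
}

def expected_value(prob_table):
    """Sum of outcome * probability, after checking the table is valid."""
    if any(p < 0 for p in prob_table.values()):
        raise ValueError("probabilities must be non-negative")
    if sum(prob_table.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in prob_table.items())

ev = expected_value(table)
print(ev)  # -1/20
```

An expected value of -1/20 means an average loss of 5 cents per ticket over many plays, even though no single ticket ever loses exactly that amount.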

Identifying Outliers

1.5*IQR Rule for Outliers

  • Lower Fence: $Q_1 - 1.5 \times IQR$

  • Upper Fence: $Q_3 + 1.5 \times IQR$

  • Values outside these fences are considered outliers.
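The rule can be sketched as below, assuming the usual fences: Q1 minus 1.5 times the IQR, and Q3 plus 1.5 times the IQR. The data are made up, and the quartile method (`method="inclusive"`) is one convention among several, so borderline cases may differ from a textbook's answer.

```python
import statistics

def iqr_fences(data):
    """Return (lower fence, upper fence) using the 1.5 * IQR rule."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def outliers(data):
    """Values that fall outside the fences."""
    lower, upper = iqr_fences(data)
    return [x for x in data if x < lower or x > upper]

data = [2, 3, 3, 5, 7, 8, 9, 40]  # made-up sample with one extreme value
print(iqr_fences(data))
print(outliers(data))  # [40]
```

Here Q1 is 3 and Q3 is 8.25, so the fences are -4.875 and 16.125, and only the value 40 falls outside them.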

Summary Table: Key Statistical Formulas

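| Concept | Formula (LaTeX) |
|---|---|
| Mean | $\bar{x} = \frac{\sum x}{n}$ |
| Standard Deviation | $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$ |
| Z-Score | $z = \frac{x - \mu}{\sigma}$ |
| Confidence Interval (mean, σ known) | $\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$ |
| Confidence Interval (proportion) | $\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$ |
| Test Statistic (one mean) | $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$ |
| Test Statistic (two means) | $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$ |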
