Fundamental Concepts and Applications in Introductory Statistics

Study Guide - Smart Notes

Tailored notes based on your materials, expanded with key definitions, examples, and context.

Introduction to Experimental Design in Statistics

Controlled Experiments vs. Observational Studies

Understanding the difference between controlled experiments and observational studies is crucial in statistics, as it affects the interpretation of results and the ability to infer causality.

  • Controlled Experiment: The researcher actively manipulates one or more variables (treatment variables) to observe their effect on other variables (response variables), while controlling for confounding factors.

  • Observational Study: The researcher observes and records data without manipulating variables. No treatment is assigned by the researcher.

  • Example: A study where dogs are randomly assigned to receive either a restricted or unrestricted diet is a controlled experiment. If researchers simply observe dogs' diets and health outcomes without intervention, it is an observational study.
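The random assignment that makes the dog study a controlled experiment can be sketched in a few lines of Python. This is only an illustration: the subject labels, the group names used as dictionary keys, and the `randomly_assign` helper are assumptions made for the example, not part of the original study.

```python
import random


def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(subjects)      # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "restricted": shuffled[:half],
        "unrestricted": shuffled[half:],
    }


# Hypothetical subject labels, used only for illustration.
dogs = [f"dog_{i}" for i in range(1, 11)]
groups = randomly_assign(dogs, seed=1)
print(groups)
```

Because chance alone decides which dog gets which diet, confounding factors tend to balance out across the two groups. In an observational study there is no such step.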

Variables in Statistical Studies

Variables are characteristics or properties that can vary among subjects in a study. They are classified as follows:

  • Explanatory (Independent) Variable: The variable that is manipulated or categorized to observe its effect on another variable. Also called the treatment variable.

  • Response (Dependent) Variable: The outcome or variable that is measured to assess the effect of the explanatory variable.

  • Example: In a diet restriction study on dogs, food intake is the treatment variable, and outcomes like weight, body fat, and lifespan are response variables.

Types of Data and Variables

Quantitative vs. Categorical Variables

Variables in statistics are broadly classified as quantitative or categorical, which determines the type of analysis and graphical representation used.

  • Quantitative (Numerical) Variables: Variables that represent measurable quantities and can be expressed numerically (e.g., age, height, weight).

  • Categorical (Qualitative) Variables: Variables that represent categories or groups (e.g., political party, sandwich type).

  • Example: Age is a quantitative variable, while sandwich choice (Hamburger, Cheeseburger, Double Double) is categorical.

Descriptive Statistics and Graphical Summaries

Graphical Representation of Data

Visualizing data helps in understanding distributions, patterns, and relationships among variables.

  • Bar Graphs and Pie Charts: Used for categorical variables to show the frequency or proportion of each category.

  • Histograms: Used for quantitative variables to display the distribution of data across intervals (bins).

  • Example: A bar graph or pie chart can display the distribution of political parties among US presidents. A histogram can show the distribution of ages at inauguration.
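Behind both kinds of graph is a simple count. A histogram counts how many values fall in each bin, and a bar graph or pie chart uses the proportion in each category. The sketch below shows both steps in Python, using the five heights from the practice problems and the sandwich row totals from the table later in these notes. The helper names, the bin start of 60, and the bin width of 5 are assumptions chosen for illustration.

```python
from collections import Counter


def bin_counts(values, start, width):
    """Count how many values fall in each interval [lo, lo + width)."""
    counts = Counter()
    for v in values:
        lo = start + width * ((v - start) // width)  # left edge of v's bin
        counts[lo] += 1
    return dict(sorted(counts.items()))


def category_proportions(counts):
    """Convert category counts into proportions of the total."""
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


heights = [68, 64, 62, 77, 74]
print(bin_counts(heights, start=60, width=5))
# {60: 2, 65: 1, 70: 1, 75: 1}

sandwich_totals = {"Hamburger": 52, "Cheeseburger": 103, "Double Double": 145}
print(category_proportions(sandwich_totals))
```

The bin counts are the bar heights of the histogram. The category proportions are the bar heights of a bar graph, or the slice sizes of a pie chart.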

Describing Distributions

Key features of a distribution include its shape, center, and spread.

  • Shape: Common shapes include unimodal (one peak), bimodal (two peaks), symmetric, and skewed (left or right).

  • Center: Measures of central tendency include the mean and median.

  • Spread: Measures of variability include standard deviation and interquartile range (IQR).

  • Example: A histogram of ages at inauguration may be unimodal and roughly symmetric. A histogram of SAT scores may be skewed left or right.

Summary Statistics

Summary statistics provide numerical descriptions of data sets.

  • Mean ($\bar{x}$): The arithmetic average of a set of values, $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.

  • Median: The middle value when data are ordered.

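  • Standard Deviation ($s$): Measures the typical distance of data points from the mean $\bar{x}$. For a sample of $n$ values, $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$.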

  • Interquartile Range (IQR): The difference between the 75th percentile (Q3) and the 25th percentile (Q1).

  • Example: For a set of ages at inauguration: Mean = 55, Median = 55, Standard Deviation = 6.6, IQR = 7.
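As a minimal sketch, the four summary statistics above can be computed with Python's standard `statistics` module. The data are the five heights from the practice problems later in these notes, and the `summarize` helper is a name introduced here for illustration. Quartile conventions differ between textbooks and software, so the IQR shown is the one produced by the module's default ("exclusive") method.

```python
import statistics


def summarize(values):
    """Return the mean, median, sample standard deviation, and IQR of the values."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),  # sample SD: divides by n - 1
        "iqr": q3 - q1,
    }


heights = [68, 64, 62, 77, 74]
summary = summarize(heights)
print(summary)
```

For these heights the mean is 69, the median is 68, the sample standard deviation is about 6.4, and the IQR under this quartile method is 12.5.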

Interpreting and Analyzing Tables

Purpose of Tables in Statistics

Tables are used to organize and summarize data, often for comparison or classification.

Sandwich         18-22   23-30   31-50   51+   Total
Hamburger           22      15      10     5      52
Cheeseburger        35      30      25    13     103
Double Double       57      39      35    14     145
Total              114      84      70    32     300

Example Interpretation: The table above summarizes the sandwich preferences of 300 adults by age group. It allows calculation of proportions, such as the proportion of adults aged 31 or older, or the proportion of 18-22 year olds who ordered a cheeseburger.
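The two proportions mentioned above can be worked out directly from the table counts. The sketch below stores the table as a Python dictionary. The variable and function names are assumptions made for this example.

```python
AGE_GROUPS = ["18-22", "23-30", "31-50", "51+"]

# Counts copied from the sandwich table, one list per row, in AGE_GROUPS order.
table = {
    "Hamburger": [22, 15, 10, 5],
    "Cheeseburger": [35, 30, 25, 13],
    "Double Double": [57, 39, 35, 14],
}


def column_total(age_group):
    """Total number of adults in one age group, summed over all sandwiches."""
    idx = AGE_GROUPS.index(age_group)
    return sum(row[idx] for row in table.values())


grand_total = sum(sum(row) for row in table.values())

# Proportion of all adults who are aged 31 or older: (70 + 32) / 300.
p_31_or_older = (column_total("31-50") + column_total("51+")) / grand_total

# Proportion of 18-22 year olds who ordered a cheeseburger: 35 / 114.
cheeseburger_18_22 = table["Cheeseburger"][AGE_GROUPS.index("18-22")]
p_cheeseburger_given_18_22 = cheeseburger_18_22 / column_total("18-22")

print(round(p_31_or_older, 3))               # 0.34
print(round(p_cheeseburger_given_18_22, 3))  # 0.307
```

The first proportion divides by the grand total of 300. The second is a conditional proportion, so it divides by the 18-22 column total of 114.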

Statistical Reasoning and Inference

Evaluating Effectiveness in Experiments

To determine if a treatment is effective, compare the outcomes between treatment and control groups.

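  • Proportion Calculation: The proportion of patients with improved symptoms is calculated as: $\hat{p} = \frac{\text{number of patients who improved}}{\text{total number of patients in the group}}$

  • Example: If 205 out of 270 patients who took medication improved, the proportion is $\hat{p} = \frac{205}{270} \approx 0.759$.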

  • Comparative Effectiveness: Compare the improvement rate in the treatment group to the placebo group to assess effectiveness.
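The comparison between the two groups can be sketched as follows. The treatment counts (205 of 270) come from the example above. The placebo counts are hypothetical placeholders, because these notes do not give them, and the `improvement_rate` helper is a name introduced for illustration.

```python
def improvement_rate(improved, total):
    """Proportion of patients in a group whose symptoms improved."""
    if total <= 0:
        raise ValueError("total must be positive")
    return improved / total


# Treatment group counts from the example in the notes.
treatment_rate = improvement_rate(205, 270)

# HYPOTHETICAL placebo counts, for illustration only (not given in the notes).
PLACEBO_IMPROVED = 150
PLACEBO_TOTAL = 270
placebo_rate = improvement_rate(PLACEBO_IMPROVED, PLACEBO_TOTAL)

difference = treatment_rate - placebo_rate
print(f"treatment: {treatment_rate:.3f}")
print(f"placebo (hypothetical): {placebo_rate:.3f}")
print(f"difference: {difference:.3f}")
```

A clearly higher improvement rate in the treatment group suggests the medication is effective. Whether a given difference is larger than chance alone would produce is a question for formal inference.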

Practice Problems and Applications

Sample Calculations

  • Mean Height Calculation: For heights 68, 64, 62, 77, 74: $\bar{x} = \frac{68 + 64 + 62 + 77 + 74}{5} = \frac{345}{5} = 69$

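  • Standard Deviation Calculation: Applying the sample formula (dividing by $n - 1$) to the heights above, whose mean is 69: $s = \sqrt{\frac{(-1)^2 + (-5)^2 + (-7)^2 + 8^2 + 5^2}{5 - 1}} = \sqrt{\frac{164}{4}} = \sqrt{41} \approx 6.4$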

Describing Histograms

  • Unimodal, Skewed Left: A histogram with one peak and a longer tail to the left.

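  • More Variable: The distribution whose values are spread over a wider range, with more data far from the center, is more variable. Uneven bar heights alone do not indicate greater variability.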

Summary Table: Types of Variables and Graphs

Variable Type   Examples                         Appropriate Graph
Quantitative    Age, Height, SAT Score           Histogram, Boxplot
Categorical     Political Party, Sandwich Type   Bar Graph, Pie Chart

Additional info: These notes synthesize and expand upon the provided quiz and worksheet content, adding definitions, formulas, and context for clarity and completeness.
