Introduction to Statistics: Concepts, Variables, and Study Design
Study Guide - Smart Notes
1. Introduction to the Practice of Statistics
1.1 What is Statistics?
Statistics is the science of collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions. It also involves providing a measure of confidence in any conclusions drawn. A key aspect of statistics is understanding and describing variability in data.
Objective: To describe and understand sources of variability.
1.2 The Process of Statistics
The process of statistics involves several key steps to ensure valid and reliable conclusions:
Identify the Research Objective: Clearly state the question or objective of the study.
Collect Data: Obtain data relevant to the research objective, typically from a population or a sample.
Organize and Summarize Data: Use tables, charts, and numerical summaries to describe the data.
Draw Conclusions: Use statistical methods to make inferences about the population based on the sample data.
1.3 Parameters vs. Statistics
Parameter: A numerical summary that describes a characteristic of a population.
Statistic: A numerical summary that describes a characteristic of a sample.
Example: If you survey 100 students and find that 55% prefer online classes, 55% is a statistic. If you know that 60% of all students in the university prefer online classes, 60% is a parameter.
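The distinction can be sketched in a few lines of Python. This is a minimal illustration, not part of the original materials: the population size of 10,000 and the fixed random seed are assumptions chosen so the run is repeatable. The parameter is computed from every individual, while the statistic is computed from a random sample of 100 and will usually differ somewhat from the parameter.

```python
import random

# Hypothetical population: 10,000 students, exactly 60% of whom prefer
# online classes (True = prefers online). The size is an assumption.
population = [True] * 6000 + [False] * 4000

# Parameter: a numerical summary computed from the whole population.
parameter = sum(population) / len(population)

# Statistic: a numerical summary computed from a sample of 100 students.
rng = random.Random(1)  # fixed seed (assumption) so the run is repeatable
sample = rng.sample(population, 100)
statistic = sum(sample) / len(sample)

print(f"parameter (population proportion): {parameter:.2f}")
print(f"statistic (sample proportion):     {statistic:.2f}")
```

Changing the seed draws a different sample and gives a different statistic, while the parameter stays fixed. That sample-to-sample variation is the variability statistics tries to describe.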
2. Types of Variables
2.1 Qualitative vs. Quantitative Variables
Qualitative (Categorical) Variables: Variables that classify individuals into categories or groups. Examples: gender, color, type of car.
Quantitative Variables: Variables that provide numerical measures of individuals. Arithmetic operations such as addition and subtraction can be performed. Examples: height, weight, age.
2.2 Discrete vs. Continuous Variables
Discrete Variable: A quantitative variable that has a finite or countable number of possible values (e.g., number of students in a class).
Continuous Variable: A quantitative variable that has an infinite number of possible values that are not countable; it can take any value within a given range (e.g., height, weight).
2.3 Levels of Measurement
Nominal Level: Values are names, labels, or categories with no inherent order (e.g., gender, color).
Ordinal Level: Values can be arranged in a ranked or specific order, but differences between values are not meaningful (e.g., class rankings).
Interval Level: Values have meaningful differences, but there is no true zero point (e.g., temperature in Celsius).
Ratio Level: Values have meaningful differences and a true zero point, allowing for ratios (e.g., height, weight).
Example: Classifying Variables
Education level: ordinal, qualitative
Temperature (in Celsius): interval, quantitative
Number of vending machines: ratio, quantitative
Student present for class: nominal, qualitative
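The levels of measurement build on one another, which can be sketched as an ordered enumeration. The names `Level` and `meaningful_operations` are hypothetical helpers introduced only for this illustration, and the cutoff "interval or higher means quantitative" mirrors the classifications in the example above.

```python
from enum import IntEnum


class Level(IntEnum):
    """Levels of measurement, ordered from least to most structure."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


def meaningful_operations(level):
    """Return the comparisons that make sense at a given level."""
    ops = ["sort into categories"]
    if level >= Level.ORDINAL:
        ops.append("rank in order")
    if level >= Level.INTERVAL:
        ops.append("compare differences")
    if level >= Level.RATIO:
        ops.append("compare ratios (true zero)")
    return ops


# The four variables from the classification example.
variables = {
    "Education level": Level.ORDINAL,
    "Temperature (Celsius)": Level.INTERVAL,
    "Number of vending machines": Level.RATIO,
    "Student present for class": Level.NOMINAL,
}

for name, level in variables.items():
    kind = "quantitative" if level >= Level.INTERVAL else "qualitative"
    print(f"{name}: {level.name.lower()}, {kind}")
    for op in meaningful_operations(level):
        print(f"  - {op}")
```

Each level keeps every operation of the levels below it and adds one more, which is why a ratio-level variable supports all four.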
3. Sources of Data and Types of Studies
3.1 Sources of Data
Census: Data collected from all individuals in a population.
Existing Sources: Data collected previously and available for analysis (e.g., government databases).
Collecting Data: Data collected specifically for the current study, often through surveys or experiments.
3.2 Observational Studies vs. Experiments
Observational Study: Observes individuals and measures variables of interest without attempting to influence the responses. It can show an association between variables but cannot establish causation.
Experiment: Deliberately imposes a treatment on individuals and observes their responses. A well-designed experiment can establish causation.
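What separates an experiment from an observational study is that the researcher decides who receives the treatment. One common way to do this, which these notes do not cover in detail, is random assignment. The sketch below is a minimal illustration under that assumption; `randomly_assign` and the participant labels are hypothetical.

```python
import random


def randomly_assign(individuals, seed=None):
    """Shuffle the individuals and split them into two groups."""
    rng = random.Random(seed)
    shuffled = list(individuals)  # copy so the input is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    treatment = shuffled[:half]   # will receive the treatment
    control = shuffled[half:]     # will not receive the treatment
    return treatment, control


# Hypothetical participants, labeled subject_1 through subject_20.
participants = [f"subject_{i}" for i in range(1, 21)]
treatment, control = randomly_assign(participants, seed=42)

print("treatment group:", sorted(treatment))
print("control group:  ", sorted(control))
```

Because chance alone decides each individual's group, the two groups should be similar on average in everything except the treatment. In an observational study, individuals end up in their groups on their own, so this guarantee is missing.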
3.3 Types of Observational Studies
Cross-sectional Study: Observes individuals at a single point in time or over a very short period.
Case–control Study: Retrospective; looks back in time to compare individuals who have a certain characteristic (cases) with those who do not (controls).
Cohort Study: Prospective; follows a group (cohort) of individuals over a long period to observe outcomes.
Example Table: Types of Observational Studies
| Study Type | Time Frame | Key Feature |
|---|---|---|
| Cross-sectional | Present | Snapshot at one point in time |
| Case–control | Past | Retrospective comparison of cases and controls |
| Cohort | Future | Follow group over time to observe outcomes |
4. Confounding and Lurking Variables
4.1 Confounding Variables
Confounding occurs when the effects of two or more explanatory variables are not separated, making it unclear which variable is causing changes in the response variable.
4.2 Lurking Variables
A lurking variable is an explanatory variable that was not considered in a study but affects the value of the response variable. Lurking variables can lead to confounding.
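A small simulation can show how a lurking variable creates an association between two variables that do not affect each other. Everything here is an assumption made for illustration: the data are simulated, `x` and `y` each depend only on the lurking variable plus random noise, and `pearson_r` is a hand-written correlation helper.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed (assumption) for a repeatable run
n = 2000

# The lurking variable drives both x and y; x never influences y.
lurking = [rng.uniform(0, 1) for _ in range(n)]
x = [z + rng.gauss(0, 0.1) for z in lurking]
y = [z + rng.gauss(0, 0.1) for z in lurking]

# Ignoring the lurking variable, x and y look strongly associated.
r_overall = pearson_r(x, y)

# Holding the lurking variable nearly constant weakens the association.
band = [i for i, z in enumerate(lurking) if 0.45 <= z <= 0.55]
r_within = pearson_r([x[i] for i in band], [y[i] for i in band])

print(f"correlation ignoring the lurking variable: {r_overall:.2f}")
print(f"correlation with it held nearly constant:  {r_within:.2f}")
```

A study that recorded only `x` and `y` would see the strong overall correlation and could wrongly conclude that one causes the other.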
4.3 Explanatory vs. Response Variables
Explanatory Variable: The variable that is manipulated or categorized to observe its effect on the response variable.
Response Variable: The outcome or variable that is measured in the study.
Example: Happiness and Heart Disease
Explanatory Variable: Level of happiness
Response Variable: Occurrence of heart disease
Type of Study: Cohort
5. Summary Table: Key Terms and Definitions
| Term | Definition |
|---|---|
| Population | Entire group of individuals to be studied |
| Sample | Subset of the population selected for study |
| Parameter | Numerical summary of a population |
| Statistic | Numerical summary of a sample |
| Qualitative Variable | Classifies individuals into categories |
| Quantitative Variable | Numerical measure of individuals |
| Discrete Variable | Countable number of possible values |
| Continuous Variable | Infinite number of possible values within a range |
6. Important Formulas
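Sample Proportion: $\hat{p} = \frac{x}{n}$, where $x$ is the number of individuals in the sample with the characteristic of interest and $n$ is the sample size. In the example from Section 1.3, $\hat{p} = \frac{55}{100} = 0.55$.
Population Proportion: $p = \frac{X}{N}$, where $X$ is the number of individuals in the population with the characteristic of interest and $N$ is the population size.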
Additional info: These notes provide foundational concepts for understanding statistics, including types of variables, study design, and the distinction between parameters and statistics. Mastery of these topics is essential for further study in statistics and for interpreting data in real-world contexts.