Introduction to Data Collection in Statistics
Chapter 1: Data Collection
1.1 Introduction to the Practice of Statistics
Statistics is the science of collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions. It also involves providing a measure of confidence in any conclusions. The process of statistics is foundational for making informed decisions in various fields such as science, engineering, and business.
Statistics: The discipline concerned with data collection, analysis, and interpretation.
Information: In statistics, information refers to data that has been processed and organized to be meaningful.
Example: In August 2008, a news service surveyed 1006 adults aged 18 years or older in a certain country, asking whether they favored or opposed increasing the tax on gasoline to reduce dependence on foreign oil. The survey found that 60% opposed the increase. The goal is to use the sample to draw conclusions about the entire population of adults in the country.
Population: The entire group of individuals to be studied. Example: All adults aged 18 or older in the country.
Individual: A person or object that is a member of the population being studied. Example: One adult surveyed.
Sample: A subset of the population being studied. Example: The 1006 adults surveyed.
Descriptive Statistics: Consists of organizing and summarizing data, often through numerical summaries, tables, and graphs.
Statistic: A numerical measurement describing some characteristic of a sample.
Additional info: The distinction between population and sample is crucial for understanding how statistical inference works.
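The relationship between a population, a sample, and a statistic can be sketched with a small simulation. The population below is hypothetical: its size (100,000) and its true share of opponents (0.60) are assumptions chosen to echo the survey example, not figures from the survey itself. Only the sample size of 1006 comes from the example.

```python
import random

def sample_proportion(sample):
    """Return the fraction of 1s in a sample of 0/1 responses (a statistic)."""
    return sum(sample) / len(sample)

# Hypothetical population: 1 = opposes the tax increase, 0 = favors it.
# The true share (0.60) and the population size are assumptions for illustration.
rng = random.Random(0)  # fixed seed so the sketch is reproducible
population = [1 if rng.random() < 0.60 else 0 for _ in range(100_000)]

# Parameter: describes the whole population (usually unknown in practice).
parameter = sum(population) / len(population)

# Statistic: describes only the 1006 individuals in the sample.
sample = rng.sample(population, 1006)
statistic = sample_proportion(sample)

print(f"population parameter: {parameter:.3f}")
print(f"sample statistic:     {statistic:.3f}")
```

The two printed values are typically close but not identical, which is why a measure of confidence accompanies conclusions drawn from a sample.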
Types of Studies in Statistics
Designed Experiments
In a designed experiment, a researcher assigns individuals in a study to certain groups, intentionally changes the value of the explanatory variable, and records the value of the response variable for each group.
Pros: Can show cause and effect relationships; control over variables.
Cons: May be costly, time-consuming, or ethically challenging.
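One way a researcher might assign individuals to groups is by random assignment. The notes do not say how groups are formed, so the random split below is an assumption, and `assign_groups` and the subject IDs are hypothetical names used only for this sketch.

```python
import random

def assign_groups(individuals, seed=None):
    """Randomly split individuals into a treatment and a control group."""
    rng = random.Random(seed)
    shuffled = list(individuals)  # copy so the input list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

subjects = [f"subject_{i}" for i in range(1, 11)]  # hypothetical IDs
groups = assign_groups(subjects, seed=42)

# The researcher would then change the explanatory variable for the
# treatment group only and record the response variable for both groups.
print("treatment:", groups["treatment"])
print("control:  ", groups["control"])
```

Letting chance decide the groups helps balance characteristics the researcher did not measure across the treatment and control groups.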
Confounding and Lurking Variables
Confounding: Occurs when the effects of two or more explanatory variables are not separated, making it unclear which variable is responsible for changes in the response variable. A variable whose effect is mixed up with another in this way is called a confounding variable.
Lurking Variable: An explanatory variable that was not considered in a study but affects the value of the response variable. Lurking variables are typically related to both the explanatory and response variables.
Example: In a study examining the effect of exercise on weight loss, diet may be a lurking variable if not controlled.
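The exercise and diet example can be illustrated with simulated data. This is a deliberately extreme sketch: it assumes diet drives both exercise and weight loss while exercise has no direct effect at all. All numbers are made up for illustration.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(1)  # fixed seed so the sketch is reproducible
n = 2000

# Lurking variable: diet quality (standardized, hypothetical units).
diet = [rng.gauss(0, 1) for _ in range(n)]

# Assumption: people with better diets also tend to exercise more.
exercise = [d + rng.gauss(0, 1) for d in diet]

# Assumption: weight loss depends on diet only, not on exercise.
weight_loss = [2 * d + rng.gauss(0, 1) for d in diet]

r = pearson_r(exercise, weight_loss)
print(f"correlation between exercise and weight loss: {r:.2f}")
```

Exercise and weight loss come out clearly correlated even though, by construction, exercise does nothing here. A study that ignored diet could wrongly credit exercise for the effect.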
Types of Observational Studies
Observational studies involve collecting data without influencing the variables being measured. They are useful for identifying associations but cannot establish causation.
Cross-Sectional Study: Collects information about individuals at a specific point in time or over a very short period. Pros: Quick, inexpensive. Cons: Cannot establish causality; may not reflect changes over time.
Case-Control Study (Retrospective Study): These studies are retrospective, requiring individuals to recall past events or researchers to look at existing records. Individuals are grouped by outcome (those with the condition and those without) and then compared. Pros: Useful for rare conditions; relatively quick. Cons: Relies on memory or existing records, which can be incomplete or biased.
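Grouping by outcome can be sketched in a few lines. The records below are made up purely for illustration, and `exposure_rate_by_outcome` is a hypothetical helper, not part of any library.

```python
from collections import defaultdict

# Made-up records for illustration only: (has_condition, was_exposed).
records = [
    (True, True), (True, True), (True, False),      # cases
    (False, True), (False, False), (False, False),  # controls
]

def exposure_rate_by_outcome(records):
    """Group records by outcome and return the share exposed in each group."""
    counts = defaultdict(lambda: [0, 0])  # outcome -> [exposed, total]
    for has_condition, was_exposed in records:
        counts[has_condition][0] += int(was_exposed)
        counts[has_condition][1] += 1
    return {outcome: exposed / total
            for outcome, (exposed, total) in counts.items()}

rates = exposure_rate_by_outcome(records)
print(f"exposed among cases:    {rates[True]:.2f}")
print(f"exposed among controls: {rates[False]:.2f}")
```

A higher exposure rate among cases than among controls suggests an association, but because nothing was assigned by the researcher it does not establish causation.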
Summary Table: Types of Studies
| Type of Study | Description | Pros | Cons |
|---|---|---|---|
| Designed Experiment | Researcher manipulates variables and assigns groups | Can show causation; control over variables | Costly; ethical issues |
| Cross-Sectional Study | Data collected at one point in time | Quick; inexpensive | No causality; snapshot only |
| Case-Control Study | Retrospective; groups by outcome | Good for rare events; quick | Recall bias; incomplete records |
Key Formulas and Definitions
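Statistic: A numerical measurement describing some characteristic of a sample.
Population Parameter: A numerical measurement describing some characteristic of a population. Additional info: Parameters are typically unknown and estimated using statistics from samples.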
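As one illustration of how a statistic estimates a parameter, consider the sample proportion. The symbols $\hat{p}$, $x$, $n$, and $p$ are conventional notation assumed here and do not appear in the notes. In the gasoline tax survey, $n = 1006$ and the reported statistic is $\hat{p} = 0.60$. The count $x$ of respondents who opposed the increase is not given.

```latex
% Sample proportion (a statistic): x individuals with the characteristic
% out of n individuals in the sample.
\hat{p} = \frac{x}{n}

% The statistic \hat{p} estimates the unknown population proportion p
% (a parameter). In the survey example:
\hat{p} = 0.60, \qquad n = 1006
```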