Chapter 1: Introduction to Statistics – Study Notes
1.1: Review and Preview
Key Definitions in Statistics
Data: Observations such as measurements, genders, or survey responses that have been collected.
Statistics (the subject): The science of methods for planning studies and experiments, collecting data, organizing, summarizing, presenting, analyzing, and drawing conclusions based on data.
Population: The complete collection of all individuals (people, objects, events, etc.) to be studied.
Census: A collection of data from every member of the population.
Sample: A subset of members selected from the population.
Example: Population vs. Sample
American citizens – Population
Marketing interns at a new division – Sample
All registered voters – Population
People at a mall – Sample
College football fields – Population
Important Note: Data must be collected in an appropriate way. If not, the results may be invalid.
Example: Identifying Population and Sample
"A poll of 1000 Americans asks: 'Do you agree that Global Warming is a phenomenon that is occurring with certainty?'"
Population of interest: All Americans
Sample: The 1000 Americans polled
1.2: Statistical and Critical Thinking
Steps in Statistical Analysis
Prepare: Understand the context, source, and sampling method.
Analyze: Graph the data, explore the data, and apply statistical methods.
Conclude: Assess statistical significance and practical significance.
Statistical vs. Practical Significance
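Statistical significance: The observed result would be very unlikely to occur by chance alone.
Practical significance: The result is large enough to matter in real-world terms. A result can be statistically significant and still too small to be useful.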
Common Pitfalls in Statistical Analysis
Misleading Conclusions: Correlation does not imply causation.
Self-Reported Data: May be unreliable due to bias or dishonesty.
Small Samples: May not represent the population well.
Loaded Questions: Wording can influence responses.
Order of Questions: Earlier questions can affect later responses.
Nonresponse: When selected subjects do not respond.
Missing Data: Can lead to invalid results if not handled properly.
Percentages: Misuse or misunderstanding of percentages can mislead.
1.3: Types of Data
Definitions
Individuals: Objects described by a set of data (can be people, animals, things, etc.).
Variable: A characteristic of an individual that can take different values.
Parameter: A numerical measurement describing some characteristic of a population.
Statistic: A numerical measurement describing some characteristic of a sample.
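The parameter/statistic distinction can be illustrated with a short sketch. The population below is made-up data (hypothetical ages generated with a fixed seed), and the names `population`, `sample`, `population_mean`, and `sample_mean` are chosen only for this illustration.

```python
import random
import statistics

# Hypothetical population: ages of 10,000 individuals (made-up data).
rng = random.Random(0)
population = [rng.randint(18, 90) for _ in range(10_000)]

# Parameter: describes the whole population (requires a census).
population_mean = statistics.mean(population)

# Statistic: describes a sample drawn from that population.
sample = rng.sample(population, 100)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two values are usually close but not identical. The parameter is fixed for a given population, while the statistic changes from sample to sample.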
Quantitative vs. Qualitative Data
Qualitative (Categorical) Data: Non-numeric categories or labels (e.g., color, gender).
Quantitative Data: Numeric values representing counts or measurements.
Types of Quantitative Data
Discrete: Countable values (e.g., number of eggs).
Continuous: Any value within a range (e.g., height, weight).
1.4: Levels of Measurement
Nominal: Categories only (e.g., colors, names, labels).
Ordinal: Categories with a meaningful order, but differences are not meaningful (e.g., rankings).
Interval: Ordered, differences are meaningful, but no true zero (e.g., temperature in Celsius).
Ratio: Ordered, differences and ratios are meaningful, true zero exists (e.g., height, weight).
Example: Levels of Measurement
| Variable | Level of Measurement |
|---|---|
| Class Ranking | Ordinal |
| Temperature in °F | Interval |
| Political Party | Nominal |
| Price of College Textbook | Ratio |
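One way to see the difference between the interval and ratio levels is to check whether a ratio survives a change of units. The sketch below uses illustrative numbers only (10 °C and 20 °C, 40 kg and 80 kg), and the helper `c_to_f` is a name introduced here for the example.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Interval level: temperature has no true zero.
a_c, b_c = 10.0, 20.0
a_f, b_f = c_to_f(a_c), c_to_f(b_c)

ratio_c = b_c / a_c  # 2.0 in Celsius
ratio_f = b_f / a_f  # 68 / 50 = 1.36 in Fahrenheit, so "twice as hot" is not meaningful

# Equal differences stay equal after conversion, so differences are meaningful.
equal_gaps_f = (c_to_f(20.0) - c_to_f(10.0)) == (c_to_f(30.0) - c_to_f(20.0))

# Ratio level: weight has a true zero, so ratios survive a unit change.
KG_TO_LB = 2.20462
w1_kg, w2_kg = 40.0, 80.0
ratio_kg = w2_kg / w1_kg
ratio_lb = (w2_kg * KG_TO_LB) / (w1_kg * KG_TO_LB)

print(ratio_c, ratio_f, equal_gaps_f, ratio_kg, round(ratio_lb, 6))
```

The temperature ratio changes with the unit because the zero point is arbitrary. The weight ratio does not, which is what "true zero" means in practice.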
1.5: Collecting Sample Data
Sampling Methods
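Simple Random Sample: A sample of size n chosen so that every possible sample of size n has the same chance of being selected (so every member of the population also has an equal chance of being selected).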
Stratified Sampling: Population divided into subgroups (strata), and random samples taken from each stratum.
Cluster Sampling: Population divided into clusters, some clusters are randomly selected, and all members of chosen clusters are sampled.
Systematic Sampling: Select every kth member from a list after a random start.
Multistage Sampling: Combination of sampling methods, often used for large populations.
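The first four methods above can be sketched in a few lines each. This is a minimal illustration, not a survey-grade implementation. The population of 100 members, the `group` label used as both stratum and cluster, and all function names are assumptions made for the example.

```python
import random

rng = random.Random(42)

# Hypothetical population: 100 members, each with a group label A-E (20 per group).
population = [{"id": i, "group": "ABCDE"[i % 5]} for i in range(100)]

def simple_random_sample(pop, n):
    # Every possible subset of size n is equally likely.
    return rng.sample(pop, n)

def systematic_sample(pop, k):
    # Random start among the first k members, then every kth member.
    start = rng.randrange(k)
    return pop[start::k]

def stratified_sample(pop, key, n_per_stratum):
    # Split the population into strata, then sample randomly within each one.
    strata = {}
    for member in pop:
        strata.setdefault(member[key], []).append(member)
    chosen = []
    for members in strata.values():
        chosen.extend(rng.sample(members, n_per_stratum))
    return chosen

def cluster_sample(pop, key, n_clusters):
    # Randomly pick whole clusters and keep every member of each chosen cluster.
    clusters = sorted({member[key] for member in pop})
    picked = set(rng.sample(clusters, n_clusters))
    return [member for member in pop if member[key] in picked]

print(len(simple_random_sample(population, 10)))         # 10 members
print(len(systematic_sample(population, 10)))            # 10 members
print(len(stratified_sample(population, "group", 2)))    # 2 from each of 5 strata
print(len(cluster_sample(population, "group", 2)))       # all 20 members of 2 clusters
```

Note the contrast between the last two functions: stratified sampling takes a few members from every group, while cluster sampling takes every member from a few groups.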
Other Sampling Methods
Convenience Sampling: Use results that are easy to get.
Voluntary Response Sampling: Individuals choose to participate.
Other Issues in Sampling
Undercoverage: Some groups in the population are left out.
Nonresponse: Selected individuals do not respond.
Response Bias: Behavior of respondent or interviewer influences results.
1.6: Experimental Design
Parts of an Experiment
Individuals: Subjects being studied.
Factors: Explanatory variables manipulated by the researcher.
Treatments: Different conditions applied to subjects.
Response Variable: Outcome measured in the experiment.
Levels: Different values of a factor.
Types of Experimental Design
Completely Randomized Design: Subjects are randomly assigned to treatments.
Block Design: Subjects are grouped into blocks based on a variable, then randomly assigned treatments within blocks.
Matched Pairs Design: Subjects are paired based on similarity, and the two members of each pair receive different treatments, assigned at random.
Experiment Terminology
Blinding: Subjects do not know which treatment they receive.
Double-Blind: Neither the subjects nor the researchers who interact with them know the treatment assignments.
Placebo Effect: Subjects respond to a treatment because they believe it is effective, not because it actually is.
Confounding: When the effects of two variables cannot be distinguished from each other.
Example: Completely Randomized Design
Suppose 120 subjects are randomly assigned to two groups: one receives a treatment, the other a placebo. The response variable is measured and compared between groups.
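A minimal sketch of the random assignment in this example follows. The equal 60/60 split and the subject IDs 1 to 120 are assumptions. The notes only say that the 120 subjects are randomly assigned to two groups.

```python
import random

rng = random.Random(1)  # fixed seed so the assignment is reproducible

subjects = list(range(1, 121))  # 120 hypothetical subject IDs
rng.shuffle(subjects)           # random order, so group membership is due to chance alone

treatment_group = subjects[:60]  # assumed equal split: first 60 get the treatment
placebo_group = subjects[60:]    # remaining 60 get the placebo

print(len(treatment_group), len(placebo_group))
```

Shuffling once and splitting the list guarantees that every subject lands in exactly one group and that the group sizes are fixed in advance.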
Example: Block Design
Subjects are grouped by age, then randomly assigned to treatments within each age group.
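The blocking step can be sketched as follows, under stated assumptions: 24 hypothetical subjects, three made-up age groups of 8 subjects each, and two treatment labels. The function name `assign_within_blocks` is introduced here for illustration.

```python
import random
from collections import defaultdict

rng = random.Random(2)

# Hypothetical subjects as (subject_id, age_group) pairs: 8 per age group.
age_groups = ["18-29", "30-49", "50+"]
subjects = [(i, age_groups[i // 8]) for i in range(24)]
treatments = ["treatment", "placebo"]

def assign_within_blocks(subjects, treatments, rng):
    # Step 1: group subjects into blocks by the blocking variable (age group).
    blocks = defaultdict(list)
    for subject_id, block in subjects:
        blocks[block].append(subject_id)

    # Step 2: randomize separately inside each block, alternating treatments
    # over the shuffled order so each block is balanced.
    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)
        for position, subject_id in enumerate(members):
            assignment[subject_id] = treatments[position % len(treatments)]
    return assignment

assignment = assign_within_blocks(subjects, treatments, rng)
print(assignment)
```

Because the randomization happens inside each age group, every group contributes equally to both treatments, so age cannot be confounded with the treatment effect.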
Example: Matched Pairs Design
Each subject receives both treatments in random order, or similar subjects are paired and the two treatments are randomly assigned within each pair.
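Both variants of the matched pairs design can be sketched briefly. The subject names, the pairings, and the treatment labels "A" and "B" are all hypothetical, chosen only to make the randomization step concrete.

```python
import random

rng = random.Random(3)
treatments = ["A", "B"]

# Variant 1: each subject receives both treatments in random order.
subjects = ["s1", "s2", "s3", "s4"]
orders = {subject: rng.sample(treatments, 2) for subject in subjects}

# Variant 2: similar subjects are paired, and a coin flip decides
# which member of each pair receives which treatment.
pairs = [("p1", "p2"), ("p3", "p4"), ("p5", "p6")]

def assign_pairs(pairs, rng):
    assignment = {}
    for first, second in pairs:
        if rng.random() < 0.5:  # coin flip: swap who gets treatment A
            first, second = second, first
        assignment[first] = "A"
        assignment[second] = "B"
    return assignment

pair_assignment = assign_pairs(pairs, rng)
print(orders)
print(pair_assignment)
```

In both variants the comparison is made within a pair (or within a single subject), which removes much of the variation between individuals.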