Statistics Fundamentals: Chapters 1 & 2 Review
Introduction to Statistics
Statistics is the science of collecting, organizing, analyzing, and interpreting data to make informed decisions. It involves understanding populations and samples, and using various methods to summarize and draw conclusions from data.
Population: The entire collection of individuals or items under study.
Sample: A subset of the population selected for analysis.
Parameter: A numerical description of a population characteristic.
Statistic: A numerical description of a sample characteristic.
Branches of Statistics:
Descriptive Statistics: Summarizes data using graphs, tables, and numerical measures.
Inferential Statistics: Uses sample data to make generalizations about a population.
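To make the parameter/statistic distinction concrete, here is a minimal sketch. The population values, the sample size of 4, and the fixed seed are all hypothetical choices made for illustration only.

```python
import random
import statistics

# Hypothetical population: ages of all 10 members of a small club.
population = [23, 35, 41, 29, 52, 38, 47, 31, 26, 44]

# Parameter: computed from every member of the population.
population_mean = statistics.mean(population)

# Statistic: computed from a sample (a subset of the population).
rng = random.Random(0)  # fixed seed so the example is repeatable
sample = rng.sample(population, 4)
sample_mean = statistics.mean(sample)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

Using the sample mean to estimate the population mean is an instance of inferential statistics; reporting either number on its own is descriptive.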
Types of Data
Data can be classified based on their nature and measurement.
Qualitative Data: Attributes, labels, or non-numerical entries.
Quantitative Data: Numbers that are measured or counted.
Levels of Measurement
Measurement levels determine the type of statistical analysis that can be performed.
Nominal: Qualitative categories (names or labels) only; no mathematical computations can be made (e.g., gender, colors).
Ordinal: Data can be arranged in order, but differences are not meaningful (e.g., rankings).
Interval: Ordered data with meaningful differences, but no inherent zero; a value of zero marks a position on the scale rather than the absence of the quantity (e.g., temperature in Celsius).
Ratio: Interval data with a true zero; ratios are meaningful (e.g., height, weight).
Statistical Study Designs
Statistical studies are designed to collect and analyze data effectively.
Observational Study: Researcher observes and measures characteristics without influencing the subjects.
Experiment: Researcher applies a treatment and observes its effects, often using randomization and replication.
Types of Samples
Sampling methods are used to select representative subsets from populations.
Stratified Sample: Population divided into strata, and samples are taken from each stratum.
Cluster Sample: Population divided into clusters, and entire clusters are randomly selected.
Systematic Sample: Every nth member is selected from a list after a random start.
Convenience Sample: Members are selected based on ease of access.
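The sampling methods above can be sketched in a few lines of Python. This is only an illustration: the 20-member population, its group labels, the sample sizes, and all function names are hypothetical, and the same labels are reused as strata in one function and as clusters in another purely to keep the example short.

```python
import random

rng = random.Random(42)  # fixed seed so runs are repeatable

# Hypothetical population: 20 members, each tagged with a group label.
population = [{"id": i, "group": "ABCD"[i % 4]} for i in range(20)]

def by_group(members):
    """Map each group label to the list of members carrying it."""
    groups = {}
    for m in members:
        groups.setdefault(m["group"], []).append(m)
    return groups

def stratified_sample(members, per_stratum):
    """Randomly pick `per_stratum` members from every stratum."""
    picked = []
    for stratum in by_group(members).values():
        picked.extend(rng.sample(stratum, per_stratum))
    return picked

def cluster_sample(members, n_clusters):
    """Randomly pick whole clusters and keep every member in them."""
    groups = by_group(members)
    chosen = rng.sample(sorted(groups), n_clusters)
    return [m for label in chosen for m in groups[label]]

def systematic_sample(members, step):
    """Take every `step`-th member after a random starting position."""
    start = rng.randrange(step)
    return members[start::step]

strat = stratified_sample(population, 2)   # 2 members from each of 4 strata
clust = cluster_sample(population, 2)      # all members of 2 clusters
syst = systematic_sample(population, 5)    # every 5th member
convenience = population[:5]               # simply the first five on the list
```

Note the contrast: the stratified sample contains some members of every group, while the cluster sample contains every member of some groups.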
Frequency Distribution
Frequency distributions organize data into classes or intervals, showing the number of entries in each class.
Class Limits: The smallest and largest data values that can belong to a class.
Midpoint: The average of the lower and upper class limits.
Class Boundaries: Values that separate classes.
Frequency (f): Number of data entries in the class.
Relative Frequency: Proportion of data in each class.
Example Frequency Distribution Table
| Class (lb) | Midpoint | Class Boundaries | Frequency | Relative Frequency | Cumulative Frequency |
|---|---|---|---|---|---|
| 101-112 | 106.5 | 100.5-112.5 | 3 | 0.11 | 3 |
| 113-124 | 118.5 | 112.5-124.5 | 1 | 0.04 | 4 |
| 125-136 | 130.5 | 124.5-136.5 | 7 | 0.26 | 11 |
| 137-148 | 142.5 | 136.5-148.5 | 9 | 0.33 | 20 |
| 149-160 | 154.5 | 148.5-160.5 | 7 | 0.26 | 27 |
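As a check on the table above, the derived columns can be recomputed from the class limits and frequencies alone. This is a minimal sketch; the variable names are illustrative, and relative frequencies are rounded to two decimals to match the table.

```python
# Lower and upper class limits with frequencies, copied from the table above.
classes = [(101, 112, 3), (113, 124, 1), (125, 136, 7), (137, 148, 9), (149, 160, 7)]

total = sum(f for _, _, f in classes)   # total number of entries
gap = classes[1][0] - classes[0][1]     # distance between adjacent classes (1 lb)

rows = []
cumulative = 0
for lower, upper, f in classes:
    cumulative += f
    rows.append({
        "class": f"{lower}-{upper}",
        "midpoint": (lower + upper) / 2,                     # average of the class limits
        "boundaries": (lower - gap / 2, upper + gap / 2),    # halfway between adjacent classes
        "frequency": f,
        "relative": round(f / total, 2),                     # proportion of all entries
        "cumulative": cumulative,                            # running total of frequencies
    })

for row in rows:
    print(row)
```

The last cumulative frequency equals the total number of entries (27), and the relative frequencies add to 1 up to rounding.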
Graphical Representation of Data
Frequency Histogram: Bar graph representing frequency distribution; vertical axis shows frequency, horizontal axis shows class boundaries.
Ogive: Line graph of cumulative frequency, plotted at the upper boundary of each class.
Stem-and-Leaf Plot: Displays data to show distribution and retain original values.
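A stem-and-leaf plot is simple enough to build by hand, and the sketch below does the same in code. It assumes non-negative whole numbers, with the tens part as the stem and the units digit as the leaf; the data set and the function name `stem_and_leaf` are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return display lines using the tens part as stem and units digit as leaf."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    lines = []
    # Include every stem between the smallest and largest, even if it has no leaves.
    for stem in range(min(stems), max(stems) + 1):
        leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}".rstrip())
    return lines

# Hypothetical data set of two-digit whole numbers.
data = [23, 31, 25, 47, 31, 38, 29, 45, 52, 36]
for line in stem_and_leaf(data):
    print(line)
```

Each original value can be read back from its stem and leaf (for example, stem 4 with leaf 7 is 47), which is what distinguishes this display from a histogram.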
Measures of Central Tendency
Central tendency measures describe the center of a data set.
Mean: The sum of the data entries divided by the number of entries. Population Mean: $\mu = \frac{\sum x}{N}$. Sample Mean: $\bar{x} = \frac{\sum x}{n}$.
Median: The middle value when data are ordered.
Mode: The value that occurs most frequently.
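The three measures can be computed with Python's standard `statistics` module. The scores below are a hypothetical data set chosen so that the three measures differ.

```python
import statistics

# Hypothetical data set (e.g., quiz scores).
scores = [7, 9, 6, 9, 8, 10, 9, 6]

mean = statistics.mean(scores)        # sum of entries divided by their count
median = statistics.median(scores)    # middle of the ordered data
modes = statistics.multimode(scores)  # every value tied for most frequent

print(mean, median, modes)
```

With an even number of entries, the median is the mean of the two middle values of the ordered data, which is why it need not be one of the entries. `multimode` returns a list because a data set can have more than one mode.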
Measures of Variation
Variation measures describe the spread of data.
Range: Difference between the largest and smallest values.
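Variance: Average of the squared deviations from the mean; for a sample, the sum of squares is divided by $n - 1$ rather than $n$. Population Variance: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$. Sample Variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$.
Standard Deviation: Square root of the variance. Population Standard Deviation: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$. Sample Standard Deviation: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$.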
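The sketch below works through range, variance, and standard deviation step by step, treating the same hypothetical data set first as a whole population (divide by N) and then as a sample (divide by n - 1). The data values are illustrative assumptions.

```python
import math
import statistics

# Hypothetical data set.
data = [4, 8, 6, 5, 3, 4]

data_range = max(data) - min(data)                    # largest minus smallest

mean = sum(data) / len(data)
sum_of_squares = sum((x - mean) ** 2 for x in data)   # sum of squared deviations

population_variance = sum_of_squares / len(data)      # divide by N
sample_variance = sum_of_squares / (len(data) - 1)    # divide by n - 1
population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)

print(data_range, sum_of_squares, population_variance, sample_variance)
```

The standard library offers the same results directly through `statistics.pvariance`, `statistics.variance`, `statistics.pstdev`, and `statistics.stdev`. The sample versions are always slightly larger than the population versions for the same data.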
Example Calculation Table
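| Salary x | Deviation x - μ | Squares (x - μ)² |
|---|---|---|
| 48 | -3.5 | 12.25 |
| 49 | -2.5 | 6.25 |
| 51 | -0.5 | 0.25 |
| 54 | 2.5 | 6.25 |
| 55 | 3.5 | 12.25 |
| Sum | | 88.5 |
Note: Only some rows of the data set are listed. The deviations shown correspond to μ = 51.5, and the listed squares add to 37.5, so the 88.5 total of squared deviations includes entries not shown here.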
Empirical Rule (68-95-99.7 Rule)
The empirical rule applies to data with a symmetric, bell-shaped (approximately normal) distribution and describes how much of the data lie near the mean.
About 68% of data lie within one standard deviation of the mean.
About 95% of data lie within two standard deviations of the mean.
About 99.7% of data lie within three standard deviations of the mean.
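The three percentages can be checked against an exact normal distribution using `statistics.NormalDist` from the standard library. The mean of 64.3 and standard deviation of 2.6 are hypothetical values chosen only to show concrete intervals.

```python
from statistics import NormalDist

# Hypothetical bell-shaped data: mean 64.3 and standard deviation 2.6.
mean, sd = 64.3, 2.6
dist = NormalDist(mean, sd)

shares = []
for k in (1, 2, 3):
    low, high = mean - k * sd, mean + k * sd
    share = dist.cdf(high) - dist.cdf(low)   # proportion between low and high
    shares.append(share)
    print(f"within {k} sd: {low:.1f} to {high:.1f} covers {share:.1%}")
```

The proportions do not depend on the chosen mean and standard deviation; any normal distribution gives the same three values, which is why the rule can be stated in general.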
Coefficient of Variation (CV)
The coefficient of variation expresses the standard deviation as a percentage of the mean, allowing comparison between data sets with different units or means.
Population CV: $CV = \frac{\sigma}{\mu} \cdot 100\%$
Sample CV: $CV = \frac{s}{\bar{x}} \cdot 100\%$
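Because the CV is unit-free, it allows a comparison that raw standard deviations do not. The sketch below computes the sample CV, (s / x̄) · 100%, for two hypothetical samples measured in different units; the data and the helper name `sample_cv` are assumptions for illustration.

```python
import statistics

# Hypothetical samples measured in different units.
heights_in = [72, 74, 68, 76, 74, 69]          # inches
weights_lb = [180, 168, 225, 201, 189, 192]    # pounds

def sample_cv(values):
    """Sample coefficient of variation: (s / x-bar) * 100, in percent."""
    return statistics.stdev(values) / statistics.mean(values) * 100

cv_heights = sample_cv(heights_in)
cv_weights = sample_cv(weights_lb)

print(f"heights CV: {cv_heights:.1f}%")
print(f"weights CV: {cv_weights:.1f}%")
```

In this made-up data the weights have the larger CV, so they vary more relative to their mean than the heights do, even though inches and pounds cannot be compared directly.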
Quartiles
Quartiles divide ordered data into four equal parts.
Q1: First quartile, median of the lower half of data.
Q2: Second quartile, median of the data set.
Q3: Third quartile, median of the upper half of data.
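The median-of-halves description above translates directly into code. This sketch assumes one common convention: when the number of entries is odd, the middle entry is left out of both halves. Other conventions exist and can give slightly different quartiles. The data set and function name are hypothetical.

```python
import statistics

def quartiles(values):
    """Q1, Q2, Q3 using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]            # entries below the median position
    upper = ordered[half + n % 2:]    # entries above it (middle entry skipped when n is odd)
    return (
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
    )

# Hypothetical data set with 15 entries.
data = [13, 9, 18, 15, 14, 21, 7, 10, 11, 20, 5, 18, 37, 16, 17]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)
```

About one quarter of the entries fall at or below Q1, one half at or below Q2, and three quarters at or below Q3.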
Z-scores
The z-score indicates how many standard deviations a value is from the mean.
Formula: $z = \frac{x - \mu}{\sigma}$
A z-score can be negative, positive, or zero.
For $z = 0$, the value equals the mean.
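A short sketch of the z-score computation, z = (x - mean) / standard deviation, showing a negative, a zero, and a positive result. The mean of 70 and standard deviation of 8 are hypothetical.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Hypothetical population: mean 70, standard deviation 8.
mean, sd = 70, 8
for x in (54, 70, 82):
    print(x, z_score(x, mean, sd))
```

Here 54 lies two standard deviations below the mean (z = -2), 70 equals the mean (z = 0), and 82 lies one and a half standard deviations above it (z = 1.5).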
Summary Table: Key Statistical Measures
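| Measure | Definition | Formula |
|---|---|---|
| Mean | Average value | $\mu = \frac{\sum x}{N}$ (population), $\bar{x} = \frac{\sum x}{n}$ (sample) |
| Median | Middle value | - |
| Mode | Most frequent value | - |
| Range | Difference between max and min | $\text{max} - \text{min}$ |
| Variance | Average squared deviation | $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$ (population), $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$ (sample) |
| Standard Deviation | Square root of variance | $\sigma = \sqrt{\sigma^2}$ (population), $s = \sqrt{s^2}$ (sample) |
| Coefficient of Variation | SD as % of mean | $CV = \frac{\sigma}{\mu} \cdot 100\%$ (population), $CV = \frac{s}{\bar{x}} \cdot 100\%$ (sample) |
| Z-score | Standardized value | $z = \frac{x - \mu}{\sigma}$ |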