Statistics Fundamentals: Chapters 1 & 2 Review

Study Guide - Smart Notes

Introduction to Statistics

Statistics is the science of collecting, organizing, analyzing, and interpreting data to make informed decisions. Two ideas run through everything that follows: the distinction between a population and a sample, and the methods used to summarize data and draw conclusions from them.

Population: The entire collection of individuals or items under study.
Sample: A subset of the population selected for analysis.
Parameter: A numerical description of a population characteristic.
Statistic: A numerical description of a sample characteristic.
Branches of Statistics:
- Descriptive Statistics: Summarizes data using graphs, tables, and numerical measures.
- Inferential Statistics: Uses sample data to make generalizations about a population.
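To make the parameter/statistic distinction concrete, here is a minimal Python sketch. The population values, the sample size, and the seed are all made up for illustration; the only point is that the parameter describes the whole population while the statistic describes the sample drawn from it.

```python
import random
import statistics

# Hypothetical population: ages of all 12 members of a small club (made-up data).
population = [23, 31, 27, 45, 38, 29, 52, 34, 41, 26, 36, 30]

# Parameter: a numerical description of the whole population.
population_mean = statistics.mean(population)

# Sample: a subset of the population, here 4 members chosen at random.
rng = random.Random(0)  # fixed seed (arbitrary) so the sketch is repeatable
sample = rng.sample(population, 4)

# Statistic: a numerical description of the sample.
sample_mean = statistics.mean(sample)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

Using the sample mean to estimate the population mean is an instance of inferential statistics; simply reporting either value is descriptive statistics.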

Types of Data

Data can be classified based on their nature and measurement.

Qualitative Data: Attributes, labels, or non-numerical entries.
Quantitative Data: Numbers that are measured or counted.

Levels of Measurement

Measurement levels determine the type of statistical analysis that can be performed.

Nominal: Data are categorized using names, labels, or qualities; no mathematical computations are meaningful (e.g., gender, colors).
Ordinal: Data can be arranged in order, but differences between entries are not meaningful (e.g., rankings).
Interval: Ordered data with meaningful differences; a zero entry marks a position on the scale rather than an absence of the quantity, so ratios are not meaningful (e.g., temperature in Celsius).
Ratio: Interval data with an inherent (true) zero; ratios are meaningful (e.g., height, weight).

Statistical Study Designs

Statistical studies are designed to collect and analyze data effectively.

Observational Study: Researcher observes and measures characteristics without influencing the subjects.
Experiment: Researcher applies a treatment and observes its effects, often using randomization and replication.

Types of Samples

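Sampling methods determine how members of a population are selected for a sample. Methods that rely on random selection tend to give more representative samples than those that do not.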

Stratified Sample: The population is divided into strata (groups that share a characteristic), and a random sample is taken from each stratum.
Cluster Sample: The population is divided into clusters, one or more clusters are randomly selected, and every member of each selected cluster is included.
Systematic Sample: Every nth member is selected from a list after a random start.
Convenience Sample: Members are selected based on ease of access; this often leads to biased results.
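The four sampling methods above can be sketched in Python as follows. This is an illustrative sketch, not a standard library API: the population, the "group" labels, the function names, and the seed are all hypothetical.

```python
import random

rng = random.Random(42)  # fixed seed (an arbitrary choice) for repeatability

# Hypothetical population: 20 members, each tagged with a group label.
population = [{"id": i, "group": "A" if i < 10 else "B"} for i in range(20)]

def stratified_sample(pop, key, per_stratum):
    """Randomly draw `per_stratum` members from every stratum."""
    strata = {}
    for member in pop:
        strata.setdefault(member[key], []).append(member)
    return [m for group in strata.values() for m in rng.sample(group, per_stratum)]

def cluster_sample(pop, key, n_clusters):
    """Randomly pick whole clusters and keep every member of each one."""
    labels = sorted({member[key] for member in pop})
    chosen = set(rng.sample(labels, n_clusters))
    return [m for m in pop if m[key] in chosen]

def systematic_sample(pop, step):
    """Take every `step`-th member after a random starting position."""
    start = rng.randrange(step)
    return pop[start::step]

def convenience_sample(pop, size):
    """Take whoever is easiest to reach, here simply the first `size` members."""
    return pop[:size]

strat = stratified_sample(population, "group", 3)
clus = cluster_sample(population, "group", 1)
syst = systematic_sample(population, 5)
conv = convenience_sample(population, 5)
print(len(strat), len(clus), len(syst), len(conv))
```

Note the contrast between the first two: a stratified sample takes some members from every group, while a cluster sample takes every member from some groups.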

Frequency Distribution

Frequency distributions organize data into classes or intervals, showing the number of entries in each class.

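Class Limits: The smallest (lower limit) and largest (upper limit) data values that can belong to a class.
Midpoint: The average of the lower and upper class limits, (lower limit + upper limit) / 2.
Class Boundaries: The numbers that separate classes without leaving gaps between them, halfway between one class's upper limit and the next class's lower limit.
Frequency (f): Number of data entries in the class.
Relative Frequency: Proportion of the data that falls in the class, f / n, where n is the total number of entries.
Cumulative Frequency: Sum of the frequencies of the class and all previous classes.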

Example Frequency Distribution Table

Class (lb)	Midpoint	Class Boundaries	Frequency	Relative Frequency	Cumulative Frequency
101-112	106.5	100.5-112.5	3	0.11	3
113-124	118.5	112.5-124.5	1	0.04	4
125-136	130.5	124.5-136.5	7	0.26	11
137-148	142.5	136.5-148.5	9	0.33	20
149-160	154.5	148.5-160.5	7	0.26	27
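As a check on the table, the derived columns can be recomputed from the class limits and frequencies alone. The following Python sketch does that; the variable names are illustrative, and the relative frequencies are rounded to two decimal places to match the table.

```python
# Class limits and frequencies copied from the example table above.
classes = [(101, 112), (113, 124), (125, 136), (137, 148), (149, 160)]
frequencies = [3, 1, 7, 9, 7]

n = sum(frequencies)  # total number of data entries
# Gap between one class's upper limit and the next class's lower limit.
gap = classes[1][0] - classes[0][1]

rows = []
cumulative = 0
for (lower, upper), f in zip(classes, frequencies):
    cumulative += f
    rows.append({
        "class": f"{lower}-{upper}",
        "midpoint": (lower + upper) / 2,
        "boundaries": (lower - gap / 2, upper + gap / 2),
        "frequency": f,
        "relative": round(f / n, 2),
        "cumulative": cumulative,
    })

for row in rows:
    print(row)
```

The last cumulative frequency equals n (27 here). The rounded relative frequencies sum to 1.00 in this table, though rounding can make the sum differ slightly from 1 in general.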

Graphical Representation of Data

Frequency Histogram: Bar graph of a frequency distribution; the vertical axis shows frequency, the horizontal axis shows class boundaries (or midpoints), and consecutive bars touch.
Ogive: Line graph of cumulative frequency, plotted at the upper class boundary of each class.
Stem-and-Leaf Plot: Displays data to show distribution and retain original values.
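A stem-and-leaf plot is simple enough to build as text. The sketch below assumes non-negative whole numbers, uses the last digit as the leaf and the remaining digits as the stem, and runs on a made-up data set; the function name is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return lines of a stem-and-leaf plot for non-negative whole numbers.

    The stem is every digit except the last; the leaf is the last digit.
    """
    groups = defaultdict(list)
    for v in sorted(values):
        groups[v // 10].append(v % 10)
    lines = []
    # Include empty stems so that gaps in the data remain visible.
    for stem in range(min(groups), max(groups) + 1):
        leaves = " ".join(str(leaf) for leaf in groups.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

# Hypothetical data set (not from the notes).
data = [23, 25, 31, 31, 38, 40, 47, 52, 52, 59]
for line in stem_and_leaf(data):
    print(line)
```

Because each leaf is a digit of an original entry, every data value can be read back from the plot, which a histogram does not allow.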

Measures of Central Tendency

Central tendency measures describe the center of a data set.

Mean: The sum of the data entries divided by the number of entries. Population Mean: $\mu = \frac{\sum x}{N}$. Sample Mean: $\bar{x} = \frac{\sum x}{n}$.
Median: The middle value when data are ordered.
Mode: The value that occurs most frequently.
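The three measures can be computed with Python's standard `statistics` module. The data set below is made up for illustration.

```python
import statistics

# Hypothetical data set (made up for illustration).
data = [3, 5, 5, 6, 8, 9, 13]

mean = sum(data) / len(data)         # sum of entries divided by number of entries
median = statistics.median(data)     # middle value of the ordered data
modes = statistics.multimode(data)   # every most-frequent value (ties are possible)

print("mean:", mean)
print("median:", median)
print("mode(s):", modes)
```

For these values the mean is 7.0, the median is 6, and the only mode is 5. The mean is pulled toward the large entry 13, while the median is not.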

Measures of Variation

Variation measures describe the spread of data.

Range: Difference between the largest and smallest values.
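Variance: Average of squared deviations from the mean (the sample version divides by n - 1 rather than n). Population Variance: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$. Sample Variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$.
Standard Deviation: Square root of the variance. Population Standard Deviation: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$. Sample Standard Deviation: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$.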
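The following Python sketch computes these measures step by step: deviations, squared deviations, their sum, then the population and sample versions. The data set is hypothetical, chosen so the population results come out to whole numbers.

```python
import math
import statistics

# Hypothetical data set (made up so the arithmetic is easy to follow).
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
mean = sum(data) / n

# Squared deviation of each entry from the mean, then their sum.
squared_deviations = [(x - mean) ** 2 for x in data]
sum_of_squares = sum(squared_deviations)

data_range = max(data) - min(data)

# Treating the data as a whole population: divide by n.
population_variance = sum_of_squares / n
population_sd = math.sqrt(population_variance)

# Treating the data as a sample: divide by n - 1.
sample_variance = sum_of_squares / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(data_range, population_variance, population_sd)
print(sample_variance, sample_sd)
# Cross-check against the standard library.
print(statistics.pvariance(data), statistics.variance(data))
```

Here the mean is 5, the sum of squares is 32, the population variance is 4, and the population standard deviation is 2. Dividing by n - 1 instead gives a slightly larger sample variance of 32/7.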

Example Calculation Table

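Salary x	Deviation x - μ	Squares (x - μ)²
48	-3.5	12.25
49	-2.5	6.25
51	-0.5	0.25
54	2.5	6.25
55	3.5	12.25
Sum		88.5

Note: The deviations imply μ = 51.5. The squares in the rows shown add up to 37.5, so the sum of 88.5 evidently includes entries that are not listed here.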

Empirical Rule (68-95-99.7 Rule)

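For data with a symmetric, bell-shaped (approximately normal) distribution, the empirical rule describes how much of the data lies within a given number of standard deviations of the mean.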

About 68% of data lie within one standard deviation of the mean.
About 95% of data lie within two standard deviations of the mean.
About 99.7% of data lie within three standard deviations of the mean.
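In practice the rule is applied by turning the mean and standard deviation into three intervals. A minimal Python sketch follows; the function name and the example values (mean 64, standard deviation 2.5) are hypothetical.

```python
def empirical_rule_intervals(mean, sd):
    """Return the (low, high) interval for 1, 2, and 3 standard deviations."""
    return {k: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}

# Hypothetical bell-shaped data set: mean 64 and standard deviation 2.5.
intervals = empirical_rule_intervals(64, 2.5)
for k, pct in zip((1, 2, 3), ("68%", "95%", "99.7%")):
    low, high = intervals[k]
    print(f"about {pct} of data lie between {low} and {high}")
```

The percentages are approximations that hold only when the distribution is roughly bell-shaped; for skewed data they can be far off.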

Coefficient of Variation (CV)

The coefficient of variation expresses the standard deviation as a percentage of the mean, allowing comparison between data sets with different units or means.

Population CV: $CV = \frac{\sigma}{\mu} \cdot 100\%$
Sample CV: $CV = \frac{s}{\bar{x}} \cdot 100\%$
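The sketch below shows why the CV is useful: it compares the spread of two samples measured in different units. The function name and both data sets are made up for illustration.

```python
import statistics

def coefficient_of_variation(data, population=False):
    """Standard deviation expressed as a percentage of the mean."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data) if population else statistics.stdev(data)
    return sd / mean * 100

# Hypothetical samples measured in different units (made-up values).
heights_in = [68, 70, 72, 74, 76]
weights_lb = [150, 170, 190, 210, 230]

cv_heights = coefficient_of_variation(heights_in)
cv_weights = coefficient_of_variation(weights_lb)
print(f"heights CV: {cv_heights:.1f}%")
print(f"weights CV: {cv_weights:.1f}%")
```

The standard deviations themselves (inches versus pounds) cannot be compared directly, but the percentages can: in this made-up example the weights vary more relative to their mean than the heights do.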

Quartiles

Quartiles divide an ordered data set into four approximately equal parts.

Q1: First quartile, median of the lower half of data.
Q2: Second quartile, median of the data set.
Q3: Third quartile, median of the upper half of data.
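The median-of-halves definition above translates directly into code. This Python sketch assumes one common textbook convention: when the number of entries is odd, the median itself is excluded from both halves. Software packages may use other conventions and return slightly different quartiles. The function name and data are illustrative.

```python
import statistics

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    When the number of entries is odd, the median itself is left out of
    both halves (a common textbook convention, assumed here).
    """
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

# Illustrative data set with 15 entries.
data = [13, 9, 18, 15, 14, 21, 7, 10, 11, 20, 5, 18, 37, 16, 17]
q1, q2, q3 = quartiles(data)
print("Q1:", q1, "Q2:", q2, "Q3:", q3)
```

For this data set the ordered entries are 5, 7, 9, 10, 11, 13, 14, 15, 16, 17, 18, 18, 20, 21, 37, which gives Q1 = 10, Q2 = 15, and Q3 = 18.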

Z-scores

The z-score indicates how many standard deviations a value is from the mean.

Formula: $z = \frac{x - \mu}{\sigma}$
A z-score can be negative, positive, or zero.
For $z = 0$, the value equals the mean.
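A short Python sketch of the z-score calculation, using a hypothetical population with mean 70 and standard deviation 8 (both values are made up):

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Hypothetical population: mean 70, standard deviation 8.
mu, sigma = 70, 8

above = z_score(86, mu, sigma)    # value above the mean: positive z
at_mean = z_score(70, mu, sigma)  # value equal to the mean: z is zero
below = z_score(62, mu, sigma)    # value below the mean: negative z

print(above, at_mean, below)
```

Here 86 is two standard deviations above the mean (z = 2.0), 70 is at the mean (z = 0.0), and 62 is one standard deviation below it (z = -1.0).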

Summary Table: Key Statistical Measures

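Measure	Definition	Formula
Mean	Average value	$\mu = \frac{\sum x}{N}$ (population) or $\bar{x} = \frac{\sum x}{n}$ (sample)
Median	Middle value	-
Mode	Most frequent value	-
Range	Difference between max and min	Range = max - min
Variance	Average squared deviation	$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$ (population) or $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$ (sample)
Standard Deviation	Square root of variance	$\sigma = \sqrt{\sigma^2}$ (population) or $s = \sqrt{s^2}$ (sample)
Coefficient of Variation	SD as % of mean	$CV = \frac{\sigma}{\mu} \cdot 100\%$ (population) or $CV = \frac{s}{\bar{x}} \cdot 100\%$ (sample)
Z-score	Standardized value	$z = \frac{x - \mu}{\sigma}$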
