Module 1 Study Notes: Types of Data, Frequency Distributions, and Descriptive Statistics
1.2 Types of Data
Quantitative vs. Qualitative Data
Data can be classified as either quantitative (numerical) or qualitative (categorical). Understanding the type of data is essential for selecting appropriate statistical methods.
Quantitative Data: Data that can be measured and expressed numerically (e.g., height, age, income).
Qualitative Data: Data that describes qualities or categories (e.g., gender, color, type of animal).
Discrete vs. Continuous Data
Discrete Data: Countable values, often integers (e.g., number of cars, number of students).
Continuous Data: Can take any value within a range (e.g., height, weight, temperature).
Examples and Applications
Number of traffic intersections (discrete)
Amount of rainfall (continuous)
Color of a car (qualitative)
2.1 Frequency Distributions
Constructing Frequency Tables
A frequency distribution organizes data into classes or intervals and shows how many data points fall into each class.
Class Limits: The smallest and largest data values that can belong to a class.
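Class Boundaries: Values that separate classes without gaps, found at the midpoint between the upper limit of one class and the lower limit of the next (e.g., 109.5 lies between the classes 100-109 and 110-119).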
Class Width: The difference between lower limits of consecutive classes.
Relative Frequency
The relative frequency of a class is the proportion of the total data that falls within that class.
Formula: $\text{relative frequency} = \dfrac{\text{class frequency}}{\text{sum of all frequencies}}$
Example Table
| Range (Home Value) | Frequency |
|---|---|
| 100-109 | 8 |
| 110-119 | 7 |
| 120-129 | 12 |
| 130-139 | 11 |
| 140-149 | 7 |
Additional info: Students may be asked to identify lower/upper class limits, class width, and class boundaries from such tables.
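The quantities defined in this section can be read off the example table with a few lines of Python. This is a minimal sketch: the class limits and frequencies are copied from the table, while the variable and function names are illustrative choices, not part of the course materials.

```python
# Class limits and frequencies copied from the example table.
classes = [(100, 109), (110, 119), (120, 129), (130, 139), (140, 149)]
frequencies = [8, 7, 12, 11, 7]

total = sum(frequencies)  # total number of observations

# Class width: difference between consecutive lower class limits.
class_width = classes[1][0] - classes[0][0]

# Gap between one class's upper limit and the next class's lower limit.
# Class boundaries sit halfway across that gap.
gap = classes[1][0] - classes[0][1]


def class_boundaries(lower, upper):
    """Return the (lower, upper) class boundaries for one class."""
    return (lower - gap / 2, upper + gap / 2)


# Relative frequency: class frequency divided by the sum of all frequencies.
relative_frequencies = [f / total for f in frequencies]

for (lower, upper), f, rf in zip(classes, frequencies, relative_frequencies):
    low_b, high_b = class_boundaries(lower, upper)
    print(f"{lower}-{upper}  boundaries {low_b}-{high_b}  freq {f:2d}  rel. freq {rf:.3f}")
```

The relative frequencies should sum to 1 (up to rounding), which is a quick check that no class was missed.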
2.4 Scatterplots, Correlation, and Regression
Scatterplots
A scatterplot is a graph that shows the relationship between two quantitative variables. Each point represents an observation.
Positive Correlation: As one variable increases, the other tends to increase.
Negative Correlation: As one variable increases, the other tends to decrease.
No Correlation: No apparent relationship between variables.
Correlation Coefficient
Measures the strength and direction of a linear relationship between two variables.
Symbol: $r$ (for sample data)
Range: $-1 \le r \le 1$
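To make the definition concrete, the sketch below computes the linear (Pearson) correlation coefficient, conventionally written r, directly from deviations about the means. The paired data are hypothetical and chosen only for illustration, and `pearson_r` is an illustrative helper name.

```python
import math

# Hypothetical paired data (not from the course materials):
# hours studied (x) and exam score (y).
x = [1, 2, 3, 4, 5]
y = [52, 60, 61, 70, 77]


def pearson_r(xs, ys):
    """Linear correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(x, y)
print(round(r, 3))  # close to +1: strong positive linear relationship
```

Negating every y value flips the sign of r without changing its magnitude, which matches the positive and negative correlation cases described above.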
Regression
Regression analysis estimates the relationship between variables, often using a line of best fit.
Equation of a regression line: $\hat{y} = b_0 + b_1 x$, where $b_0$ is the y-intercept and $b_1$ is the slope.
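One common way to obtain a line of best fit is the least-squares method, sketched below. The data are hypothetical, and the names `least_squares_line`, `intercept`, and `slope` are illustrative assumptions rather than notation from the course materials.

```python
# Hypothetical paired data (not from the course materials).
x = [1, 2, 3, 4, 5]
y = [52, 60, 61, 70, 77]


def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = least_squares_line(x, y)

# Predicted y for a new x value inside or near the observed range.
predicted = intercept + slope * 6
print(intercept, slope, predicted)
```

Predictions far outside the observed x values should be treated with caution, since the linear pattern is only supported by the data that were actually collected.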
3.1 Measures of Center
Mean, Median, and Mode
Measures of center describe the typical value in a data set.
Mean (Average): The sum of the data values divided by the number of values, $\bar{x} = \dfrac{\sum x}{n}$.
Median: The middle value when data are ordered.
Mode: The value that occurs most frequently.
Example
Data: 7, 12, 15, 15, 15, 8, 10, 14, 15
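Mean: $\bar{x} = \dfrac{111}{9} \approx 12.33$
Median: 14 (the fifth of the nine ordered values 7, 8, 10, 12, 14, 15, 15, 15, 15)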
Mode: 15
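The three measures can be checked for this data set with Python's standard `statistics` module. This is a small sketch; sorting the data first makes the middle position easy to see.

```python
import statistics

data = [7, 12, 15, 15, 15, 8, 10, 14, 15]

ordered = sorted(data)            # [7, 8, 10, 12, 14, 15, 15, 15, 15]
mean = statistics.mean(data)      # sum of the values (111) divided by the count (9)
median = statistics.median(data)  # middle (fifth) value of the nine ordered values
mode = statistics.mode(data)      # value that occurs most often

print(ordered)
print(round(mean, 2), median, mode)  # 12.33 14 15
```

Note that the median must be taken from the ordered list, not from the data in the order given.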
3.2 Measures of Variation
Range, Variance, and Standard Deviation
Range: Difference between the highest and lowest values.
Variance: Average of squared deviations from the mean.
Sample variance: $s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$
Standard Deviation: Square root of the variance.
Sample standard deviation: $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
Interpretation
A small standard deviation indicates data are clustered near the mean.
A large standard deviation indicates data are spread out.
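As an illustration, the sketch below computes the range, sample variance, and sample standard deviation for the data set used in the measures-of-center example, then compares two small hypothetical data sets that share a mean but differ in spread. The `tight` and `wide` lists are made up for this comparison.

```python
import statistics

data = [7, 12, 15, 15, 15, 8, 10, 14, 15]

data_range = max(data) - min(data)

n = len(data)
mean = sum(data) / n
# Sample variance divides the sum of squared deviations by n - 1.
sample_variance = sum((x - mean) ** 2 for x in data) / (n - 1)
sample_sd = sample_variance ** 0.5

print(data_range, round(sample_variance, 2), round(sample_sd, 2))

# Hypothetical data sets with the same mean (10) but different spread.
tight = [9, 10, 11]
wide = [0, 10, 20]
print(statistics.stdev(tight), statistics.stdev(wide))
```

The standard library's `statistics.variance` and `statistics.stdev` use the same n - 1 divisor, so they can serve as a cross-check on the manual calculation.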
3.3 Measures of Relative Standing
Percentiles and Z-Scores
Percentile: Indicates the value below which a given percentage of observations fall.
Z-Score: Measures how many standard deviations a value is from the mean.
Formula: $z = \dfrac{x - \bar{x}}{s}$ for sample data, or $z = \dfrac{x - \mu}{\sigma}$ for a population
Applications
Comparing scores from different distributions.
Identifying outliers (values whose z-scores are far from 0, commonly beyond ±2 or ±3 depending on the convention used, are often considered outliers).
Additional info: Students may be asked to find percentiles for given values using a provided table.
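The sketch below shows how z-scores allow scores from different distributions to be compared, and how a percentile can be estimated from raw data. The test scores, means, and standard deviations are hypothetical, and `percentile_of` uses one common convention (percentage of values strictly below the given value); other textbooks and software define percentiles slightly differently.

```python
import statistics


def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


def percentile_of(value, values):
    """Percentage of values strictly less than `value` (one common convention)."""
    below = sum(1 for v in values if v < value)
    return 100 * below / len(values)


# Hypothetical scores on two tests with different means and spreads.
z_test_a = z_score(85, mean=75, sd=10)  # 1 standard deviation above the mean
z_test_b = z_score(70, mean=60, sd=5)   # 2 standard deviations above the mean
print(z_test_a, z_test_b)

# Data set from the measures-of-center example.
data = [7, 12, 15, 15, 15, 8, 10, 14, 15]
z_lowest = z_score(min(data), statistics.mean(data), statistics.stdev(data))
print(round(z_lowest, 2), round(percentile_of(14, data), 1))
```

Although 85 is the higher raw score, the score of 70 has the larger z-score, so it is the better result relative to its own distribution.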