Statistical Measures: Central Tendency and Dispersion
Study Guide - Smart Notes
Measures of Central Tendency
Mean
The mean is the arithmetic average of a set of values and is commonly used to represent the central value in a data set.
Definition: The sum of all data values divided by the number of values.
Formula: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
Example: For data set {2, 4, 6}, the mean is $\frac{2 + 4 + 6}{3} = 4$.
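The definition above can be sketched in a few lines of Python. The function name `arithmetic_mean` is chosen here for illustration and is not part of the original notes.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


print(arithmetic_mean([2, 4, 6]))  # 4.0
```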
Median
The median is the middle value when data are arranged in order. It divides the data set into two equal halves.
Definition: The value separating the higher half from the lower half of a data set.
Calculation: If the number of values n is odd, the median is the middle value; if n is even, the median is the average of the two middle values.
Example: For {1, 3, 5}, median is 3; for {1, 3, 5, 7}, median is $\frac{3 + 5}{2} = 4$.
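A minimal sketch of the odd/even rule, assuming the data fit in a plain Python list. The helper name `median_value` is illustrative only.

```python
def median_value(values):
    """Return the middle value of the sorted data (average of two middles if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: a single middle value exists.
        return ordered[mid]
    # Even n: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_value([1, 3, 5]))     # 3
print(median_value([1, 3, 5, 7]))  # 4.0
```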
Mode
The mode is the value that appears most frequently in a data set.
Definition: The value(s) with the highest frequency.
Example: In {2, 2, 3, 4}, the mode is 2.
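Because a data set can have more than one mode, one possible sketch returns every value tied for the highest frequency. The function name `modes` is an illustrative choice.

```python
from collections import Counter


def modes(values):
    """Return all values that share the highest frequency, in sorted order."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([2, 2, 3, 4]))     # [2]
print(modes([1, 1, 2, 2, 3]))  # [1, 2]
```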
Mean of a Frequency Distribution
For grouped data, the mean is calculated using frequencies.
Formula: $\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$
Where: $f_i$ is the frequency of value $x_i$.
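As a rough illustration, the frequency table can be stored as a dictionary mapping each value to its frequency. The table below is made-up sample data, and `frequency_mean` is a hypothetical helper name.

```python
def frequency_mean(freq_table):
    """Mean of a frequency distribution: sum(f * x) / sum(f)."""
    total_frequency = sum(freq_table.values())
    if total_frequency == 0:
        raise ValueError("the frequency table contains no observations")
    weighted_sum = sum(x * f for x, f in freq_table.items())
    return weighted_sum / total_frequency


# Hypothetical table: value 1 occurs twice, 2 occurs three times, 3 occurs five times.
table = {1: 2, 2: 3, 3: 5}
print(frequency_mean(table))  # 2.3
```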
Weighted Mean
The weighted mean accounts for varying importance (weights) of data values.
Formula: $\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i}$
Example: Grades with different credit hours.
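The credit-hour example might look like the sketch below. The grade points and credit hours are invented for illustration, and `weighted_mean` is an assumed helper name.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * x) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical transcript: grade points weighted by credit hours.
grade_points = [4.0, 3.0, 2.0]
credit_hours = [3, 4, 1]
print(weighted_mean(grade_points, credit_hours))  # 3.25
```

The 4-credit course pulls the result toward 3.0 more strongly than the 1-credit course pulls it toward 2.0.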
Measures of Dispersion
Range
The range is the difference between the highest and lowest values in a data set.
Formula: $\text{Range} = x_{\max} - x_{\min}$
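A one-line sketch of the range, using Python's built-in `max` and `min`. The name `data_range` is illustrative.

```python
def data_range(values):
    """Difference between the largest and smallest values."""
    return max(values) - min(values)


print(data_range([3, 7, 1, 9, 4]))  # 8
```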
Standard Deviation
Standard deviation measures the spread of data around the mean.
Population Standard Deviation: $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$
Sample Standard Deviation: $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$
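Python's standard `statistics` module provides both versions: `pstdev` divides by N (population) and `stdev` divides by n - 1 (sample). The data set below is an illustrative example, not taken from the notes.

```python
from statistics import pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data; the mean is 5

population_sd = pstdev(data)  # divides the squared deviations by N
sample_sd = stdev(data)       # divides the squared deviations by n - 1

print(population_sd)          # 2.0
print(round(sample_sd, 3))    # 2.138
```

The sample value is slightly larger because dividing by n - 1 compensates for estimating the mean from the same data.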
Variance
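Variance is the average of the squared deviations from the mean; it is the square of the standard deviation. The sample version divides by n - 1 instead of n.
Population Variance: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
Sample Variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$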
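The two denominators can be compared side by side in a small sketch. The function name `variance` and its `sample` flag are assumptions made for this example.

```python
def variance(values, sample=True):
    """Average squared deviation from the mean (n - 1 denominator if sample=True)."""
    n = len(values)
    if n == 0 or (sample and n < 2):
        raise ValueError("not enough data points to compute the variance")
    avg = sum(values) / n
    squared_deviations = sum((x - avg) ** 2 for x in values)
    return squared_deviations / (n - 1 if sample else n)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data; squared deviations sum to 32
print(variance(data, sample=False))  # 4.0
print(round(variance(data), 3))      # 4.571
```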
Coefficient of Variation
The coefficient of variation expresses standard deviation as a percentage of the mean.
Formula: $CV = \frac{s}{\bar{x}} \times 100\%$
Use: Comparing variability between different data sets.
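To see why the coefficient of variation helps when units differ, the sketch below compares two made-up data sets measured in different units. It assumes the sample standard deviation is used, and `coefficient_of_variation` is an illustrative name.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    avg = mean(values)
    if avg == 0:
        raise ValueError("the coefficient of variation is undefined when the mean is 0")
    return stdev(values) / avg * 100


# Hypothetical measurements in different units.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [55, 60, 70, 80, 95]

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)
print(cv_height < cv_weight)  # True: the weights vary more relative to their mean
```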
Shape of a Distribution
The shape of a distribution describes how data values are spread around the center.
Symmetric: Mean = Median = Mode (for a distribution with a single peak)
Skewed Right: typically Mean > Median > Mode
Skewed Left: typically Mean < Median < Mode
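As a rough heuristic only, comparing the mean with the median gives a quick hint about the direction of skew. The function `skew_direction` and its `tolerance` parameter are assumptions for this sketch; equal mean and median do not prove that a distribution is symmetric.

```python
from statistics import mean, median


def skew_direction(values, tolerance=1e-9):
    """Guess the skew direction by comparing the mean with the median."""
    avg, mid = mean(values), median(values)
    if abs(avg - mid) <= tolerance:
        return "roughly symmetric"
    return "skewed right" if avg > mid else "skewed left"


print(skew_direction([1, 2, 2, 3, 10]))  # skewed right (mean 3.6 > median 2)
print(skew_direction([1, 8, 9, 9, 10]))  # skewed left (mean 7.4 < median 9)
print(skew_direction([1, 2, 3, 4, 5]))   # roughly symmetric
```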
Empirical Rule
The empirical rule applies to normal (bell-shaped) distributions and describes the percentage of data within one, two, and three standard deviations of the mean.
Approximately 68% within 1 standard deviation of the mean
Approximately 95% within 2 standard deviations of the mean
Approximately 99.7% within 3 standard deviations of the mean
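These percentages can be checked numerically. For a normal distribution, the fraction of values within k standard deviations of the mean equals erf(k / sqrt(2)), which the standard `math` module can evaluate. The helper name `fraction_within` is illustrative.

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(fraction_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```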
Chebyshev’s Theorem
Chebyshev’s theorem applies to all data sets, regardless of distribution shape.
At least $1 - \frac{1}{k^2}$ of data values lie within k standard deviations of the mean (for $k > 1$).
Example: For $k = 2$, at least $1 - \frac{1}{2^2} = 0.75$, or 75%, of data values lie within 2 standard deviations of the mean.
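The bound is easy to tabulate. This sketch uses an illustrative helper, `chebyshev_lower_bound`, and rejects k <= 1, where the theorem gives no useful information.

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k**2


print(chebyshev_lower_bound(2))            # 0.75
print(round(chebyshev_lower_bound(3), 4))  # 0.8889
```

Compared with the empirical rule, these guarantees are weaker (75% versus about 95% for k = 2) because they must hold for any distribution shape.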
Quartiles and Interquartile Range
Quartiles
Quartiles divide an ordered data set into four parts, each containing about 25% of the values.
Q1: 25th percentile
Q2: 50th percentile (median)
Q3: 75th percentile
Interquartile Range (IQR)
The interquartile range measures the spread of the middle 50% of data.
Formula: $IQR = Q_3 - Q_1$
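A short sketch using `statistics.quantiles` from the Python standard library. Textbooks and software use several slightly different quartile conventions, so results may differ from a hand calculation; the `"inclusive"` method is chosen here as one option, and the data are illustrative.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # illustrative data

# n=4 requests the three cut points that split the data into quarters.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print(q1, q2, q3)  # 3.0 5.0 7.0
print(iqr)         # 4.0
```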
Box and Whisker Plot
A box and whisker plot graphically displays the distribution of a data set using quartiles.
Shows minimum, Q1, median, Q3, and maximum values.
Helps identify outliers and visualize spread.
Identifying Outliers
Outliers are data values that are significantly different from others in the set.
Common rule: Outliers are values less than $Q_1 - 1.5 \times IQR$ or greater than $Q_3 + 1.5 \times IQR$.
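The 1.5 × IQR rule can be sketched as follows. The quartile method (`"inclusive"`), the helper name `find_outliers`, and the sample data are all assumptions for illustration; a different quartile convention may shift the fences slightly.

```python
from statistics import quantiles


def find_outliers(values):
    """Return values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in values if x < lower_fence or x > upper_fence]


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]  # illustrative data with one extreme value
print(find_outliers(data))  # [50]
```

Here Q1 = 3.25 and Q3 = 7.75, so the fences are -3.5 and 14.5, and only 50 falls outside them.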
Z-Score
The z-score indicates how many standard deviations a value is from the mean.
Formula: $z = \frac{x - \mu}{\sigma}$ (for a sample, $z = \frac{x - \bar{x}}{s}$)
Use: Standardizing values for comparison.
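A minimal sketch of the standardization step. The exam numbers are invented for illustration, and `z_score` is an assumed helper name.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if std_dev <= 0:
        raise ValueError("the standard deviation must be positive")
    return (x - mean) / std_dev


# Hypothetical exam: mean 70, standard deviation 10.
print(z_score(85, 70, 10))  # 1.5
print(z_score(60, 70, 10))  # -1.0
```

A positive z-score means the value is above the mean, and a negative one means it is below.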
Summary Table: Measures of Central Tendency and Dispersion
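| Measure | Definition | Formula |
|---|---|---|
| Mean | Arithmetic average | $\bar{x} = \frac{\sum x_i}{n}$ |
| Median | Middle value | Depends on data order |
| Mode | Most frequent value | Highest frequency |
| Range | Difference between max and min | $x_{\max} - x_{\min}$ |
| Standard Deviation | Spread around mean | $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$ |
| Variance | Average squared deviation | $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$ |
| Coefficient of Variation | Relative variability | $CV = \frac{s}{\bar{x}} \times 100\%$ |
| Interquartile Range | Middle 50% spread | $IQR = Q_3 - Q_1$ |
| Z-Score | Standardized value | $z = \frac{x - \mu}{\sigma}$ |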
Additional info: These statistical concepts are foundational for analyzing biological and physiological data, but are not specific to Anatomy & Physiology. They are more relevant to introductory statistics or biostatistics courses.