Statistical Measures: Central Tendency and Dispersion

Study Guide - Smart Notes

Measures of Central Tendency

Mean

The mean is the arithmetic average of a set of values and is commonly used to represent the central value in a data set.

Definition: The sum of all data values divided by the number of values.
Formula: $\bar{x} = \frac{\sum x_i}{n}$
Example: For data set {2, 4, 6}, the mean is $\frac{2 + 4 + 6}{3} = 4$.
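
As an illustration, the definition above can be sketched in Python. The function name `mean` is chosen here for the example and is not part of the original notes.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


print(mean([2, 4, 6]))  # 4.0
```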

Median

The median is the middle value when data are arranged in order. It divides the data set into two equal halves.

Definition: The value separating the higher half from the lower half of a data set.
Calculation: If n is odd, median is the middle value; if n is even, median is the average of the two middle values.
Example: For {1, 3, 5}, median is 3; for {1, 3, 5, 7}, median is $\frac{3 + 5}{2} = 4$.
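
The odd and even cases can be sketched in Python as follows. This is a minimal illustration; the function name `median` is an assumption made for the example.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: a single middle value exists.
        return ordered[mid]
    # Even n: average the two values on either side of the middle.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([1, 3, 5]))     # 3
print(median([1, 3, 5, 7]))  # 4.0
```

Sorting first matters: the median is defined on ordered data, so the function does not assume its input is already sorted.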

Mode

The mode is the value that appears most frequently in a data set.

Definition: The value(s) with the highest frequency.
Example: In {2, 2, 3, 4}, the mode is 2.
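
Because a data set can have more than one mode, a sketch in Python might return a list. The function name `modes` is hypothetical, chosen for this example.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    return sorted(value for value, count in counts.items() if count == top)


print(modes([2, 2, 3, 4]))     # [2]
print(modes([1, 1, 2, 2, 3]))  # [1, 2]
```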

Mean of a Frequency Distribution

For grouped data, the mean is calculated using frequencies.

Formula: $\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$
Where: $f_i$ is the frequency of value $x_i$.
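
A short Python sketch of this calculation, assuming the frequency table is stored as a dictionary that maps each value to its frequency. The table `scores` is made-up illustration data.

```python
def frequency_mean(freq_table):
    """Mean of a frequency distribution given as {value: frequency}."""
    total = sum(freq_table.values())
    if total == 0:
        raise ValueError("the frequency table contains no observations")
    return sum(x * f for x, f in freq_table.items()) / total


# Hypothetical table: value 1 occurs twice, 2 three times, 3 five times.
scores = {1: 2, 2: 3, 3: 5}
print(frequency_mean(scores))  # (1*2 + 2*3 + 3*5) / 10 = 2.3
```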

Weighted Mean

The weighted mean accounts for varying importance (weights) of data values.

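Formula: $\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i}$, where $w_i$ is the weight of value $x_i$.
Example: A grade point average, where each course grade is weighted by its credit hours.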
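
To make the credit-hour example concrete, here is a minimal Python sketch. The grades and credit hours are invented for illustration, and `weighted_mean` is a hypothetical helper name.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical courses: grade points earned and credit hours for each.
grade_points = [4.0, 3.0, 2.0]
credit_hours = [3, 4, 1]
print(weighted_mean(grade_points, credit_hours))  # (12 + 12 + 2) / 8 = 3.25
```

The unweighted mean of the same grades is 3.0, so the weighting shifts the result toward the courses that carry more credit hours.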

Measures of Dispersion

Range

The range is the difference between the highest and lowest values in a data set.

Formula: $\text{Range} = x_{\max} - x_{\min}$
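
A one-function Python sketch of the range. The name `data_range` is chosen for this example to avoid shadowing Python's built-in `range`.

```python
def data_range(values):
    """Difference between the largest and smallest value."""
    if not values:
        raise ValueError("the range of an empty data set is undefined")
    return max(values) - min(values)


print(data_range([3, 7, 1, 9]))  # 9 - 1 = 8
```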

Standard Deviation

Standard deviation measures the spread of data around the mean.

Population Standard Deviation: $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$
Sample Standard Deviation: $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$
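
The difference between the population and sample versions is only the divisor, which a short Python sketch can make visible. The function names and the data set are assumptions made for this example.

```python
import math


def population_sd(values):
    """Population standard deviation: divide the squared deviations by N."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one value is required")
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)


def sample_sd(values):
    """Sample standard deviation: divide the squared deviations by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are required")
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustration data with mean 5
print(population_sd(data))        # 2.0
print(round(sample_sd(data), 4))  # 2.1381
```

The sample value is slightly larger because dividing by n - 1 compensates for estimating the mean from the same data.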

Variance

Variance is the average of the squared deviations from the mean; it is the square of the standard deviation.

Population Variance: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
Sample Variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
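
Python's standard `statistics` module provides both versions, so a brief sketch can compare them on the same data. The data set is invented for illustration.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustration data with mean 5

# Population variance divides the sum of squared deviations (32) by N = 8.
pop_var = statistics.pvariance(data)
# Sample variance divides the same sum by n - 1 = 7.
samp_var = statistics.variance(data)

print(f"{pop_var:.4f}")   # 4.0000
print(f"{samp_var:.4f}")  # 4.5714
```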

Coefficient of Variation

The coefficient of variation expresses standard deviation as a percentage of the mean.

Formula: $CV = \frac{s}{\bar{x}} \times 100\%$ (for a population, $CV = \frac{\sigma}{\mu} \times 100\%$)
Use: Comparing variability between different data sets.
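
Because the coefficient of variation has no units, it can compare data sets measured on different scales. The following Python sketch uses the sample standard deviation; the heights and weights are made-up values and the function name is hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("the coefficient of variation is undefined when the mean is 0")
    return stdev(values) / m * 100


heights_cm = [160, 170, 180]  # hypothetical data
weights_kg = [50, 70, 90]     # hypothetical data
print(f"{coefficient_of_variation(heights_cm):.2f}%")  # 5.88%
print(f"{coefficient_of_variation(weights_kg):.2f}%")  # 28.57%
```

In this made-up example the weights vary far more, relative to their mean, than the heights do.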

Shape of a Distribution

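The shape of a distribution describes how data values are spread around the center, and it determines how the mean, median, and mode compare.

Symmetric (unimodal): Mean = Median = Mode
Skewed Right (long right tail): typically Mean > Median > Mode
Skewed Left (long left tail): typically Mean < Median < Mode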
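
As a rough illustration, comparing the mean with the median gives a quick hint about skew direction. This Python sketch is a rule of thumb only, not a formal skewness test: equal mean and median do not guarantee symmetry. The function name `skew_hint` and the data are assumptions made for the example.

```python
from statistics import mean, median


def skew_hint(values, tolerance=1e-9):
    """Rule-of-thumb skew direction based on the mean versus the median."""
    m, med = mean(values), median(values)
    if abs(m - med) <= tolerance:
        return "roughly symmetric"
    # A long right tail pulls the mean above the median, and vice versa.
    return "skewed right" if m > med else "skewed left"


print(skew_hint([1, 2, 3, 4, 5]))   # roughly symmetric
print(skew_hint([1, 2, 2, 3, 10]))  # skewed right (mean 3.6 > median 2)
print(skew_hint([1, 8, 9, 9, 10]))  # skewed left (mean 7.4 < median 9)
```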

Empirical Rule

The empirical rule applies to approximately normal (bell-shaped) distributions and gives the percentage of data within 1, 2, and 3 standard deviations of the mean.

Approximately 68% within 1 standard deviation
Approximately 95% within 2 standard deviations
Approximately 99.7% within 3 standard deviations
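
These percentages can be checked against an exact normal distribution. The sketch below uses `NormalDist` from Python's standard `statistics` module (available from Python 3.8); the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist


def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    standard_normal = NormalDist(mu=0, sigma=1)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.2%}")
# within 1 SD: 68.27%
# within 2 SD: 95.45%
# within 3 SD: 99.73%
```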

Chebyshev’s Theorem

Chebyshev’s theorem applies to all data sets, regardless of distribution shape.

At least $1 - \frac{1}{k^2}$ of data values lie within k standard deviations of the mean (for $k > 1$).
Example: For $k = 2$, at least $1 - \frac{1}{4} = 75\%$ of data lie within 2 standard deviations.
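
The bound is simple to compute for any k greater than 1. A minimal Python sketch, with a function name chosen for this example:

```python
def chebyshev_lower_bound(k):
    """Minimum share of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem is only informative for k > 1")
    return 1 - 1 / k ** 2


for k in (2, 3):
    print(f"k={k}: at least {chebyshev_lower_bound(k):.1%}")
# k=2: at least 75.0%
# k=3: at least 88.9%
```

These are guaranteed minimums for any distribution, which is why they are lower than the empirical-rule percentages for normal data.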

Quartiles and Interquartile Range

Quartiles

Quartiles divide data into four equal parts.

Q1: 25th percentile
Q2: 50th percentile (median)
Q3: 75th percentile

Interquartile Range (IQR)

The interquartile range measures the spread of the middle 50% of data.

Formula: $IQR = Q_3 - Q_1$
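
Quartiles and the IQR can be sketched with `statistics.quantiles` from Python's standard library. Note that textbooks and software use several slightly different quartile conventions; this sketch assumes the module's default "exclusive" method, so hand calculations with another convention may differ. The function name and data are illustrative.

```python
from statistics import quantiles


def quartiles_and_iqr(values):
    """Return (Q1, Q2, Q3, IQR) using the default 'exclusive' quantile method."""
    q1, q2, q3 = quantiles(values, n=4)
    return q1, q2, q3, q3 - q1


data = [1, 3, 5, 7, 9, 11, 13]  # illustration data
q1, q2, q3, iqr = quartiles_and_iqr(data)
print(q1, q2, q3)  # 3.0 7.0 11.0
print(iqr)         # 8.0
```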

Box and Whisker Plot

A box and whisker plot graphically displays the distribution of a data set using quartiles.

Shows minimum, Q1, median, Q3, and maximum values.
Helps identify outliers and visualize spread.

Identifying Outliers

Outliers are data values that are significantly different from others in the set.

Common rule: Outliers are values less than $Q_1 - 1.5 \times IQR$ or greater than $Q_3 + 1.5 \times IQR$.
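
The 1.5 × IQR rule can be sketched in Python as below. The quartiles come from `statistics.quantiles` with its default "exclusive" method, which is an assumption of this example; other quartile conventions can move the fences slightly. The function name and data are hypothetical.

```python
from statistics import quantiles


def find_outliers(values):
    """Values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in values if x < lower_fence or x > upper_fence]


# Illustration data: Q1 = 3.5, Q3 = 12.5, IQR = 9, so the fences are -10 and 26.
data = [1, 3, 5, 7, 9, 11, 13, 40]
print(find_outliers(data))  # [40]
```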

Z-Score

The z-score indicates how many standard deviations a value is from the mean.

Formula: $z = \frac{x - \mu}{\sigma}$ (for a sample, $z = \frac{x - \bar{x}}{s}$)
Use: Standardizing values for comparison.
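
Standardizing lets scores from different scales be compared directly. In this Python sketch the exam means and standard deviations are invented for illustration, and `z_score` is a hypothetical helper name.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("the standard deviation must be positive")
    return (x - mean) / sd


# Hypothetical exams: 85 on a test with mean 70 and SD 10,
# versus 60 on a test with mean 50 and SD 5.
print(z_score(85, 70, 10))  # 1.5
print(z_score(60, 50, 5))   # 2.0
```

Although 85 is the higher raw score, the 60 is further above its own class average in standard-deviation terms.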

Summary Table: Measures of Central Tendency and Dispersion

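Measure	Definition	Formula
Mean	Arithmetic average	$\bar{x} = \frac{\sum x_i}{n}$
Median	Middle value	Depends on data order
Mode	Most frequent value	Highest frequency
Range	Difference between max and min	$x_{\max} - x_{\min}$
Standard Deviation	Spread around mean	$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$
Variance	Average squared deviation	$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
Coefficient of Variation	Relative variability	$CV = \frac{s}{\bar{x}} \times 100\%$
Interquartile Range	Middle 50% spread	$IQR = Q_3 - Q_1$
Z-Score	Standardized value	$z = \frac{x - \mu}{\sigma}$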

Additional info: These statistical concepts are foundational for analyzing biological and physiological data, but are not specific to Anatomy & Physiology. They are more relevant to introductory statistics or biostatistics courses.