Numerical Descriptive Measures in Business Statistics

Numerical Descriptive Measures

Introduction

This chapter introduces the key numerical descriptive measures used in business statistics to summarize and describe the properties of numerical data. The focus is on measures of central tendency, variation, and shape, as well as methods for visualizing and interpreting data distributions.

Properties of Numerical Variables

Central Tendency, Variation, and Shape

  • Central Tendency: Indicates the extent to which values of a numerical variable group around a typical or central value.

  • Variation: Describes the amount of dispersion or scattering away from a central value.

  • Shape: Refers to the pattern of the distribution of values from the lowest to the highest value.

Measures of Central Tendency

The Mean

  • The arithmetic mean (or simply "mean") is the most common measure of central tendency.

  • For a sample of size $n$: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$

  • The mean is affected by extreme values (outliers).

The Median

  • The median is the "middle" value in an ordered array (50% above, 50% below).

  • Less sensitive to extreme values than the mean.

  • If the number of values is odd, the median is the middle number; if even, it is the average of the two middle numbers.

The Mode

  • The mode is the value that occurs most often in a data set.

  • Not affected by extreme values.

  • Can be used for both numerical and categorical data.

  • There may be no mode or several modes.

Choosing the Appropriate Measure

  • The mean is generally used unless outliers exist.

  • The median is preferred when data contain outliers.

  • In many cases, both mean and median are reported for a more complete summary.
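
To see why the median is preferred when outliers are present, the following minimal sketch compares the three measures on a small data set before and after one value is replaced by an extreme value. The data values are assumed for illustration only; `mean`, `median`, and `multimode` come from Python's standard `statistics` module.

```python
from statistics import mean, median, multimode

# Illustrative data (assumed): minutes needed to get ready on 10 mornings
times = [29, 31, 35, 39, 39, 40, 43, 44, 44, 52]

# Same data with the largest value replaced by an extreme value
times_outlier = times[:-1] + [102]

mean_clean = mean(times)
median_clean = median(times)      # even n: average of the two middle values
modes_clean = multimode(times)    # every value tied for the highest frequency

mean_outlier = mean(times_outlier)
median_outlier = median(times_outlier)

print("clean:  ", mean_clean, median_clean, modes_clean)
print("outlier:", mean_outlier, median_outlier)
```

In this sketch the mean shifts by five minutes when the outlier is introduced, while the median stays where it was.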

Summary Table: Measures of Central Tendency

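| Measure | Definition | Formula |
| --- | --- | --- |
| Arithmetic Mean | Sum of values divided by number of values | $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$ |
| Median | Middle value in ordered array | Depends on data order |
| Mode | Most frequently observed value | Identified by frequency |
| Geometric Mean | Average rate of change of a variable over time | $\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$ |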
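
The geometric mean matters most when values compound over time. As a hedged illustration, the sketch below applies it to yearly rates of return by converting each rate $R_i$ into a growth factor $1 + R_i$, taking the geometric mean of the factors, and subtracting 1. The investment figures are hypothetical; `geometric_mean` is part of Python's standard `statistics` module (Python 3.8 or later).

```python
from statistics import geometric_mean

# Hypothetical investment: $100,000 falls to $50,000, then recovers to $100,000
yearly_returns = [-0.50, 1.00]   # -50% in year 1, +100% in year 2

# Convert each rate of return into a growth factor (1 + R)
growth_factors = [1 + r for r in yearly_returns]

# Arithmetic mean of the rates suggests growth that never happened
arithmetic_rate = sum(yearly_returns) / len(yearly_returns)

# Geometric mean of the growth factors, minus 1, gives the compound rate
geometric_rate = geometric_mean(growth_factors) - 1

print(f"arithmetic mean rate: {arithmetic_rate:.2%}")
print(f"geometric mean rate:  {geometric_rate:.2%}")
```

The investment ends where it started, so a compound rate of about zero describes it better than the arithmetic mean of the two rates.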

Measures of Variation

Range

  • The range is the simplest measure of variation: the difference between the largest and smallest values.

  • Range = Maximum value - Minimum value

  • Can be misleading as it does not account for data distribution and is sensitive to outliers.

Sample Variance

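  • The sample variance is approximately the average of the squared deviations from the mean: the sum of squared deviations is divided by $n - 1$ rather than $n$.

  • Formula: $S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$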

Sample Standard Deviation

  • The standard deviation is the square root of the variance and is the most commonly used measure of variation.

  • Formula: $S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$

  • Has the same units as the original data.

  • Steps for calculation:

    1. Compute the difference between each value and the mean.

    2. Square each difference.

    3. Add the squared differences.

    4. Divide by $n - 1$ to get the sample variance.

    5. Take the square root to get the standard deviation.
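
The five steps above can be sketched directly in code. The data values are assumed for illustration, and the variable names are hypothetical; the range is included for comparison.

```python
import math

# Illustrative sample (assumed values)
data = [29, 31, 35, 39, 39, 40, 43, 44, 44, 52]
n = len(data)
x_bar = sum(data) / n

# Range: largest value minus smallest value
data_range = max(data) - min(data)

# Step 1: difference between each value and the mean
deviations = [x - x_bar for x in data]

# Step 2: square each difference
squared_deviations = [d ** 2 for d in deviations]

# Step 3: add the squared differences
sum_of_squares = sum(squared_deviations)

# Step 4: divide by n - 1 to get the sample variance
sample_variance = sum_of_squares / (n - 1)

# Step 5: take the square root to get the sample standard deviation
sample_sd = math.sqrt(sample_variance)

print(f"range = {data_range}, variance = {sample_variance:.2f}, sd = {sample_sd:.2f}")
```

The result should agree with `statistics.variance` and `statistics.stdev`, which also divide by $n - 1$.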

Coefficient of Variation (CV)

  • Measures relative variation as a percentage of the mean.

  • Formula: $CV = \left( \frac{S}{\bar{X}} \right) \times 100\%$

  • Useful for comparing variability between data sets with different units or means.
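
Because the coefficient of variation divides the standard deviation by the mean, it does not depend on the unit of measurement. The following minimal sketch shows this by expressing the same hypothetical prices in dollars and in cents; `coefficient_of_variation` is an illustrative helper, not a library function.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Return the sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

# Hypothetical prices, expressed in two different units
prices_dollars = [10, 20, 30]
prices_cents = [p * 100 for p in prices_dollars]

cv_dollars = coefficient_of_variation(prices_dollars)
cv_cents = coefficient_of_variation(prices_cents)

# The standard deviations differ by a factor of 100, the CVs do not
print(stdev(prices_dollars), stdev(prices_cents))
print(cv_dollars, cv_cents)
```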

Locating Extreme Outliers: Z-Score

  • The Z-score indicates how many standard deviations a data value is from the mean.

  • Formula: $Z = \frac{X - \bar{X}}{S}$

  • Values with $Z < -3.0$ or $Z > +3.0$ are considered extreme outliers.

  • Example: If the mean SAT score is 490 and the standard deviation is 100, a score of 620 has $Z = (620 - 490)/100 = 1.3$ and is not an outlier.
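
The SAT example can be checked with a short sketch. The cutoff of 3 standard deviations used here is the common rule of thumb and is an assumption of this example; the function names are hypothetical.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

def is_extreme_outlier(z, cutoff=3.0):
    """Flag Z-scores beyond the cutoff (assumed rule of thumb: 3.0)."""
    return abs(z) > cutoff

# Figures from the SAT example above
z_620 = z_score(620, mean=490, sd=100)

# A hypothetical score far above the mean, for contrast
z_800 = z_score(800, mean=490, sd=100)

print(z_620, is_extreme_outlier(z_620))
print(z_800, is_extreme_outlier(z_800))
```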

Shape of a Distribution

Skewness

  • Measures the extent to which data values are not symmetrical.

  • Left-skewed: Mean < Median; Skewness < 0

  • Symmetric: Mean = Median; Skewness = 0

  • Right-skewed: Median < Mean; Skewness > 0

Kurtosis

  • Measures the "peakedness" of the distribution curve.

  • Leptokurtic: Sharper peak than bell-shaped (Kurtosis > 0)

  • Mesokurtic: Bell-shaped (Kurtosis = 0)

  • Platykurtic: Flatter than bell-shaped (Kurtosis < 0)
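
The notes do not give formulas for skewness or kurtosis, so the sketch below assumes one common choice: moment-based skewness and excess kurtosis (kurtosis minus 3, so that a bell-shaped distribution scores 0). Spreadsheet and statistics packages often apply small-sample adjustments and may report slightly different numbers. The function name and data are hypothetical.

```python
def shape_measures(values):
    """Return (skewness, excess kurtosis) from central moments.

    Assumes the simple moment-based definitions, without the
    small-sample corrections used by some software.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n   # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n   # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skewness, excess_kurtosis

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 3, 4, 20]   # long tail toward large values
left_skewed = [-20, -4, -3, -2, -1]

skew_sym, kurt_sym = shape_measures(symmetric)
skew_right, _ = shape_measures(right_skewed)
skew_left, _ = shape_measures(left_skewed)

print(skew_sym, kurt_sym, skew_right, skew_left)
```

The evenly spread symmetric sample has zero skewness and negative excess kurtosis, which is consistent with a flatter-than-bell (platykurtic) shape.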

Exploring Numerical Data Using Quartiles

Quartiles and the Five-Number Summary

  • Quartiles split ranked data into four segments with equal numbers of values.

  • First quartile ($Q_1$): 25% of values are smaller.

  • Second quartile ($Q_2$): Median (50% of values are smaller).

  • Third quartile ($Q_3$): 75% of values are smaller.

  • Five-number summary: Minimum, $Q_1$, Median, $Q_3$, Maximum.

Locating Quartiles

  • To find the position of a quartile in ranked data:

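    • First quartile: $Q_1$ is the value at position $\frac{n + 1}{4}$

    • Second quartile: $Q_2$ is the value at position $\frac{n + 1}{2}$

    • Third quartile: $Q_3$ is the value at position $\frac{3(n + 1)}{4}$

    where $n$ is the number of observed values.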

  • If the result is a whole number, use that position; if a fractional half, average the two corresponding values; otherwise, round to the nearest integer.
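
The positioning rule above can be sketched as a small function. It assumes the quartile positions $k(n + 1)/4$ for $k = 1, 2, 3$ and a non-empty data set; `quartile` is a hypothetical helper and the data are illustrative. Other software may interpolate between values instead of rounding, so results can differ slightly.

```python
def quartile(values, k):
    """Return the k-th quartile (k = 1, 2, 3) using the (n + 1) position rule."""
    ranked = sorted(values)
    n = len(ranked)
    position = k * (n + 1) / 4          # 1-based position in the ranked data
    whole = int(position)
    fraction = position - whole         # always 0, 0.25, 0.5, or 0.75
    if fraction == 0:
        return ranked[whole - 1]        # whole number: use that position
    if fraction == 0.5:
        # fractional half: average the two neighboring values
        return (ranked[whole - 1] + ranked[whole]) / 2
    return ranked[round(position) - 1]  # otherwise: round to nearest position

# Illustrative samples (assumed values)
nine_values = [11, 12, 13, 16, 16, 17, 18, 21, 22]
ten_values = [29, 31, 35, 39, 39, 40, 43, 44, 44, 52]

q_nine = [quartile(nine_values, k) for k in (1, 2, 3)]
q_ten = [quartile(ten_values, k) for k in (1, 2, 3)]

# Five-number summary for the ten-value sample
five_number = [min(ten_values), *q_ten, max(ten_values)]
print(q_nine, q_ten, five_number)
```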

Interquartile Range (IQR)

  • The IQR measures the spread of the middle 50% of the data: $IQR = Q_3 - Q_1$

  • Not influenced by outliers; considered a resistant measure of variability.
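
The resistance of the IQR can be illustrated by replacing the largest value of a sample with an extreme one and comparing the IQR with the range. This sketch uses `statistics.quantiles` with `method="exclusive"`, which interpolates between ranked values at $(n + 1)$-based positions; that is an assumption of this example, and its results can differ from hand rules that round positions. The data are hypothetical.

```python
from statistics import quantiles

def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = quantiles(values, n=4, method="exclusive")
    return q3 - q1

def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)

# Illustrative sample and a copy whose largest value is an extreme outlier
clean = [11, 12, 13, 16, 16, 17, 18, 21, 22]
with_outlier = [11, 12, 13, 16, 16, 17, 18, 21, 220]

iqr_clean, iqr_outlier = iqr(clean), iqr(with_outlier)
range_clean, range_outlier = value_range(clean), value_range(with_outlier)

print("IQR:  ", iqr_clean, iqr_outlier)
print("range:", range_clean, range_outlier)
```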

Boxplots

  • A boxplot is a graphical display based on the five-number summary.

  • Shows the center, spread, and shape of the data distribution.

  • Boxplots can reveal symmetry, skewness, and outliers.

Descriptive Measures for a Population

  • Population parameters are denoted with Greek letters.

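  • Population mean: $\mu = \frac{\sum_{i=1}^{N} X_i}{N}$, where $N$ is the population size

  • Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$

  • Population standard deviation: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$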
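
The practical difference from the sample formulas is the divisor: $N$ for a population, $n - 1$ for a sample. A minimal sketch using Python's standard `statistics` module, with hypothetical data treated as a complete population:

```python
from statistics import mean, pstdev, pvariance, variance

# Hypothetical data, treated here as an entire population of N = 8 values
population = [2, 4, 4, 4, 5, 5, 7, 9]

mu = mean(population)               # population mean
sigma_sq = pvariance(population)    # divides by N
sigma = pstdev(population)          # square root of the population variance

# For contrast: the sample formula on the same values divides by N - 1
sample_style_variance = variance(population)

print(mu, sigma_sq, sigma, sample_style_variance)
```

The sample-style variance comes out larger because the same sum of squared deviations is divided by 7 instead of 8.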

The Empirical Rule

  • For symmetric, mound-shaped distributions:

    • ~68% of data within 1 standard deviation of the mean

    • ~95% within 2 standard deviations

    • ~99.7% within 3 standard deviations
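
These percentages come from the normal distribution, the idealized bell shape. As a quick check under that assumption, the sketch below computes the share of a standard normal distribution lying within 1, 2, and 3 standard deviations of the mean, using `NormalDist` from Python's standard `statistics` module.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1 (assumed model)
std_normal = NormalDist(mu=0, sigma=1)

# Share of the distribution within k standard deviations of the mean
coverage = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in coverage.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```

Real data that are only roughly mound-shaped will match these figures only approximately.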

Relationships Between Two Numerical Variables

Covariance

  • Measures how two numerical variables $X$ and $Y$ vary together, indicating the direction of their linear relationship.

  • Sample covariance: $\text{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$

  • Positive covariance: variables move in the same direction; negative: move in opposite directions.

  • Magnitude does not indicate the relative strength of the relationship, because the covariance depends on the units in which the variables are measured.
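
A short sketch makes the unit dependence concrete: rescaling one variable rescales the covariance, even though the relationship itself is unchanged. The function `sample_covariance` is a hypothetical helper that divides by $n - 1$, and the data are made up for illustration.

```python
def sample_covariance(xs, ys):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cross_products = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross_products / (n - 1)

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]              # tends to rise with x
y_rescaled = [v * 1000 for v in y]   # same values in a smaller unit
y_opposite = [5, 3, 4, 1, 2]     # tends to fall as x rises

cov_xy = sample_covariance(x, y)
cov_rescaled = sample_covariance(x, y_rescaled)
cov_opposite = sample_covariance(x, y_opposite)

print(cov_xy, cov_rescaled, cov_opposite)
```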

Coefficient of Correlation

  • Measures the relative strength and direction of a linear relationship between two variables.

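  • Sample correlation coefficient: $r = \frac{\text{cov}(X, Y)}{S_X S_Y}$, where $S_X$ and $S_Y$ are the sample standard deviations of $X$ and $Y$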

  • Ranges from -1 (perfect negative) to +1 (perfect positive); 0 indicates no linear relationship.
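
Dividing the covariance by both standard deviations removes the units, which is why the correlation coefficient can be compared across data sets. The sketch below computes $r$ from hypothetical data and shows that rescaling one variable leaves it unchanged; `sample_correlation` is an illustrative helper, not a library function.

```python
from statistics import mean, stdev

def sample_correlation(xs, ys):
    """Sample correlation: covariance divided by both standard deviations."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    return covariance / (stdev(xs) * stdev(ys))

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
y_rescaled = [v * 1000 for v in y]   # same values in a smaller unit
y_perfect = [2 * v + 1 for v in x]   # exact increasing linear function of x

r_xy = sample_correlation(x, y)
r_rescaled = sample_correlation(x, y_rescaled)
r_perfect = sample_correlation(x, y_perfect)

print(r_xy, r_rescaled, r_perfect)
```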

Ethical Considerations in Data Analysis

  • Summary measures should be reported objectively and fairly.

  • Both positive and negative results must be documented.

  • Inappropriate use of summary measures can distort facts.
