Chapter 3: Calculating Descriptive Statistics – Business Statistics Study Notes

Chapter 3: Calculating Descriptive Statistics

3.1 Measures of Central Tendency

Measures of central tendency are statistical values that describe the center point or typical value of a dataset. The main measures include the mean, median, mode, and weighted mean.

Mean

  • Definition: The mean (or average) is the sum of all values divided by the number of observations.

  • Sample Mean Formula:

Notation: $\bar{x}$ is the sample mean, $\sum_{i=1}^{n} x_i$ is the sum of the sample values, and $n$ is the sample size.

The formula for the sample mean is:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

  • Population Mean Formula:

$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$

where $\mu$ is the population mean and $N$ is the population size.

  • Example: For the sample values 87.2, 118.9, 76.2, 107.7, 61.5, the sample mean is:

$$\bar{x} = \frac{87.2 + 118.9 + 76.2 + 107.7 + 61.5}{5} = \frac{451.5}{5} = 90.3$$
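The calculation for the five sample values above can be checked in a few lines of Python. This is a minimal sketch; the function name `sample_mean` is an illustrative choice and does not come from the notes.

```python
def sample_mean(values):
    """Return the sum of the values divided by the number of observations."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


# Sample values from the example above
sample = [87.2, 118.9, 76.2, 107.7, 61.5]
x_bar = sample_mean(sample)
print(round(x_bar, 1))  # 90.3
```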

Weighted Mean

  • Definition: The weighted mean assigns different weights to values, reflecting their relative importance.

  • Formula:

$$\bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$

where $w_i$ is the weight assigned to value $x_i$.

  • Example: Suppose your statistics grade is based on an exam, project, and homework with different weights.

[Tables: exam, project, and homework scores with their weights; calculation of the sum of the weights and the sum of the weighted scores]

The weighted mean is the sum of the weighted scores divided by the sum of the weights.
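A short Python sketch of a weighted grade may help here. The scores and weights below are hypothetical (they are not the values from the notes' table), and the function name `weighted_mean` is an illustrative choice.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical scores and weights, for illustration only
scores = [80, 90, 100]   # exam, project, homework
weights = [50, 30, 20]   # percent of the final grade
grade = weighted_mean(scores, weights)
print(grade)  # 87.0
```

Because the function divides by the sum of the weights, the weights do not need to add up to 100 or to 1.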

Median

  • Definition: The median is the middle value when data are arranged in ascending order. If the number of values is even, it is the average of the two middle values.

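  • Calculation: Sort the data and use the index point $i = 0.5n$. If $i$ is not a whole number, round up and take the value in that position; if $i$ is a whole number, average the values in positions $i$ and $i + 1$.

  • Example: For n = 9, i = 0.5(9) = 4.5, which rounds up to 5, so the median is the 5th value in the sorted list.

[Figure: median calculation example]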
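The index-point procedure can be sketched in Python as follows. The sketch assumes the index point is $i = 0.5n$ (a common textbook convention) and that a whole-number index means averaging positions $i$ and $i + 1$; the data and the function name are hypothetical.

```python
import math


def median_by_index_point(values):
    """Median of the sorted data using the index point i = 0.5 * n."""
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)
    n = len(ordered)
    i = 0.5 * n
    if i != int(i):
        # i is not a whole number: round up and take that 1-based position
        return ordered[math.ceil(i) - 1]
    # i is a whole number: average the values in 1-based positions i and i + 1
    i = int(i)
    return (ordered[i - 1] + ordered[i]) / 2


# Hypothetical data with n = 9, so the median is the 5th sorted value
data = [12, 7, 3, 15, 9, 21, 5, 18, 10]
print(median_by_index_point(data))  # 10
```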

Mode

  • Definition: The mode is the value that appears most frequently in a dataset. There can be more than one mode or none at all.

  • Example (Numerical Data):

[Table: mode example with dress sizes]

  • Example (Categorical Data):

[Table: mode example with TV brands]
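As a rough illustration, Python's standard-library `statistics.multimode` handles both numerical and categorical data and returns every mode. The dress sizes and TV brands below are hypothetical values, not the data from the notes' tables.

```python
from statistics import multimode

# Hypothetical data, for illustration only
dress_sizes = [8, 10, 12, 10, 14, 8, 10]
tv_brands = ["Sony", "LG", "Samsung", "LG", "Sony"]

size_modes = multimode(dress_sizes)   # one mode
brand_modes = multimode(tv_brands)    # two modes (a tie)
print(size_modes)   # [10]
print(brand_modes)  # ['Sony', 'LG']
```

One caveat: when every value occurs exactly once, `multimode` returns all of the values, whereas the definition above would describe such a dataset as having no mode.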

Shapes of Frequency Distributions

  • Symmetric: Mean = Median

  • Right-Skewed: Mean > Median

  • Left-Skewed: Mean < Median

[Figures: symmetric distribution; right-skewed distribution; left-skewed distribution]

Using Excel for Central Tendency

  • Excel functions: AVERAGE, MEDIAN, MODE.SNGL

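  • MODE.SNGL returns a single value, so it does not report every mode when the data have more than one (MODE.MULT can return several), and it works only on numeric data, so it cannot find the mode of categorical data.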

[Screenshots: Excel calculation of the mean, median, and mode; Excel Data Analysis tool; Excel Descriptive Statistics output]

Choosing the Appropriate Measure

  • Mean: Use when data are symmetric and without outliers.

  • Median: Use when data are skewed or contain outliers.

  • Mode: Use for categorical data.

[Table: advantages and disadvantages of the mean, median, and mode]

3.2 Measures of Variability

Measures of variability describe the spread or dispersion of data values. Common measures include range, variance, and standard deviation.

Range

  • Definition: The range is the difference between the highest and lowest values in a dataset.

  • Formula: Range = Highest value – Lowest value

  • Advantage: Simple to calculate.

  • Disadvantage: Only considers two values and is sensitive to outliers.

Variance and Standard Deviation

  • Variance: Measures the average squared deviation from the mean.

  • Sample Variance Formula: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

  • Standard Deviation: The square root of the variance, with the same units as the original data.

  • Sample Standard Deviation Formula: $s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

[Table: sample variance and sample standard deviation calculation]
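The sample variance and standard deviation can be computed step by step in Python. This sketch assumes the usual $n - 1$ divisor for a sample and reuses the five sample values from the mean example in Section 3.1, which may differ from the data in the notes' own calculation table; the function names are illustrative.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance requires at least two values")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_std_dev(values):
    """Square root of the sample variance, in the units of the data."""
    return math.sqrt(sample_variance(values))


sample = [87.2, 118.9, 76.2, 107.7, 61.5]
s2 = sample_variance(sample)
s = sample_std_dev(sample)
print(round(s2, 3), round(s, 2))  # 539.645 23.23
```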

Population Variance and Standard Deviation

  • Population Variance Formula: $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

  • Population Standard Deviation Formula: $\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$

[Table: population variance calculation]

Excel for Variability

  • Sample: =VAR.S(), =STDEV.S()

  • Population: =VAR.P(), =STDEV.P()

[Screenshot: Excel Descriptive Statistics output for variability]
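For readers working outside Excel, Python's standard-library `statistics` module offers the same sample/population split. The sketch below pairs each Excel function with its Python counterpart; the data reuse the five sample values from Section 3.1 purely for illustration.

```python
import statistics

data = [87.2, 118.9, 76.2, 107.7, 61.5]

var_s = statistics.variance(data)    # sample variance (n - 1), like =VAR.S()
std_s = statistics.stdev(data)       # sample standard deviation, like =STDEV.S()
var_p = statistics.pvariance(data)   # population variance (N), like =VAR.P()
std_p = statistics.pstdev(data)      # population standard deviation, like =STDEV.P()

print(round(var_s, 3), round(var_p, 3))  # 539.645 431.716
```

For the same data, the sample versions are always at least as large as the population versions, because they divide by $n - 1$ instead of $N$.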

3.3 Using the Mean and Standard Deviation Together

The mean and standard deviation are often used together to describe the center and spread of data, especially in quality control and business applications.

Coefficient of Variation (CV)

  • Definition: The CV expresses the standard deviation as a percentage of the mean, allowing comparison of variability between datasets with different units or means.

  • Formula (Sample): $CV = \frac{s}{\bar{x}} (100)$

  • Formula (Population): $CV = \frac{\sigma}{\mu} (100)$

  • Example: Comparing Microsoft and Amazon stock price variability.

[Tables: stock prices for the CV example; CV calculation for Microsoft and Amazon]
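A small Python sketch shows why the CV is useful when two price series sit at very different levels. The prices below are hypothetical and are not the Microsoft and Amazon figures from the notes; the function name is illustrative, and the sample formula (standard deviation divided by the mean, times 100) is assumed.

```python
import statistics


def coefficient_of_variation(values):
    """Sample CV: (s / x_bar) * 100, expressed as a percentage."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean * 100


# Hypothetical closing prices for two stocks at different price levels
stock_a = [100, 102, 98, 101, 99]
stock_b = [1000, 1100, 900, 1050, 950]

cv_a = coefficient_of_variation(stock_a)
cv_b = coefficient_of_variation(stock_b)
print(f"Stock A: {cv_a:.2f}%  Stock B: {cv_b:.2f}%")
```

With these made-up numbers, stock B has both a higher price level and a higher CV, so it is more variable relative to its own mean.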

z-Score

  • Definition: The z-score indicates how many standard deviations a value is from the mean.

  • Formula (Sample): $z = \frac{x - \bar{x}}{s}$

  • Formula (Population): $z = \frac{x - \mu}{\sigma}$

  • Interpretation: z = 0 (at mean), z > 0 (above mean), z < 0 (below mean). Outliers typically have |z| > 3.

  • Example: Calculating z-score for a hamburger calorie value.

[Table: hamburger calorie data and z-score calculation]
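The z-score calculation can be sketched in Python as below. The calorie counts are hypothetical (not the hamburger data from the notes), the function name is illustrative, and the sample formula $z = (x - \bar{x}) / s$ is assumed.

```python
import statistics


def z_score(x, values):
    """Number of sample standard deviations that x lies from the sample mean."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    if s == 0:
        raise ValueError("z-score is undefined when the standard deviation is zero")
    return (x - mean) / s


# Hypothetical calorie counts, for illustration only
calories = [500, 550, 600, 650, 700]
z = z_score(700, calories)
print(round(z, 2))  # 1.26
```

Here the value 700 lies about 1.26 standard deviations above the mean of 600, so under the |z| > 3 guideline it would not be flagged as an outlier.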

The Empirical Rule

  • For bell-shaped (normal) distributions:

  • ~68% of values fall within ±1 standard deviation of the mean

  • ~95% fall within ±2 standard deviations of the mean

  • ~99.7% fall within ±3 standard deviations of the mean

[Figures: empirical rule intervals covering about 68%, 95%, and 99.7% of values]
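To make the rule concrete, the following Python sketch turns a mean and standard deviation into the three intervals. The mean of 100 and standard deviation of 15 are hypothetical, and the function name is illustrative; the percentages apply only if the data are roughly bell-shaped.

```python
def empirical_rule_intervals(mean, std_dev):
    """Return {k: (low, high)} for k = 1, 2, 3 standard deviations around the mean."""
    return {k: (mean - k * std_dev, mean + k * std_dev) for k in (1, 2, 3)}


# Approximate share of values expected in each interval for bell-shaped data
COVERAGE = {1: "about 68%", 2: "about 95%", 3: "about 99.7%"}

# Hypothetical mean and standard deviation
intervals = empirical_rule_intervals(100, 15)
for k, (low, high) in intervals.items():
    print(f"{COVERAGE[k]} of values fall between {low} and {high}")
```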

3.4 Working with Grouped Data

When data are grouped into frequency distributions, the mean and variance can be estimated using class midpoints and frequencies.

Mean of Grouped Data

  • Formula (Sample): $\bar{x} \approx \frac{\sum_{i=1}^{k} f_i m_i}{n}$

  • Where: $f_i$ = frequency of class i, $m_i$ = midpoint of class i, $n = \sum_{i=1}^{k} f_i$ = total observations, $k$ = number of classes

  • Example: Calculating mean age from grouped survey data.

[Tables: grouped data frequencies; class midpoints; grouped mean calculation]
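A grouped mean can be estimated in Python along these lines. The age classes and frequencies are hypothetical (not the survey data from the notes), the function name is illustrative, and each class midpoint is assumed to be the average of its lower and upper limits.

```python
def grouped_mean(classes):
    """Estimate the mean from (lower, upper, frequency) tuples using class midpoints."""
    n = sum(freq for _, _, freq in classes)
    if n == 0:
        raise ValueError("total frequency must be greater than zero")
    weighted_sum = sum(freq * (lower + upper) / 2 for lower, upper, freq in classes)
    return weighted_sum / n


# Hypothetical age classes: (lower limit, upper limit, frequency)
age_classes = [(20, 30, 5), (30, 40, 10), (40, 50, 5)]
mean_age = grouped_mean(age_classes)
print(mean_age)  # 35.0
```

The result is an estimate because every observation in a class is treated as if it sat at the class midpoint.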

3.5 Measures of Relative Position

These measures compare the position of a value relative to the rest of the data, including percentiles, quartiles, and the interquartile range (IQR).

Percentiles

  • Definition: The pth percentile is the value below which p% of the data fall.

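  • Calculation: Sort the data and compute the index point $i = \frac{p}{100} n$. If $i$ is not a whole number, round up and take the value in that position; if $i$ is a whole number, average the values in positions $i$ and $i + 1$.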
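The percentile calculation can be sketched in Python as follows. The sketch assumes the index point is $i = (p/100)n$, rounded up when it is not a whole number and averaged with the next position when it is; other textbooks and software (including Excel) use different interpolation rules, so results can differ. The data and function name are hypothetical.

```python
def percentile(values, p):
    """p-th percentile (whole-number p, 0 < p < 100) by the index-point rule."""
    if not values:
        raise ValueError("percentile requires at least one value")
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)
    n = len(ordered)
    # i = p * n / 100, split into whole part and remainder to avoid float error
    whole, remainder = divmod(p * n, 100)
    if remainder != 0:
        # i is not a whole number: round up to the next 1-based position
        return ordered[whole]
    # i is a whole number: average the values in 1-based positions i and i + 1
    return (ordered[whole - 1] + ordered[whole]) / 2


# Hypothetical data with n = 10
data = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
print(percentile(data, 25), percentile(data, 50), percentile(data, 75))  # 6 11.0 16
```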

Quartiles

  • Q1: 25th percentile

  • Q2: 50th percentile (median)

  • Q3: 75th percentile

Interquartile Range (IQR)

  • Definition: IQR = Q3 – Q1; describes the spread of the middle 50% of data.

Box-and-Whisker Plots

  • Graphical summary showing quartiles, minimum, maximum, and outliers.

Outliers

  • Values below Q1 – 1.5(IQR) or above Q3 + 1.5(IQR) are considered outliers.
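The outlier limits can be applied in Python with a sketch like the one below. The data are hypothetical, the function names are illustrative, and the quartiles are passed in directly (here Q1 = 6 and Q3 = 16, which is what a round-up index-point rule would give for this data) because different quartile methods can produce slightly different values.

```python
def iqr_fences(q1, q3):
    """Return the lower and upper outlier limits based on 1.5 * IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Return the values that fall below the lower limit or above the upper limit."""
    lower, upper = iqr_fences(q1, q3)
    return [x for x in values if x < lower or x > upper]


# Hypothetical data and its assumed quartiles
data = [2, 4, 6, 8, 10, 12, 14, 16, 18, 60]
q1, q3 = 6, 16

print(iqr_fences(q1, q3))           # (-9.0, 31.0)
print(find_outliers(data, q1, q3))  # [60]
```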

3.6 Measures of Association Between Two Variables

These statistics describe the relationship between two variables, including covariance and correlation.

Sample Covariance

  • Definition: Measures the direction of the linear relationship between two variables.

  • Formula: $s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$

Sample Correlation Coefficient

  • Definition: Measures both the strength and direction of the linear relationship between two variables.

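  • Formula: $r_{xy} = \frac{s_{xy}}{s_x s_y}$, where $s_{xy}$ is the sample covariance and $s_x$ and $s_y$ are the sample standard deviations of x and y.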

  • Range: -1 (perfect negative) to +1 (perfect positive); 0 means no linear relationship.
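Both measures can be computed from scratch in Python. This sketch assumes the usual sample formulas (covariance with an $n - 1$ divisor, and correlation as the covariance divided by the product of the two sample standard deviations); the paired data and the function names are hypothetical.

```python
import math


def sample_covariance(xs, ys):
    """Sum of products of paired deviations from the means, divided by n - 1."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n < 2:
        raise ValueError("covariance requires at least two pairs")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def sample_correlation(xs, ys):
    """Sample covariance divided by the product of the sample standard deviations."""
    s_x = math.sqrt(sample_covariance(xs, xs))  # covariance of x with itself is its variance
    s_y = math.sqrt(sample_covariance(ys, ys))
    if s_x == 0 or s_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sample_covariance(xs, ys) / (s_x * s_y)


# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
cov_xy = sample_covariance(x, y)
r_xy = sample_correlation(x, y)
print(cov_xy, round(r_xy, 3))  # 1.5 0.775
```

The positive covariance gives only the direction of the relationship, while the correlation of about 0.775 also indicates a fairly strong positive linear relationship on the fixed -1 to +1 scale.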

Additional info: These notes cover all major descriptive statistics relevant to business statistics, including formulas, examples, and Excel applications. For further study, refer to the textbook for more detailed examples and practice problems.
