
The Standard Deviation as a Ruler and the Normal Model

Study Guide - Smart Notes


Chapter 5: The Standard Deviation as a Ruler and the Normal Model

Shifting and Scaling Data: Effects on Summary Statistics

Understanding how transformations such as shifting and scaling affect the mean, standard deviation, and other summary statistics is fundamental in statistics. These operations are commonly applied to data for standardization or adjustment purposes.

  • Shifting the Data: Adding a constant c to each observation in the dataset.

    • Measures of center (mean, median) increase by c.

    • Measures of spread (variance, standard deviation, range, interquartile range) remain unchanged.

  • Example: If midterm scores are 40, 50, 60, 70, 90 (mean = 62, median = 60), and each student receives 10 extra points, the new mean is 72 and the new median is 70. The spread (e.g., standard deviation) does not change.

  • Scaling the Data: Multiplying each observation by a positive constant c.

    • Measures of center (mean, median) and measures of spread in the original units (standard deviation, range, interquartile range) are multiplied by c.

    • Variance is multiplied by c^2.

  • Example: If each score is increased by 10% (i.e., multiplied by 1.1), the new mean and median are 68.2 and 66, respectively. The variance becomes 1.1^2 = 1.21 times the original variance, and the other measures of spread (standard deviation, range, interquartile range) are multiplied by 1.1.
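The shifting and scaling rules can be checked numerically on the midterm scores from the examples. The following is a minimal sketch using Python's standard `statistics` module; the variable and function names are illustrative, and the sample (n - 1) versions of the standard deviation and variance are assumed.

```python
from statistics import mean, median, stdev, variance

scores = [40, 50, 60, 70, 90]          # midterm scores from the example

shifted = [y + 10 for y in scores]     # shift: add c = 10 to every score
scaled = [y * 1.1 for y in scores]     # scale: multiply every score by c = 1.1

def summarize(data):
    """Return (mean, median, sample standard deviation, sample variance)."""
    return mean(data), median(data), stdev(data), variance(data)

for label, data in [("original", scores), ("shifted", shifted), ("scaled", scaled)]:
    m, med, sd, var = summarize(data)
    print(f"{label:9s} mean={m:.1f} median={med:.1f} sd={sd:.3f} var={var:.2f}")
```

Comparing the printed rows shows the pattern: shifting moves the mean and median but leaves the standard deviation and variance alone, while scaling multiplies the mean, median, and standard deviation by 1.1 and the variance by 1.21.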

Standardizing Data: The z-score

Standardization allows comparison of observations from different distributions or scales by expressing values in terms of their distance from the mean, measured in standard deviations. This is achieved using the z-score.

  • Definition: The z-score of an observation y with mean \bar{y} and standard deviation s is: z = \frac{y - \bar{y}}{s}

  • A positive z-score indicates the observation is above the mean; a negative z-score indicates it is below the mean.

  • Interpretation:

    • z = k (k positive): Observation is k standard deviations above the mean.

    • z = -k (k positive): Observation is k standard deviations below the mean.

    • z = 0: Observation equals the mean.
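As a rough illustration of standardizing, the sketch below computes z-scores for the midterm scores 40, 50, 60, 70, 90 used in the shifting example. The helper name `z_scores` is hypothetical, and the sample standard deviation is assumed for s.

```python
from statistics import mean, stdev

def z_scores(data):
    """Standardize each observation: z = (y - ybar) / s."""
    ybar = mean(data)
    s = stdev(data)              # sample standard deviation
    return [(y - ybar) / s for y in data]

scores = [40, 50, 60, 70, 90]    # midterm scores from the shifting example
z = z_scores(scores)

for y, zi in zip(scores, z):
    side = "above" if zi > 0 else "below" if zi < 0 else "at"
    print(f"score {y}: z = {zi:+.2f} ({side} the mean)")
```

Standardizing is a shift (subtract the mean) followed by a scaling (divide by the standard deviation), so the resulting z-scores always have mean 0 and standard deviation 1.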

The Normal Model

Many real-world phenomena produce data that are approximately bell-shaped and symmetric, such as birth weights and adult pulse rates. The Normal model is used to describe such distributions.

  • Key Characteristics:

    • Bell-shaped and unimodal

    • Perfectly symmetric about the mean \mu

    • Spread determined by the standard deviation \sigma

    • Denoted as N(\mu, \sigma)

    • Parameters: The mean \mu and standard deviation \sigma are parameters of the model (population values), while sample mean \bar{y} and sample standard deviation s are statistics (sample values).

  • Examples of Questions:

    • What percentage of newborn babies are heavier than 5.0 kg?

    • What proportion of adults have pulse rates between 72 and 90 beats per minute?

Standardizing Values from the Normal Model

For a variable y following the Normal model N(\mu, \sigma), the z-score is calculated as: z = \frac{y - \mu}{\sigma}

  • If the data are truly Normal, the z-scores themselves follow the Standard Normal model N(0, 1).

  • To check if the Normal model is appropriate, examine if the histogram of the data is bell-shaped and symmetric.
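One way to see why standardizing works: the area below a value y under N(\mu, \sigma) matches the area below its z-score under N(0, 1). The sketch below uses `statistics.NormalDist` from the Python standard library; the values of `mu`, `sigma`, and `y` are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

# Hypothetical parameters and value, chosen only for illustration.
mu, sigma = 50.0, 8.0
y = 62.0

z = (y - mu) / sigma                        # standardize the value
p_original = NormalDist(mu, sigma).cdf(y)   # P(Y < y) under N(mu, sigma)
p_standard = NormalDist(0, 1).cdf(z)        # P(Z < z) under N(0, 1)

print(f"z = {z:.2f}")
print(f"P(Y < {y}) = {p_original:.4f}, P(Z < {z:.2f}) = {p_standard:.4f}")
```

The two probabilities agree, which is why a single Standard Normal table or function is enough for any Normal model.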

The 68-95-99.7 Rule (Empirical Rule)

This rule describes the approximate percentage of data within certain intervals around the mean in a Normal distribution:

Interval                 % of data falling within the interval
Within 1\sigma of \mu    About 68%
Within 2\sigma of \mu    About 95%
Within 3\sigma of \mu    About 99.7%

  • This rule helps quickly estimate probabilities and identify outliers in a Normal distribution.
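The percentages in the rule can be recovered from the Standard Normal model. A short sketch, assuming Python's `statistics.NormalDist`; the helper name `area_within` is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist(0, 1)

def area_within(k):
    """Area under N(0, 1) between -k and +k standard deviations."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} sd of the mean: {area_within(k):.4f}")
```

The computed areas are approximately 0.6827, 0.9545, and 0.9973, which the rule rounds to 68%, 95%, and 99.7%.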

Applications and Examples

  • IQ Scores: If IQ scores for ages 20-34 are Normal with a given mean \mu and standard deviation \sigma:

    • What percentage have IQ below 160?

    • What percentage have IQ between 90 and 120?

    • What is the IQ such that only 0.15% score higher?

  • Quality Control Example: A paint machine dispenses dye with a Normal distribution (standard deviation 0.4 mL). If more than 6 mL is unacceptable, what mean setting ensures only 2% of cans are unacceptable?
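The quality control question can be worked backwards from the tail area. This is one possible approach, sketched with `statistics.NormalDist`: find the z-score that leaves 2% in the upper tail, then solve z = (6 - \mu) / \sigma for \mu. The variable names are illustrative.

```python
from statistics import NormalDist

sigma = 0.4          # standard deviation of dispensed dye, in mL (from the example)
limit = 6.0          # amounts above this are unacceptable, in mL
upper_tail = 0.02    # allowed proportion of unacceptable cans

z = NormalDist().inv_cdf(1 - upper_tail)   # z-score with 98% of the area below it
required_mean = limit - z * sigma          # solve z = (limit - mean) / sigma for the mean

print(f"z = {z:.3f}, required mean = {required_mean:.2f} mL")
```

Under these assumptions the machine would need to be set to a mean of roughly 5.18 mL, a little more than two standard deviations below the 6 mL limit.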

Using Statistical Software (R Commander) for Normal Calculations

  • To find areas (probabilities): Use "Normal probabilities" and specify the value, mean, and standard deviation. The software provides lower or upper tail areas (as proportions).

  • To find percentiles (quantiles): Use "Normal quantiles" and specify the area (as a proportion), mean, and standard deviation. Choose lower or upper tail as appropriate.

  • No need to standardize manually when using these software tools.
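For readers without R Commander, the same two kinds of calculation can be sketched in Python with `statistics.NormalDist`. The helper names `normal_probability` and `normal_quantile` and the `lower_tail` flag are hypothetical, chosen to mirror the dialog options described above; they are not part of R Commander or of any library. The model N(100, 15) is an assumed example, not a value from these notes.

```python
from statistics import NormalDist

def normal_probability(value, mean, sd, lower_tail=True):
    """Area under N(mean, sd) below `value` (or above it if lower_tail is False)."""
    p = NormalDist(mean, sd).cdf(value)
    return p if lower_tail else 1 - p

def normal_quantile(area, mean, sd, lower_tail=True):
    """Value with `area` below it (or above it if lower_tail is False)."""
    p = area if lower_tail else 1 - area
    return NormalDist(mean, sd).inv_cdf(p)

# Hypothetical model, for illustration only: N(100, 15).
print(normal_probability(130, 100, 15, lower_tail=False))   # upper-tail area above 130
print(normal_quantile(0.10, 100, 15))                        # 10th percentile
```

As with the software dialogs, the mean and standard deviation are passed directly, so no manual standardizing is needed.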
