
Introduction to Statistics: Key Concepts and Data Types


Introduction to Statistics

Overview

Statistics is the science of collecting, analyzing, interpreting, and presenting data. Understanding the foundational concepts of statistics is essential for making informed decisions based on data.

Statistical and Critical Thinking

Key Concepts

  • Statistics: The science of planning studies and experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, and interpreting those data to draw conclusions.

  • Parameter: A numerical measurement describing some characteristic of a population.

  • Statistic: A numerical measurement describing some characteristic of a sample.

Understanding the difference between a parameter and a statistic is crucial, as it determines whether a value describes an entire population or just a subset (sample).
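The distinction can be illustrated with a minimal sketch. The population below is made-up data (hypothetical ages for a club of 1,000 members), and the sample size of 50 is an arbitrary choice for illustration.

```python
import random
import statistics

# Hypothetical population: ages of all 1,000 members of a club (made-up data).
random.seed(0)
population = [random.randint(18, 80) for _ in range(1000)]

# Parameter: a measurement computed from every member of the population.
population_mean = statistics.mean(population)

# Statistic: the same measurement computed from a sample of 50 members
# drawn without replacement.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two printed values will usually be close but not identical, because the statistic depends on which 50 members happen to be sampled.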

Types of Data

Quantitative vs. Categorical Data

  • Quantitative (Numerical) Data: Consist of numbers representing counts or measurements. Examples: The weights of supermodels, the ages of respondents.

  • Categorical (Qualitative or Attribute) Data: Consist of names or labels (not numbers that represent counts or measurements). Examples: The gender (male/female) of professional athletes, shirt numbers on uniforms (the numbers act as labels for players, not as counts or measurements).
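One way to see the difference is to ask which summaries make sense for each kind of data. The sketch below uses made-up values; the variable names are hypothetical.

```python
import statistics
from collections import Counter

# Quantitative data: measurements, so arithmetic such as a mean is meaningful.
weights_kg = [55.2, 58.0, 61.5, 57.3]
mean_weight = statistics.mean(weights_kg)

# Categorical data: shirt numbers are labels, so an "average shirt number"
# would be meaningless. Counting how often each label occurs is appropriate.
shirt_numbers = [7, 10, 23, 10]
number_counts = Counter(shirt_numbers)
most_common_number = number_counts.most_common(1)[0][0]

print(f"mean weight: {mean_weight:.1f} kg")
print(f"most common shirt number: {most_common_number}")
```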

Working with Quantitative Data

Quantitative data can be further classified as discrete or continuous:

  • Discrete Data: Result when the data values are quantitative and the number of values is finite or countable. Example: The number of tosses of a coin before getting tails.

  • Continuous Data: Result from infinitely many possible quantitative values, where the collection of values is not countable. Example: Lengths between 0 cm and 12 cm, where any value in that range is possible.
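The two examples above can be simulated in a short sketch. It assumes a fair coin and treats every length between 0 cm and 12 cm as equally likely; both are illustrative assumptions, not part of the definitions.

```python
import random

random.seed(1)

def tosses_before_tails():
    """Count the heads tossed before the first tails appears (fair coin assumed)."""
    count = 0
    while random.random() < 0.5:  # treat a draw below 0.5 as heads
        count += 1
    return count

# Discrete: every value is a whole number 0, 1, 2, ... (countable).
discrete_values = [tosses_before_tails() for _ in range(5)]

# Continuous: any real value in the range 0 cm to 12 cm is possible.
continuous_values = [random.uniform(0, 12) for _ in range(5)]

print(discrete_values)
print(continuous_values)
```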

Levels of Measurement

Classification of Data

Data can be classified into four levels of measurement, each with different properties and implications for statistical analysis:

  • Nominal Level: Data consist of names, labels, or categories only. The data cannot be arranged in any order. Example: Survey responses of yes, no, and undecided.

  • Ordinal Level: Data can be arranged in some order, but differences between data values are either not meaningful or cannot be determined. Example: Course grades A, B, C, D, or F.

  • Interval Level: Data can be arranged in order, and differences between data values are meaningful. However, there is no natural zero starting point at which none of the quantity is present. Example: Years 1000, 2000, 1776, and 1492 (the year 0 is arbitrary and does not represent "no time").

  • Ratio Level: Data can be arranged in order, differences and ratios are meaningful, and there is a natural zero starting point. Example: Class times of 50 minutes and 100 minutes (0 minutes represents no class time, and 100 minutes is twice as long as 50 minutes).

Level of Measurement   Characteristics                         Example
--------------------   -------------------------------------   -----------------------------
Nominal                Categories only                         Gender, colors
Ordinal                Categories with some order              Class rankings, letter grades
Interval               Differences but no natural zero point   Temperature in Celsius, years
Ratio                  Differences and a natural zero point    Height, weight, time
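A quick arithmetic check shows why ratios are meaningful at the ratio level but not at the interval level. This is a minimal sketch; the temperatures of 10 and 20 degrees Celsius are arbitrary illustrative values.

```python
# Interval level: Celsius has an arbitrary zero, so only differences are meaningful.
c_low, c_high = 10.0, 20.0
celsius_diff = c_high - c_low    # 10 degrees warmer: meaningful
celsius_ratio = c_high / c_low   # 2.0, but 20 degrees C is not "twice as hot"

# The same two temperatures in kelvins, where zero is absolute zero:
k_low, k_high = c_low + 273.15, c_high + 273.15
kelvin_diff = k_high - k_low     # still 10 degrees, up to floating-point rounding
kelvin_ratio = k_high / k_low    # roughly 1.035, not 2

# Ratio level: class time has a natural zero (no class time at all),
# so the ratio stays the same whichever unit is used.
short_class_min, long_class_min = 50, 100
ratio_in_minutes = long_class_min / short_class_min
ratio_in_hours = (long_class_min / 60) / (short_class_min / 60)

print(celsius_ratio, round(kelvin_ratio, 3))
print(ratio_in_minutes, round(ratio_in_hours, 3))
```

The Celsius ratio changes when the zero point moves, which is why it carries no meaning. The class-time ratio is unaffected by a change of unit.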

Big Data and Data Science

Big Data

  • Big Data: Refers to data sets so large and complex that their analysis is beyond the capabilities of traditional software tools. Analysis may require parallel processing on many computers.

  • Data Science: Involves applications of statistics, computer science, and software engineering, along with other relevant fields such as sociology or finance.

Missing Data

Types of Missing Data

  • Missing Completely at Random (MCAR): The likelihood of a data value being missing is independent of its value or any other values in the data set.

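  • Missing Not at Random (MNAR): The likelihood of a data value being missing is related to the value itself, so the reason it is missing is tied to what the value would have been. Example: Survey respondents with very low incomes may be less likely to report their incomes.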

Correcting for Missing Data

  1. Delete Cases: Remove all subjects with any missing values from the analysis.

  2. Impute Missing Values: Substitute missing data values with estimated or predicted values.
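Both approaches can be sketched in a few lines. The records below are made-up data, `None` marks a missing value, and the helper `impute_mean` is a hypothetical name. Filling in the mean of the observed values is only one simple imputation choice, assumed here for illustration.

```python
import statistics

# Hypothetical survey records; None marks a missing value (made-up data).
records = [
    {"age": 34, "income": 52000},
    {"age": 29, "income": None},
    {"age": None, "income": 61000},
    {"age": 45, "income": 75000},
]

# 1. Delete cases: keep only subjects with no missing values.
complete_cases = [r for r in records if all(v is not None for v in r.values())]

# 2. Impute missing values: here, replace each missing value with the
#    mean of the observed values for that variable (an illustrative choice).
def impute_mean(rows, key):
    observed = [r[key] for r in rows if r[key] is not None]
    fill = statistics.mean(observed)
    # Build new dicts so the original records are left unchanged.
    return [{**r, key: fill if r[key] is None else r[key]} for r in rows]

imputed = impute_mean(impute_mean(records, "age"), "income")

print(len(complete_cases), "complete cases out of", len(records))
print(imputed)
```

Deleting cases shrinks the data set, while imputation keeps every subject but introduces estimated values. Either approach can distort results when the data are missing not at random.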

Summary Table: Levels of Measurement

Level      Description
--------   -------------------------------------
Nominal    Categories only
Ordinal    Categories with some order
Interval   Differences but no natural zero point
Ratio      Differences and a natural zero point
