Introduction to Statistics: Key Concepts and Data Types
Study Guide - Smart Notes
Introduction to Statistics
Overview
Statistics is the science of collecting, analyzing, interpreting, and presenting data. Understanding the foundational concepts of statistics is essential for making informed decisions based on data.
Statistical and Critical Thinking
Key Concepts
Statistics: The science of planning studies and experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, and interpreting those data to draw conclusions.
Parameter: A numerical measurement describing some characteristic of a population.
Statistic: A numerical measurement describing some characteristic of a sample.
Understanding the difference between a parameter and a statistic is crucial, as it determines whether a value describes an entire population or just a subset (sample).
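The distinction can be illustrated with a minimal sketch. The population below is made-up data (ages of a hypothetical 1,000-member club), and the sample size of 50 is an arbitrary choice for illustration; the mean of the whole list is a parameter, while the mean of the random subset is a statistic.

```python
import random
import statistics

# Hypothetical population: ages of all 1,000 members of a club (made-up data).
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.randint(18, 80) for _ in range(1000)]

# Parameter: computed from every member of the population.
population_mean = statistics.mean(population)

# Statistic: computed from a sample (here, 50 members chosen at random).
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"Parameter (population mean): {population_mean:.2f}")
print(f"Statistic (sample mean):     {sample_mean:.2f}")
```

The two values are usually close but not identical, which is why it matters to know whether a reported number describes the whole population or only a sample.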
Types of Data
Quantitative vs. Categorical Data
Quantitative (Numerical) Data: Consist of numbers representing counts or measurements. Examples: The weights of supermodels, the ages of respondents.
Categorical (Qualitative or Attribute) Data: Consist of names or labels (not numbers that represent counts or measurements). Examples: The gender (male/female) of professional athletes, shirt numbers on uniforms (the numbers act as labels that identify players, not as counts or measurements).
Working with Quantitative Data
Quantitative data can be further classified as discrete or continuous:
Discrete Data: Result when the data values are quantitative and the number of values is finite or countable. Example: The number of tosses of a coin before getting tails.
Continuous Data: Result from infinitely many possible quantitative values, where the collection of values is not countable. Example: Lengths anywhere from 0 cm to 12 cm.
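The data types above can be sketched in a few lines of Python. All values here are made up for illustration, and the variable names are hypothetical. The point is that labels are summarized by counting, while arithmetic summaries such as the mean only make sense for quantitative data.

```python
from collections import Counter
import statistics

# Made-up illustrative data.
shirt_numbers = [7, 10, 23, 10, 99]        # categorical: numbers used as labels
coin_tosses_until_tails = [1, 3, 2, 1, 5]  # quantitative, discrete (counts)
lengths_cm = [3.2, 11.75, 0.4, 7.08]       # quantitative, continuous (measurements)

# Categorical data: count how often each label occurs.
label_counts = Counter(shirt_numbers)

# Quantitative data: arithmetic summaries are meaningful.
mean_tosses = statistics.mean(coin_tosses_until_tails)
mean_length = statistics.mean(lengths_cm)

print("Shirt number frequencies:", dict(label_counts))
print("Mean tosses until tails:", mean_tosses)
print("Mean length (cm):", round(mean_length, 4))
```

Python would happily compute the mean of `shirt_numbers` too, but the result would describe nothing about the players, which is the practical reason for separating the two kinds of data.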
Levels of Measurement
Classification of Data
Data can be classified into four levels of measurement, each with different properties and implications for statistical analysis:
Nominal Level: Data consist of names, labels, or categories only. The data cannot be arranged in any order. Example: Survey responses of yes, no, and undecided.
Ordinal Level: Data can be arranged in some order, but differences between data values are either not meaningful or cannot be determined. Example: Course grades A, B, C, D, or F.
Interval Level: Data can be arranged in order, and differences between data values are meaningful. However, there is no natural zero starting point. Example: Years 1000, 2000, 1776, and 1492.
Ratio Level: Data can be arranged in order, differences and ratios are meaningful, and there is a natural zero starting point. Example: Class times of 50 minutes and 100 minutes.
| Level of Measurement | Characteristics | Example |
|---|---|---|
| Nominal | Categories only | Gender, colors |
| Ordinal | Categories with some order | Class rankings, letter grades |
| Interval | Meaningful differences but no natural zero point | Temperature in Celsius, years |
| Ratio | Meaningful differences and a natural zero point | Height, weight, elapsed time |
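The practical consequence of each level can be sketched as follows. The grade ranking dictionary and the temperature values are assumptions chosen for illustration. The sketch shows that ordinal data support sorting, interval data support differences but not ratios, and ratio data support both.

```python
def celsius_to_kelvin(c):
    """Convert Celsius to Kelvin, a scale with a natural zero (absolute zero)."""
    return c + 273.15

# Ordinal level: grades can be sorted, but "A minus B" has no meaning.
grade_order = {"F": 0, "D": 1, "C": 2, "B": 3, "A": 4}  # assumed rank encoding
grades = ["B", "A", "F", "C"]
sorted_grades = sorted(grades, key=grade_order.get)  # lowest to highest

# Interval level: differences in Celsius are meaningful...
t1_c, t2_c = 10.0, 20.0
diff_c = t2_c - t1_c
diff_k = celsius_to_kelvin(t2_c) - celsius_to_kelvin(t1_c)  # same size of difference

# ...but the ratio is not: 20 C is not "twice as hot" as 10 C.
naive_ratio = t2_c / t1_c
kelvin_ratio = celsius_to_kelvin(t2_c) / celsius_to_kelvin(t1_c)

# Ratio level: a 100-minute class really is twice as long as a 50-minute class.
class_ratio = 100 / 50

print("Sorted grades:", sorted_grades)
print("Difference in C vs K:", diff_c, round(diff_k, 6))
print("Naive Celsius ratio:", naive_ratio, "| Kelvin ratio:", round(kelvin_ratio, 4))
print("Class time ratio:", class_ratio)
```

Changing the zero point (Celsius to Kelvin) leaves the difference unchanged but alters the ratio, which is why ratios are only trusted on scales with a natural zero.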
Big Data and Data Science
Big Data
Big Data: Refers to data sets so large and complex that their analysis is beyond the capabilities of traditional software tools. Analysis may require parallel processing on many computers.
Data Science: Involves applications of statistics, computer science, and software engineering, along with other relevant fields such as sociology or finance.
Missing Data
Types of Missing Data
Missing Completely at Random (MCAR): The likelihood of a data value being missing is independent of its value or any other values in the data set.
Missing Not at Random (MNAR): The likelihood of a data value being missing is related to the missing value itself, so the reason it is missing depends on what the value would have been.
Correcting for Missing Data
Delete Cases: Remove all subjects with any missing values from the analysis.
Impute Missing Values: Substitute missing data values with estimated or predicted values.
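Both corrections can be sketched on a small made-up data set. The records, the use of `None` as the missing-value marker, and the helper name `impute_column_means` are all assumptions for illustration. Mean imputation is only one simple way to produce an estimated value, not the only method.

```python
import statistics

# Made-up records: (age, income in thousands); None marks a missing value.
records = [(25, 40.0), (31, None), (None, 52.0), (47, 61.0), (38, 55.0)]

# Option 1: delete cases. Keep only subjects with no missing values.
complete_cases = [row for row in records if None not in row]

# Option 2: impute. Replace each missing value with the mean of the
# observed values in the same column (a simple, assumed strategy).
def impute_column_means(rows):
    columns = list(zip(*rows))
    means = [statistics.mean(v for v in col if v is not None) for col in columns]
    return [tuple(m if v is None else v for v, m in zip(row, means)) for row in rows]

imputed = impute_column_means(records)

print("Complete cases:", complete_cases)
print("Imputed records:", imputed)
```

Deleting cases shrinks the data set from five subjects to three, while imputation keeps all five at the cost of introducing estimated values.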
Summary Table: Levels of Measurement
| Level | Description |
|---|---|
| Nominal | Categories only |
| Ordinal | Categories with some order |
| Interval | Differences but no natural zero point |
| Ratio | Differences and a natural zero point |