Organizing Data: Types of Variables and Frequency Distributions

Chapter 2: Organizing Data

Introduction

This chapter introduces fundamental concepts in statistics related to organizing and classifying data. It covers the definition and types of variables, distinctions between qualitative and quantitative data, and methods for summarizing categorical data using frequency tables and charts.

Variables and Data

Definition of Variables

  • Variable: A characteristic or property that can take on different values for different individuals or objects. Examples include height, weight, number of siblings, sex, marital status, and zipcode.

  • Data: The observed values of variables collected from individuals or objects.

Example Table: Types of Variables

| height | weight | siblings | sex | marriage | zipcode |
| --- | --- | --- | --- | --- | --- |
| 5 | 110 | 0 | female | married | 97219 |
| 6.3 | 180 | 4 | male | married | 97405 |
| 5.7 | 145 | 2 | male | single | 97219 |

Types of Variables

Qualitative vs. Quantitative Variables

  • Qualitative (Categorical) Variables: Variables whose values are categories or labels rather than quantities. Examples: sex, marital status, zipcode. A zipcode is written with digits, but the digits act as a label, so arithmetic on them is meaningless.

  • Quantitative Variables: Variables that are numerically valued and represent counts or measurements. Examples: height, weight, number of siblings.

Subtypes of Quantitative Variables

  • Discrete Variables: Quantitative variables that take on a countable number of distinct values, often representing counts. Example: number of siblings.

  • Continuous Variables: Quantitative variables that can take on any value within a given range, typically representing measurements. Examples: height, weight.

Classification Diagram

  • Variable

    • Qualitative

    • Quantitative

      • Discrete

      • Continuous

Examples: Identifying Variable Types

  • Number of people in your household: Quantitative, Discrete

  • Height of waterfalls: Quantitative, Continuous

  • Finishing time of marathon runners: Quantitative, Continuous

  • Order of finish in a running competition: Qualitative (ordinal)

  • Global Industry Classification Standard (GICS) code: Qualitative (categorical)
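The classifications above can be recorded in a small lookup, which makes it easy to check whether a numerical summary such as a mean is meaningful for a given variable. This is only an illustrative sketch: the `VariableType` enum, the `VARIABLE_TYPES` dictionary, and the `is_quantitative` helper are hypothetical names introduced here, not part of the original notes.

```python
from enum import Enum


class VariableType(Enum):
    """Hypothetical labels for the three variable types in these notes."""
    QUALITATIVE = "qualitative"
    DISCRETE = "quantitative, discrete"
    CONTINUOUS = "quantitative, continuous"


# Classifications copied from the list of examples above.
VARIABLE_TYPES = {
    "number of people in household": VariableType.DISCRETE,
    "height of waterfall": VariableType.CONTINUOUS,
    "marathon finishing time": VariableType.CONTINUOUS,
    "order of finish": VariableType.QUALITATIVE,
    "GICS code": VariableType.QUALITATIVE,
}


def is_quantitative(name: str) -> bool:
    """Return True when the named variable is discrete or continuous."""
    return VARIABLE_TYPES[name] is not VariableType.QUALITATIVE


for name, kind in VARIABLE_TYPES.items():
    print(f"{name}: {kind.value}")
```

A GICS code is stored as digits, yet `is_quantitative("GICS code")` returns `False`, which mirrors the point that a numeric-looking label is still a category.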

Organizing Qualitative Data

Frequency Distribution Table

A frequency distribution table summarizes categorical data by listing each category and the number of observations in each.

Example: Stocks Classified by GICS Sector

| Stocks | GICS_Sector |
| --- | --- |
| Apple | Information Technology |
| Microsoft | Information Technology |
| Johnson & Johnson | Healthcare |
| Pfizer | Healthcare |
| Exxon Mobil | Energy |
| Chevron | Energy |
| Walmart | Consumer Staples |
| Procter & Gamble | Consumer Staples |
| Coca-Cola | Consumer Staples |
| Amazon | Consumer Discretionary |
| Tesla | Consumer Discretionary |
| Home Depot | Consumer Discretionary |
| JP Morgan Chase | Financials |
| Goldman Sachs | Financials |
| Visa | Financials |
| Berkshire Hathaway | Financials |
| AT&T | Communication Services |
| Verizon | Communication Services |
| Duke Energy | Utilities |
| American Electric Power | Utilities |

Grouping Data by Category

| GICS_Sector | Stocks |
| --- | --- |
| Communication Services | AT&T, Verizon |
| Consumer Discretionary | Amazon, Tesla, Home Depot |
| Consumer Staples | Walmart, Procter & Gamble, Coca-Cola |
| Energy | Exxon Mobil, Chevron |
| Financials | JP Morgan Chase, Goldman Sachs, Visa, Berkshire Hathaway |
| Healthcare | Johnson & Johnson, Pfizer |
| Information Technology | Apple, Microsoft |
| Utilities | Duke Energy, American Electric Power |
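The grouping step can be sketched in Python by collecting stock names under their sector. The `holdings` list reproduces the 20 stock and sector pairs from the example; the variable and function names are assumptions chosen for illustration.

```python
from collections import defaultdict

# (stock, sector) pairs from the example table.
holdings = [
    ("Apple", "Information Technology"),
    ("Microsoft", "Information Technology"),
    ("Johnson & Johnson", "Healthcare"),
    ("Pfizer", "Healthcare"),
    ("Exxon Mobil", "Energy"),
    ("Chevron", "Energy"),
    ("Walmart", "Consumer Staples"),
    ("Procter & Gamble", "Consumer Staples"),
    ("Coca-Cola", "Consumer Staples"),
    ("Amazon", "Consumer Discretionary"),
    ("Tesla", "Consumer Discretionary"),
    ("Home Depot", "Consumer Discretionary"),
    ("JP Morgan Chase", "Financials"),
    ("Goldman Sachs", "Financials"),
    ("Visa", "Financials"),
    ("Berkshire Hathaway", "Financials"),
    ("AT&T", "Communication Services"),
    ("Verizon", "Communication Services"),
    ("Duke Energy", "Utilities"),
    ("American Electric Power", "Utilities"),
]


def group_by_sector(pairs):
    """Collect stock names under their sector, with sectors sorted by name."""
    groups = defaultdict(list)
    for stock, sector in pairs:
        groups[sector].append(stock)
    return dict(sorted(groups.items()))


grouped = group_by_sector(holdings)
for sector, stocks in grouped.items():
    print(f"{sector}: {', '.join(stocks)}")
```

Sorting the sectors alphabetically matches the row order used in the grouped table; within each sector the stocks keep the order in which they appear in the input.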

Frequency Table: Counting Observations per Category

| GICS_Sector | Frequency |
| --- | --- |
| Communication Services | 2 |
| Consumer Discretionary | 3 |
| Consumer Staples | 3 |
| Energy | 2 |
| Financials | 4 |
| Healthcare | 2 |
| Information Technology | 2 |
| Utilities | 2 |
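Counting observations per category is a one-liner with `collections.Counter` from the Python standard library. The sketch below assumes the sector labels have already been extracted into a flat list named `sectors` (a hypothetical name), with one entry per stock in the example.

```python
from collections import Counter

# One sector label per stock, in the order of the example table.
sectors = (
    ["Information Technology"] * 2
    + ["Healthcare"] * 2
    + ["Energy"] * 2
    + ["Consumer Staples"] * 3
    + ["Consumer Discretionary"] * 3
    + ["Financials"] * 4
    + ["Communication Services"] * 2
    + ["Utilities"] * 2
)

# Counter maps each distinct label to the number of times it occurs.
frequency = Counter(sectors)

for sector in sorted(frequency):
    print(f"{sector}: {frequency[sector]}")
```

The counts should sum to the number of observations (20 here), which is a quick check that no stock was dropped or double counted.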

Relative Frequency Table

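Relative frequency expresses the proportion of observations in each category, calculated as:

$$\text{Relative frequency} = \frac{\text{frequency of the category}}{\text{total number of observations}}$$

For example, Financials accounts for 4 of the 20 stocks, so its relative frequency is 4/20 = 0.20.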

| GICS_Sector | Relative_Frequency |
| --- | --- |
| Communication Services | 0.10 |
| Consumer Discretionary | 0.15 |
| Consumer Staples | 0.15 |
| Energy | 0.10 |
| Financials | 0.20 |
| Healthcare | 0.10 |
| Information Technology | 0.10 |
| Utilities | 0.10 |
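The relative frequencies can be reproduced by dividing each category's count by the total number of observations. This is a minimal sketch that starts from the frequency counts of the example; the `relative_frequencies` function name is an assumption made for illustration.

```python
# Frequency counts from the example (20 stocks in total).
frequency = {
    "Communication Services": 2,
    "Consumer Discretionary": 3,
    "Consumer Staples": 3,
    "Energy": 2,
    "Financials": 4,
    "Healthcare": 2,
    "Information Technology": 2,
    "Utilities": 2,
}


def relative_frequencies(counts):
    """Divide each category count by the total number of observations."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot compute relative frequencies of an empty dataset")
    return {category: n / total for category, n in counts.items()}


relative = relative_frequencies(frequency)
for sector, share in relative.items():
    print(f"{sector}: {share:.2f}")
```

Relative frequencies always sum to 1 (up to floating-point rounding), so summing `relative.values()` is a useful sanity check.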

Pie Chart Representation

  • A pie chart visually represents categorical data, dividing a circle into wedge-shaped pieces proportional to the relative frequencies of each category.

  • Each sector's wedge size reflects its proportion in the dataset.
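Because a full circle spans 360 degrees, each wedge's central angle is 360 times the category's relative frequency. The sketch below computes those angles from the relative frequencies in the example; it does not draw the chart, and the `wedge_angles` name is hypothetical.

```python
# Relative frequencies from the example.
relative_frequency = {
    "Communication Services": 0.10,
    "Consumer Discretionary": 0.15,
    "Consumer Staples": 0.15,
    "Energy": 0.10,
    "Financials": 0.20,
    "Healthcare": 0.10,
    "Information Technology": 0.10,
    "Utilities": 0.10,
}


def wedge_angles(proportions):
    """Convert proportions into central angles, in degrees, of pie wedges."""
    return {category: 360.0 * p for category, p in proportions.items()}


angles = wedge_angles(relative_frequency)
for sector, angle in angles.items():
    print(f"{sector}: {angle:.0f} degrees")
```

Financials, with a relative frequency of 0.20, gets a 72-degree wedge, the largest in the chart.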

Summary Table: Types of Variables

| Type | Description | Examples |
| --- | --- | --- |
| Qualitative | Non-numerical, describes categories or qualities | Sex, Marital Status, Zipcode, GICS Sector |
| Quantitative - Discrete | Numerical, countable values | Number of siblings, Number of people in household |
| Quantitative - Continuous | Numerical, measurable values within a range | Height, Weight, Marathon finishing time |

Key Points

  • Variables are classified as qualitative or quantitative, with quantitative variables further divided into discrete and continuous types.

  • Frequency and relative frequency tables are essential tools for summarizing categorical data.

  • Pie charts provide a visual summary of the distribution of categorical data.

Example Application

In portfolio analysis, stocks can be grouped by sector, and the frequency and relative frequency of each sector can be calculated to understand the composition of the portfolio.

Additional info: Relative frequency values in the table are inferred based on a total of 20 stocks. The summary table of variable types is added for academic completeness.
