Chapter 2: Organizing Data – Study Notes for Statistics

Organizing Data

Introduction

Organizing data is a foundational step in statistical analysis. This chapter covers the classification of variables and data, the distinction between parameters and statistics, and various methods for organizing and displaying both qualitative and quantitative data.

Variables and Data

Definitions

  • Individuals: The people or objects included in a study.

  • Variable: A characteristic of the individual to be measured or observed.

  • Data: The values of the variable collected from each individual.

Types of Data

  • Qualitative Data (Categorical): Consists of names or labels representing categories. Any numbers that appear act only as labels, so arithmetic on them is not meaningful. Example: Gender, survey responses (yes, no, undecided)

  • Quantitative Data: Consists of numbers for which operations such as addition or averaging make sense. Example: Heights, weights of individuals

Types of Quantitative Data

  • Discrete Data: Consists of numbers representing counts. Possible values can be listed or counted, and each value is distinct. Example: Number of TV sets in a household

  • Continuous Data: Consists of measurements that can take any value within an interval, so the possible values form a continuous scale with no gaps. Example: Heights, weights, time

Types of Qualitative Data

  • Nominal Data: Names, labels, or categories with no implied order. Example: Blood group types, student majors

  • Ordinal Data: Data can be arranged in order, but differences between values are not meaningful. Example: Letter grades (A, B+, B), T-shirt sizes (small, medium, large)

Parameter vs. Statistic

Definitions

  • Parameter: A numerical measure that describes an aspect of a population.

  • Statistic: A numerical measure that describes an aspect of a sample.

Example

  • If 84.9% of all students on a campus have a job, this value is a parameter (population).

  • If a sample of 250 students shows 86.4% have a job, this value is a statistic (sample).
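
As a quick check of the arithmetic behind the sample figure, the sketch below computes a sample proportion. The count of 216 employed students is not stated in the notes; it is inferred here from 86.4% of 250, and the variable names are illustrative only.

```python
# Sample statistic: proportion of sampled students who have a job.
sample_size = 250          # from the example above
employed_in_sample = 216   # assumption: inferred from 86.4% of 250

sample_proportion = employed_in_sample / sample_size
print(f"Sample proportion (statistic): {sample_proportion:.1%}")

# The parameter describes the whole population; the statistic describes
# only the sample, so the two values generally differ a little.
population_proportion = 0.849  # from the example above
difference = sample_proportion - population_proportion
print(f"Statistic minus parameter: {difference:+.3f}")
```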

Frequency and Relative Frequency Distributions

Qualitative Data

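A frequency distribution is a table that displays the values of a variable and how often each occurs. A relative frequency distribution lists, for each value, its frequency divided by the total number of observations.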

Party        Frequency
Democratic   13
Republican   9
Other        18

Party        Relative Frequency
Democratic   0.325
Republican   0.225
Other        0.450
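
The relative frequencies can be reproduced from the frequencies with a few lines of Python. This is a minimal sketch; the dictionary and variable names are illustrative, and the counts are the ones listed for the three parties.

```python
# Frequencies taken from the party data above.
frequencies = {"Democratic": 13, "Republican": 9, "Other": 18}

total = sum(frequencies.values())  # total number of observations

# Relative frequency = class frequency / total number of observations.
relative_frequencies = {
    party: count / total for party, count in frequencies.items()
}

for party, rel in relative_frequencies.items():
    print(f"{party:<12}{frequencies[party]:>4}{rel:>8.3f}")
```

The relative frequencies of all categories should add up to 1, which is a useful check on any such table.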

Graphical Representations

Pie Chart

A pie chart is a circle divided into sectors, each representing a category proportional to the total data. Useful for comparing a part to the whole.
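
To draw the sectors by hand, each category's share is converted into a central angle out of 360 degrees. The sketch below does this for the party frequencies; the names used are illustrative.

```python
# Frequencies for the party data from the frequency distribution section.
frequencies = {"Democratic": 13, "Republican": 9, "Other": 18}
total = sum(frequencies.values())

# Each sector's central angle is the category's share of 360 degrees.
sector_angles = {
    party: count * 360 / total for party, count in frequencies.items()
}

for party, angle in sector_angles.items():
    print(f"{party}: {angle:.0f} degrees")
```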

Bar Graph

A bar graph displays categories on one axis and frequency or relative frequency on the other. Bars are of equal width and do not touch each other. Used to compare values of a variable.
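
A rough text version of a bar graph can make the idea concrete without any plotting library. The helper `text_bar_graph` below is a hypothetical name introduced for this sketch; it draws one horizontal bar per category, with bar length equal to the frequency.

```python
frequencies = {"Democratic": 13, "Republican": 9, "Other": 18}

def text_bar_graph(freqs, symbol="#"):
    """Return one line per category: label, bar of symbols, frequency."""
    width = max(len(label) for label in freqs)
    return [
        f"{label:<{width}} | {symbol * count} {count}"
        for label, count in freqs.items()
    ]

bar_lines = text_bar_graph(frequencies)
print("\n".join(bar_lines))
```

Each bar sits on its own line, mirroring the rule that bars in a bar graph are separate and do not touch.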

Organizing Quantitative Data

Single Value Grouping

  • Each class represents a single possible value.

  • Suitable for discrete data with a small number of distinct values.

Number of TVs   Frequency   Relative Frequency
0               1           0.02
1               16          0.32
2               20          0.40
3               8           0.16
4               5           0.10
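
Single-value grouping amounts to counting how often each distinct value occurs. The sketch below uses `collections.Counter` for this. The raw observations are not given in these notes, so the list is rebuilt from the frequencies shown for the number of TVs and should be read as a stand-in for real data.

```python
from collections import Counter

# Assumption: raw data rebuilt from the listed frequencies, since the
# original 50 observations are not part of the notes.
table = {0: 1, 1: 16, 2: 20, 3: 8, 4: 5}
observations = [value for value, count in table.items() for _ in range(count)]

# Single-value grouping: every distinct value is its own class.
frequency = Counter(observations)
n = len(observations)

print("TVs  Freq  Rel. freq")
for value in sorted(frequency):
    print(f"{value:>3}  {frequency[value]:>4}  {frequency[value] / n:>9.2f}")
```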

Limit Grouping

  • Used when data are whole numbers with too many distinct values for single value grouping.

  • Each class is a range of values, defined by lower and upper limits.

  • Class mark (midpoint): The average of the two class limits.

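Formula for midpoint:

$$\text{Class mark} = \frac{\text{lower class limit} + \text{upper class limit}}{2}$$

For example, the class 30-39 has class mark (30 + 39) / 2 = 34.5.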

Days to Maturity   Frequency   Relative Frequency
30-39              3           0.075
40-49              7           0.175
50-59              8           0.200
60-69              10          0.250
70-79              6           0.150
80-89              4           0.100
90-99              2           0.050
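
Limit grouping can be sketched as a small function that counts whole-number values falling between each pair of class limits and records the class mark. The function name `limit_grouping`, its parameters, and the sample list `days` are all hypothetical; the sample is not the data behind the days-to-maturity figures above.

```python
def limit_grouping(values, first_lower, width, n_classes):
    """Count whole-number values into classes such as 30-39, 40-49, ..."""
    classes = []
    for i in range(n_classes):
        lower = first_lower + i * width    # lower class limit
        upper = lower + width - 1          # upper class limit
        midpoint = (lower + upper) / 2     # class mark
        count = sum(lower <= v <= upper for v in values)
        classes.append((lower, upper, midpoint, count))
    return classes

# Hypothetical sample, chosen only for illustration.
days = [36, 38, 41, 47, 51, 55, 55, 62, 64, 70, 78, 89]

grouped = limit_grouping(days, first_lower=30, width=10, n_classes=7)
for lower, upper, midpoint, count in grouped:
    rel = count / len(days)
    print(f"{lower}-{upper}  midpoint={midpoint}  freq={count}  rel={rel:.3f}")
```

Because the classes do not overlap and together cover every value in the sample, each observation is counted exactly once.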

Graphical Displays for Quantitative Data

Histogram

  • Displays classes of quantitative data on the horizontal axis and frequencies (or relative frequencies) on the vertical axis.

  • Bars touch each other because adjacent classes cover the number line without gaps; this distinguishes a histogram from a bar graph.

  • For single-value grouping, use distinct values as labels; for limit grouping, use lower class limits or midpoints.

Dotplot

  • Shows each data value as a dot above its value on a horizontal axis.

  • Useful for visualizing the distribution and comparing data sets.

Steps to construct a dotplot:

  1. Draw a horizontal axis for possible values.

  2. Place a dot for each observation above the appropriate value.

  3. Label the axis with the variable name.
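
The three steps can be sketched as a text-based dotplot for whole-number data. The function `dotplot` and the sample `tv_counts` are hypothetical, introduced only for this illustration; an asterisk stands in for each dot.

```python
from collections import Counter

def dotplot(values, label):
    """Return the lines of a text dotplot for whole-number data."""
    counts = Counter(values)
    # Step 1: horizontal axis covering every possible value in range.
    axis = list(range(min(values), max(values) + 1))
    width = max(len(str(v)) for v in axis)
    lines = []
    # Step 2: one mark per observation, stacked above its value.
    # Rows are built from the top of the tallest stack downward.
    for level in range(max(counts.values()), 0, -1):
        cells = ["*" if counts[v] >= level else " " for v in axis]
        lines.append(" ".join(f"{c:>{width}}" for c in cells).rstrip())
    lines.append(" ".join(f"{v:>{width}}" for v in axis))
    # Step 3: label the axis with the variable name.
    lines.append(label)
    return lines

# Hypothetical observations, chosen only for illustration.
tv_counts = [1, 2, 2, 3, 1, 2, 0, 4, 2, 1]
plot_lines = dotplot(tv_counts, "Number of TVs")
print("\n".join(plot_lines))
```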

Stem-and-Leaf Diagram

  • Each observation is split into a stem (all but the rightmost digit) and a leaf (the rightmost digit).

  • Stems are listed in a column; leaves are listed in rows next to their stems.

  • Leaves are arranged in ascending order.
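
The construction can be sketched in Python by splitting each value with `divmod(value, 10)`, which gives the stem (all but the rightmost digit) and the leaf (the rightmost digit). The function name and the sample data are hypothetical, and the sketch assumes non-negative whole numbers.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem to its leaves in ascending order.

    Assumes non-negative whole numbers.
    """
    stems = defaultdict(list)
    for v in sorted(values):          # sorting keeps leaves in ascending order
        stem, leaf = divmod(v, 10)    # stem: leading digits, leaf: last digit
        stems[stem].append(leaf)
    return dict(stems)

# Hypothetical sample, chosen only for illustration.
days = [36, 38, 41, 47, 51, 55, 55, 62, 64, 70, 78, 89]
diagram = stem_and_leaf(days)

# List every stem from smallest to largest, including empty ones.
for stem in range(min(diagram), max(diagram) + 1):
    leaves = "".join(str(leaf) for leaf in diagram.get(stem, []))
    print(f"{stem} | {leaves}")
```

Repeated values, such as the two 55s here, appear as repeated leaves, so the diagram preserves every observation.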

Shapes of Distributions

Distribution of a Data Set

The distribution describes the values of observations and their frequencies. The shape of the distribution is crucial for selecting appropriate statistical methods.

  • Often visualized with a histogram and a smooth curve.

Common Distribution Shapes

  • Bell-shaped (Normal)

  • Triangular

  • Uniform (Rectangular)

  • J-shaped

  • Right-skewed

  • Left-skewed

  • Bimodal

  • Multimodal

What to Look for in Shapes?

  • Modality:

    • Unimodal (one peak)

    • Bimodal (two peaks)

    • Multimodal (three or more peaks)

  • Symmetry: Graph can be divided into two mirror-image parts.

  • Skewness:

    • Right-skewed: Tail is on the right side

    • Left-skewed: Tail is on the left side

Note: Exact symmetry is not required; focus on the overall pattern.
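
Modality can be estimated roughly from a list of class frequencies by counting classes that are higher than both neighbours. This is a crude heuristic sketched under simple assumptions, not a standard procedure: it treats every small bump as a peak and misses flat-topped peaks, so in practice the overall pattern should still be judged by eye. The function names are hypothetical.

```python
def count_peaks(frequencies):
    """Count classes whose frequency exceeds both neighbours."""
    padded = [0] + list(frequencies) + [0]   # treat the ends as bordered by 0
    return sum(
        padded[i - 1] < padded[i] > padded[i + 1]
        for i in range(1, len(padded) - 1)
    )

def modality(frequencies):
    """Translate a peak count into the terms used in these notes."""
    peaks = count_peaks(frequencies)
    names = {0: "no clear peak", 1: "unimodal", 2: "bimodal"}
    return names.get(peaks, "multimodal")

# Frequencies from the days-to-maturity grouping earlier in these notes.
days_to_maturity = [3, 7, 8, 10, 6, 4, 2]
print(modality(days_to_maturity))

# Hypothetical frequencies with two separate high points.
print(modality([2, 9, 4, 3, 8, 1]))
```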
