Chapter 3: Constructing Graphical and Tabular Displays of Data
Study Guide - Smart Notes
Constructing Graphical and Tabular Displays of Data (3.1)
Types of Variables
In statistics, understanding the types of variables is fundamental for selecting appropriate methods of data display and analysis.
Categorical variable: Consists of names or labels that place individuals into groups or categories. Examples include academic major, ZIP code, and gender (even when coded as 0 and 1). Values written as digits are still categorical when they act only as labels.
Numerical variable: Consists of numbers that measure or count a quantity for each individual, such as age or the maximum speed of a car.
Example: Identifying Variable Types
Age (in years) of a hip-hop artist: Numerical
Academic major of a student: Categorical
Maximum speed of a car: Numerical
ZIP code of a home: Categorical
Gender coded as 0 (man) and 1 (woman): Categorical
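The ZIP code and coded-gender cases show that the test is what the values mean, not whether they are written as digits. A minimal sketch of the distinction, assuming small made-up lists (`ages`, `zip_codes`, and `gender_codes` are hypothetical illustration data, not from the notes):

```python
from collections import Counter
from statistics import mean

# Hypothetical records for illustration only; not data from the notes.
ages = [24, 31, 27, 31]                    # numerical: arithmetic is meaningful
zip_codes = ["90210", "10001", "90210"]    # categorical: digits used as labels
gender_codes = [0, 1, 1, 0, 1]             # categorical: 0 = man, 1 = woman

# Averaging makes sense for a numerical variable.
average_age = mean(ages)

# For categorical variables, count how often each label occurs instead.
zip_counts = Counter(zip_codes)
gender_counts = Counter(gender_codes)

print(average_age)
print(zip_counts)
print(gender_counts)
```

Averaging the ZIP codes or the 0/1 gender codes would run without error, but the result would not describe anything meaningful, which is why such variables are summarized by counts.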
Frequency and Frequency Tables
Frequency tables are essential tools for summarizing categorical data.
Frequency of a category: The number of observations in that category.
Frequency table: A table listing all categories and their frequencies.
Definitions
Frequency distribution of a categorical variable: The categories of the variable together with their frequencies.
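Relative frequency of a category: The proportion of all observations that fall in that category, calculated as:
$$\text{Relative frequency} = \frac{\text{Frequency of the category}}{\text{Total number of observations}}$$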
Example: Frequency and Relative Frequency Table
Consider a class survey on favorite movie genres:
| Category | Frequency | Relative Frequency |
|---|---|---|
| Drama | 5 | 0.119 |
| Action | 6 | 0.143 |
| Thriller | 4 | 0.095 |
| Comedy | 16 | 0.381 |
| Horror | 5 | 0.119 |
| Other | 6 | 0.143 |
| Total | 42 | 1.000 |
Largest relative frequency: Comedy (0.381 or 38.1%)
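The relative frequency column can be reproduced from the frequency column alone. A minimal sketch, using the frequencies from the table above (the names `frequencies`, `relative`, and `largest` are chosen here for illustration):

```python
# Frequencies copied from the movie-genre table in these notes.
frequencies = {
    "Drama": 5,
    "Action": 6,
    "Thriller": 4,
    "Comedy": 16,
    "Horror": 5,
    "Other": 6,
}

# Total number of observations (the "Total" row of the table).
total = sum(frequencies.values())

# Relative frequency = frequency of the category / total number of observations.
relative = {genre: count / total for genre, count in frequencies.items()}

# Print one row per category, rounding to three decimals as in the table.
for genre, count in frequencies.items():
    print(f"{genre:<9}{count:>3}  {relative[genre]:.3f}")

# The category with the largest relative frequency.
largest = max(relative, key=relative.get)
print("Largest relative frequency:", largest)
```

Rounding happens only when printing, so the stored values keep full precision for any later calculation.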
Relative Frequency Distribution
The relative frequency distribution of a categorical variable lists the categories together with their relative frequencies. The relative frequencies of all categories always sum to 1, although rounded values in a table may add to slightly more or less than 1.
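A one-line derivation shows why the sum is 1. The symbols are notation introduced here, not taken from the course materials: $k$ is the number of categories, $f_i$ is the frequency of category $i$, and $n$ is the total number of observations.

```latex
\sum_{i=1}^{k} \frac{f_i}{n}
  = \frac{1}{n} \sum_{i=1}^{k} f_i
  = \frac{n}{n}
  = 1,
\qquad \text{where } n = \sum_{i=1}^{k} f_i .
```

For the movie-genre survey, $k = 6$ and $n = 42$.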
Bar Graphs
Bar graphs are graphical representations of frequency or relative frequency distributions for categorical data.
Frequency bar graph: Categories are displayed on the horizontal axis, frequencies on the vertical axis. Each bar's height corresponds to the frequency of the category.
Example: Constructing a Frequency Bar Graph
List categories (e.g., movie genres) on the horizontal axis.
Mark frequency values on the vertical axis.
Draw bars above each category to represent its frequency.
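The three steps above can be sketched as a text-only bar graph. This is an illustration under stated assumptions, not a prescribed method: the function name `render_bar_graph` is hypothetical, the frequencies are the movie-genre counts from these notes, and plain characters stand in for a real plotting library.

```python
def render_bar_graph(frequencies):
    """Return the lines of a text bar graph with one column per category."""
    width = max(len(name) for name in frequencies) + 2  # column width
    tallest = max(frequencies.values())
    lines = []
    # Vertical axis: one row per frequency value, from the largest down to 1.
    for level in range(tallest, 0, -1):
        # A category's bar reaches this row if its frequency is at least `level`.
        row = "".join(
            ("#" if count >= level else " ").center(width)
            for count in frequencies.values()
        )
        lines.append(f"{level:>2} |{row}")
    # Horizontal axis: a rule, then the category names under their bars.
    lines.append("   +" + "-" * (width * len(frequencies)))
    lines.append("    " + "".join(name.center(width) for name in frequencies))
    return lines


# Frequencies copied from the movie-genre table in these notes.
genre_frequencies = {
    "Drama": 5,
    "Action": 6,
    "Thriller": 4,
    "Comedy": 16,
    "Horror": 5,
    "Other": 6,
}

graph = render_bar_graph(genre_frequencies)
print("\n".join(graph))
```

Each column of `#` characters has as many rows as the category's frequency, so the tallest column (Comedy) matches the largest frequency in the table.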
Logical Operations with Categories: AND/OR
Logical operations such as AND and OR are used to identify subsets of data that meet specific criteria.
AND: Selects data points that satisfy both conditions.
OR: Selects data points that satisfy at least one condition.
Example: Calendar Dates
Thursdays in November 2020: 5, 12, 19, 26
Dates in the third week of November 2020 (Sunday through Saturday): 15, 16, 17, 18, 19, 20, 21
Dates that are in the third week AND are Thursdays: 19
Dates that are in the third week OR are Thursdays: 5, 12, 15, 16, 17, 18, 19, 20, 21, 26
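AND and OR correspond to set intersection and set union, so the calendar example can be checked directly. A minimal sketch, assuming the third week is the Sunday-to-Saturday span of dates 15 through 21 listed in the example (the variable names are chosen here for illustration):

```python
from datetime import date

# All 30 dates in November 2020.
november = [date(2020, 11, day) for day in range(1, 31)]

# date.weekday() returns 0 for Monday, so Thursday is 3.
thursdays = {d.day for d in november if d.weekday() == 3}

# Third week as listed in the notes: dates 15 through 21.
third_week = set(range(15, 22))

# AND keeps dates that satisfy both conditions (set intersection).
both = thursdays & third_week

# OR keeps dates that satisfy at least one condition (set union).
either = thursdays | third_week

print(sorted(thursdays))
print(sorted(both))
print(sorted(either))
```

Note that 19 appears only once in the OR result even though it satisfies both conditions, because a set holds each date at most once.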
Interpreting Multiple Bar Graphs
Multiple bar graphs allow comparison of proportions across different groups for each category.
Proportion: The height of each bar represents the proportion of individuals in a group (e.g., men or women) who fall into a category (e.g., political party).
Comparison: Compare proportions between groups for each category.
Example: Survey of Political Party Affiliation
Proportion of women who are Democrats: 0.37
Proportion of men who are Independents: 0.43
Comparison: A smaller proportion of women (0.38) than men (0.43) identified as Independents.
Absolute numbers: More women (411) than men (378) identified as Independents, even though the proportion of women was smaller, because the sample included more women than men.
Sampling error: Survey results may not exactly represent the entire population due to sampling variability.
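The reason a smaller proportion can still correspond to a larger count is the difference in group sizes. A rough sketch that backs out the implied group sizes from the figures quoted above; the results are approximate because the proportions are rounded to two decimals, and the names `independents` and `implied_sizes` are chosen here for illustration:

```python
# Counts and proportions of Independents quoted in the example above.
independents = {
    "women": {"count": 411, "proportion": 0.38},
    "men": {"count": 378, "proportion": 0.43},
}

implied_sizes = {}
for group, figures in independents.items():
    # proportion = count / group size, so group size is roughly count / proportion.
    # The proportions are rounded, so this is an estimate, not the exact sample size.
    implied_sizes[group] = figures["count"] / figures["proportion"]

for group, size in implied_sizes.items():
    print(f"{group}: roughly {size:.0f} respondents")
```

The estimate for women comes out noticeably larger than the one for men, which is consistent with the larger count of women Independents despite their smaller proportion.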
Important Note on Survey Interpretation
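Survey proportions describe the surveyed sample. They estimate the corresponding population proportions but need not equal them.
Sampling error is the difference between a sample proportion and the true population proportion. It arises because a different sample would give a somewhat different result.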
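Sampling variability can be seen by simulating repeated surveys of the same population. Everything numeric in this sketch is an assumption made for illustration and does not come from the survey in these notes: a true proportion of 0.40, 1000 respondents per survey, and 5 surveys.

```python
import random

# Hypothetical values chosen only for illustration; not from the survey.
TRUE_PROPORTION = 0.40   # assumed population proportion
SAMPLE_SIZE = 1000       # assumed number of respondents per survey
NUM_SURVEYS = 5          # assumed number of repeated surveys

rng = random.Random(0)   # fixed seed so the run is repeatable

sample_proportions = []
for _ in range(NUM_SURVEYS):
    # Each respondent falls in the category with probability TRUE_PROPORTION.
    hits = sum(rng.random() < TRUE_PROPORTION for _ in range(SAMPLE_SIZE))
    sample_proportions.append(hits / SAMPLE_SIZE)

print(sample_proportions)
```

Each simulated survey draws from the same population, yet the sample proportions generally differ from one another and from 0.40. That gap is the sampling error.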