Describing Data with Tables and Graphs: Structured Study Notes
Describing Data with Tables and Graphs
Organizing Categorical Data
Categorical data records which group or category each observation belongs to, such as a symptom, a blood type, or a survey response. Organizing this data helps us summarize and compare groups efficiently.
Frequency Table: Lists each category and the number of observations in that category.
Relative Frequency Table: Adds a column showing the proportion of observations in each category, calculated as the category's count divided by the total count.
Two-way (Contingency) Table: Displays the frequency for each combination of two categorical variables.
Example: A clinic records the primary symptom reported by patients. The frequency table might look like:
| Symptom | Frequency | Relative Frequency |
|---|---|---|
| Back pain | 8 | 0.286 |
| Fatigue | 8 | 0.286 |
| Headache | 7 | 0.250 |
| Nausea | 5 | 0.179 |
| Total | 28 | 1.000 |
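The calculation behind a relative frequency table can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `frequency_table` is hypothetical, and the raw observations are rebuilt from the four counts listed in the table (28 observations in total).

```python
from collections import Counter

def frequency_table(observations):
    """Return rows of (category, count, relative frequency), largest count first."""
    counts = Counter(observations)
    total = sum(counts.values())
    # most_common() sorts by count; ties keep first-seen order.
    return [(cat, n, n / total) for cat, n in counts.most_common()]

# Raw observations rebuilt from the counts in the symptom table.
observations = (["Back pain"] * 8 + ["Fatigue"] * 8
                + ["Headache"] * 7 + ["Nausea"] * 5)

table = frequency_table(observations)
for category, count, rel in table:
    print(f"{category:<10} {count:>3} {rel:.3f}")
```

Each relative frequency is the category's count divided by the total count, so the relative frequencies always sum to 1 (apart from rounding).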
Two-way Table Example: Symptoms by age group:
| Age Group | Back pain | Fatigue | Headache | Nausea |
|---|---|---|---|---|
| 0-19 | 2 | 2 | 1 | 1 |
| 20-59 | 5 | 5 | 5 | 3 |
| 60+ | 1 | 1 | 1 | 1 |
Additional info: Two-way tables are essential for examining relationships between categorical variables, such as age and symptom type.
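As a rough sketch of how a two-way table is tallied from raw records, the Python below counts each (age group, symptom) combination and also keeps the row and column totals. The function name `two_way_table` and the six records are hypothetical, invented only to show the mechanics; they are not the clinic data above.

```python
from collections import Counter

def two_way_table(pairs):
    """Count each (row, column) combination and return cells plus margins."""
    cells = Counter(pairs)
    row_totals = Counter()
    col_totals = Counter()
    for (row, col), n in cells.items():
        row_totals[row] += n
        col_totals[col] += n
    return cells, row_totals, col_totals

# Hypothetical (age group, symptom) records, for illustration only.
records = [
    ("0-19", "Headache"), ("20-59", "Fatigue"), ("20-59", "Back pain"),
    ("60+", "Nausea"), ("20-59", "Fatigue"), ("0-19", "Back pain"),
]

cells, row_totals, col_totals = two_way_table(records)
print(cells[("20-59", "Fatigue")])  # count in one cell
print(row_totals["20-59"])          # total for one age group
print(col_totals["Back pain"])      # total for one symptom
```

The row and column totals are the one-variable frequency tables for age group and symptom, respectively.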
Bar Charts and Pie Charts
Bar charts and pie charts are graphical methods for displaying categorical data. Each has distinct uses and design considerations.
Bar Chart: Uses bars to represent the frequency or proportion of each category. Bars are separated to emphasize that categories are discrete.
Pie Chart: Divides a circle into slices, each representing a category's proportion of the total. Best for showing parts of a whole.
Example: Distribution of blood types among donors:
| Blood Type | Count | Proportion |
|---|---|---|
| O | 46 | 0.460 |
| A | 36 | 0.360 |
| B | 18 | 0.180 |
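To make the idea of a bar chart concrete without a plotting library, here is a small text-only sketch in Python using the blood type counts. The function name `text_bar_chart` and the `width` parameter are assumptions made for this example. Every bar starts at zero and is scaled against the largest count, so bar lengths can be compared directly.

```python
def text_bar_chart(counts, width=40):
    """Return one line per category; bar length is proportional to the count."""
    largest = max(counts.values())
    label_width = max(len(label) for label in counts)
    lines = []
    for label, n in counts.items():
        # Scale against the largest count; the longest bar gets `width` marks.
        bar = "#" * round(n / largest * width)
        lines.append(f"{label:<{label_width}} | {bar} {n}")
    return lines

blood_types = {"O": 46, "A": 36, "B": 18}
chart = text_bar_chart(blood_types)
print("\n".join(chart))
```

A real chart would normally be drawn with a plotting tool; the point here is only that bar length encodes the count.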
Design Tips for Bar Charts:
Start axis at zero.
Use equal bar widths and clear labels.
Avoid unnecessary decoration.
When to Use Bar vs. Pie Charts:
Bar charts: Compare categories, especially when there are many or when exact values matter.
Pie charts: Show proportions of a whole, best with few categories.
Clustered and Stacked Bar Charts: Useful for comparing multiple categorical variables. Clustered bars group categories side by side; stacked bars show composition within each group.
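A stacked bar chart scaled to 100% shows, for each group, the share of each category within that group. The sketch below computes those within-group proportions from the symptoms-by-age-group counts given earlier in these notes; the function name `within_group_proportions` is hypothetical.

```python
symptoms = ["Back pain", "Fatigue", "Headache", "Nausea"]
counts_by_age = {
    "0-19": [2, 2, 1, 1],
    "20-59": [5, 5, 5, 3],
    "60+": [1, 1, 1, 1],
}

def within_group_proportions(table):
    """Convert each row of counts to proportions that sum to 1."""
    result = {}
    for group, counts in table.items():
        total = sum(counts)
        result[group] = [n / total for n in counts]
    return result

composition = within_group_proportions(counts_by_age)
for group, shares in composition.items():
    cells = ", ".join(f"{s}: {p:.2f}" for s, p in zip(symptoms, shares))
    print(f"{group}: {cells}")
```

A clustered bar chart would instead plot the raw counts side by side, which keeps group sizes visible but makes composition harder to compare when groups differ in size.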
Organizing Quantitative Data
Quantitative data consists of numerical values that can be measured or counted. Organizing this data helps reveal patterns and distributions.
Frequency Distribution: Table that divides the range of data into intervals (classes) and shows the count for each interval.
Class Width: The difference between consecutive lower class limits.
Relative Frequency: Proportion of observations in a class.
Cumulative Frequency: Running total of frequencies up to a given class.
Example: Frequency distribution of cavities per patient:
| Class | Frequency | Relative Frequency | Cumulative Frequency | Cumulative Relative Frequency |
|---|---|---|---|---|
| 0-1 | 3 | 0.25 | 3 | 0.25 |
| 2-3 | 4 | 0.33 | 7 | 0.58 |
| 4-5 | 2 | 0.17 | 9 | 0.75 |
| 6-7 | 3 | 0.25 | 12 | 1.00 |
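The columns of a frequency distribution can be derived from the class frequencies alone. The following Python sketch uses the class labels and frequencies from the cavities example; the variable names are chosen for illustration.

```python
from itertools import accumulate

classes = ["0-1", "2-3", "4-5", "6-7"]
frequencies = [3, 4, 2, 3]

total = sum(frequencies)
relative = [f / total for f in frequencies]
cumulative = list(accumulate(frequencies))          # running total of counts
cumulative_relative = [c / total for c in cumulative]

# Class width: difference between consecutive lower class limits.
lower_limits = [int(label.split("-")[0]) for label in classes]
class_width = lower_limits[1] - lower_limits[0]

for row in zip(classes, frequencies, relative, cumulative, cumulative_relative):
    print("{:<4} {:>2} {:.3f} {:>2} {:.3f}".format(*row))
print("class width:", class_width)
```

The last cumulative frequency equals the total number of observations, and the last cumulative relative frequency equals 1.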
Ogive: A line graph of cumulative frequency or cumulative relative frequency plotted against the upper class boundaries.
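The points that an ogive connects can be sketched as follows, using the cavities classes. This example assumes integer-valued data, so each class boundary is placed halfway between adjacent class limits (for example, 1.5 between the classes 0-1 and 2-3); the function name `ogive_points` and the `gap` parameter are hypothetical.

```python
from itertools import accumulate

def ogive_points(upper_limits, frequencies, gap=1):
    """Return (boundary, cumulative relative frequency) pairs for an ogive."""
    total = sum(frequencies)
    half = gap / 2  # assumed half-distance between adjacent class limits
    width = upper_limits[1] - upper_limits[0]
    # Start at the lower boundary of the first class with cumulative value 0.
    points = [(upper_limits[0] + half - width, 0.0)]
    for limit, running in zip(upper_limits, accumulate(frequencies)):
        points.append((limit + half, running / total))
    return points

points = ogive_points([1, 3, 5, 7], [3, 4, 2, 3])
for boundary, cum_rel in points:
    print(f"{boundary:>4} {cum_rel:.3f}")
```

Plotting these pairs and joining them with straight lines gives a curve that rises from 0 to 1 and never decreases.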
Histograms, Stem-and-Leaf Plots, and Dotplots
These graphical methods are used to visualize the distribution of quantitative data.
Histogram: Uses adjacent rectangles to represent classes of a quantitative variable. The area of each bar is proportional to frequency.
Stem-and-Leaf Plot: Displays actual data values, split into "stems" (leading digits) and "leaves" (final digits).
Dotplot: Places a dot above a number line for each observation, useful for small datasets or showing clusters.
Example: Distribution of resting heart rates among individuals:
Histogram shows the shape, center, and spread of the data.
Stem-and-leaf plot allows you to see individual values and their distribution.
Dotplot highlights clusters and outliers.
Choosing the Right Display: Use histograms for large samples, stem-and-leaf plots for moderate samples, and dotplots for small samples or when you want to see individual values.
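For small datasets, stem-and-leaf plots and dotplots can be built by hand or with a few lines of code. The Python sketch below is one possible approach: the function names and the heart rate values are hypothetical, invented only for illustration, and the dotplot is printed sideways (one row per distinct value) rather than above a horizontal number line.

```python
from collections import Counter, defaultdict

def stem_and_leaf(values):
    """Return lines such as '6 | 2 4 8' for non-negative integers."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)  # stem = leading digits, leaf = last digit
    lines = []
    # Include empty stems so gaps in the data stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

def dotplot(values):
    """Return one line per distinct value, with one dot per observation."""
    counts = Counter(values)
    return [f"{v:>3} {'.' * counts[v]}" for v in sorted(counts)]

# Hypothetical resting heart rates (beats per minute), for illustration only.
heart_rates = [62, 64, 68, 70, 72, 72, 75, 78, 81, 84, 96]

print("\n".join(stem_and_leaf(heart_rates)))
print("\n".join(dotplot(heart_rates)))
```

Both displays keep every individual value visible, which is why they suit small and moderate samples better than large ones.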
Key Formulas
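Relative Frequency: $\text{relative frequency} = \dfrac{f}{n}$, where $f$ is the count in a category or class and $n$ is the total number of observations.
Cumulative Frequency: $F_k = f_1 + f_2 + \cdots + f_k = \sum_{i=1}^{k} f_i$, the sum of the frequencies of all classes up to and including class $k$.
Class Width: $\text{class width} = \text{lower limit of the next class} - \text{lower limit of the current class}$.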
Recap Table
| Keyword/Concept | Definition or Note |
|---|---|
| Frequency table | Lists each category and the number of observations in that category. |
| Relative frequency table | Shows the proportion of observations in each category. |
| Two-way table | Displays frequencies for combinations of two categorical variables. |
| Bar chart | Displays categories using bars; bar length represents frequency or proportion. |
| Pie chart | Shows proportions of a whole using slices. |
| Histogram | Graph of adjacent rectangles for classes of a quantitative variable. |
| Stem-and-leaf plot | Displays actual data values split into stems and leaves. |
| Dotplot | Places dots above a number line for each observation. |
| Ogive | Graph of cumulative frequency or cumulative relative frequency. |
Additional info:
These notes cover the essential methods for describing and visualizing both categorical and quantitative data, including tables and graphical displays. Understanding these concepts is foundational for further statistical analysis.