Statistics Unit I Test Study Guide: Data, Distributions, and Descriptive Statistics

Study Guide - Smart Notes

Tailored notes based on your materials, expanded with key definitions, examples, and context.

Q1. How do you identify the individuals (observational units) and variables in a set of data?

Background

Topic: Data Basics

This question tests your understanding of the fundamental components of a data set: the individuals (or observational units) and the variables measured on them.

Key Terms:

  • Individuals (Observational Units): The objects described by a set of data (people, animals, things, etc.).

  • Variable: Any characteristic of an individual. A variable can take different values for different individuals.

Step-by-Step Guidance

  1. Read the context or description of the data set carefully to determine what or who is being observed or measured. These are your individuals.

  2. Identify the characteristics or properties that are being recorded for each individual. These are your variables.

  3. Check if the variables are clearly labeled or described (e.g., height, gender, test score).

  4. Make sure you can distinguish between the individuals and the variables in any given example.
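
As a rough illustration of the steps above, the sketch below stores a small, entirely made-up data set so that each record is one individual and each field is a variable. The student names, field names, and values are hypothetical.

```python
# Hypothetical data set: each dict is one individual (a student),
# and each key other than "name" is a variable recorded for that student.
students = [
    {"name": "Ana", "height_in": 64.5, "grade_level": "junior", "siblings": 2},
    {"name": "Ben", "height_in": 70.0, "grade_level": "senior", "siblings": 0},
    {"name": "Chloe", "height_in": 61.2, "grade_level": "junior", "siblings": 1},
]

# The individuals are the students; "name" only labels them.
individuals = [row["name"] for row in students]

# The variables are the characteristics recorded for every individual.
variables = [key for key in students[0] if key != "name"]

print("Individuals:", individuals)
print("Variables:", variables)
```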

Try identifying individuals and variables in a sample data set before checking your answer!

Q2. How do you classify each variable as categorical (binary or not) or quantitative (discrete or continuous), and identify the units for quantitative variables?

Background

Topic: Types of Variables

This question is about distinguishing between different types of variables and recognizing the measurement units for quantitative variables.

Key Terms:

  • Categorical Variable: Places an individual into one of several groups or categories. Binary means only two categories.

  • Quantitative Variable: Takes numerical values for which arithmetic operations make sense. Can be discrete (countable values, such as a number of siblings) or continuous (any value in an interval, such as a height).

  • Units: The standard of measurement for quantitative variables (e.g., feet, inches, dollars).

Step-by-Step Guidance

  1. For each variable, ask: Does it describe a category or a number?

  2. If it describes a group or label (e.g., color, type), it's categorical. If it describes a number (e.g., height, age), it's quantitative.

  3. For categorical variables, check if there are only two categories (binary) or more.

  4. For quantitative variables, determine if the values are countable (discrete) or can take any value in an interval (continuous).

  5. Identify the units for each quantitative variable (e.g., inches, years, dollars).
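
The classification steps above can be sketched as a small helper. This is only a rough heuristic under stated assumptions: the function name `classify` is hypothetical, text values are treated as categories, whole numbers as discrete, and anything else as continuous. Context still matters, since a numeric code such as a zip code is categorical even though it is written with digits.

```python
def classify(values):
    """Rough heuristic for classifying a variable from its recorded values."""
    distinct = set(values)
    # Text labels or True/False values are treated as categories.
    if all(isinstance(v, (str, bool)) for v in values):
        kind = "binary" if len(distinct) == 2 else "not binary"
        return f"categorical ({kind})"
    # Whole-number counts are treated as discrete.
    if all(isinstance(v, int) for v in values):
        return "quantitative (discrete)"
    # Everything else is assumed to be a measurement on a continuous scale.
    return "quantitative (continuous)"


print(classify(["yes", "no", "yes"]))        # two categories
print(classify(["red", "blue", "green"]))    # more than two categories
print(classify([0, 2, 1]))                   # counts, e.g. siblings
print(classify([64.5, 70.0, 61.2]))          # measurements, e.g. inches
```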

Try classifying variables and identifying units in a sample data set!

Q3. How do you make and interpret a bar graph of the distribution of a categorical variable?

Background

Topic: Displaying Categorical Data

This question focuses on constructing and interpreting bar graphs to display the distribution of a categorical variable.

Key Terms and Concepts:

  • Bar Graph: A graph that represents the frequency or relative frequency of categories using bars.

  • Frequency: The count of individuals in each category.

  • Relative Frequency: The proportion or percentage of individuals in each category.

Step-by-Step Guidance

  1. List all categories of the variable along the horizontal axis.

  2. Determine the frequency or relative frequency for each category.

  3. Draw bars for each category, with heights corresponding to their frequencies or relative frequencies.

  4. Interpret the graph by comparing the heights of the bars to see which categories are most or least common.
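
A minimal sketch of these steps, using a text-only bar graph so that no plotting library is needed. The color data and the helper name `text_bar_graph` are made up for illustration.

```python
from collections import Counter

# Hypothetical categorical data: favorite color of six students.
colors = ["red", "blue", "red", "green", "blue", "red"]

# Step 2: frequency of each category.
counts = Counter(colors)


def text_bar_graph(counts):
    """Return one line per category, with a bar of '#' marks for its frequency."""
    width = max(len(category) for category in counts)
    lines = []
    for category, freq in sorted(counts.items()):
        bar = "#" * freq
        lines.append(f"{category:<{width}} {bar} ({freq})")
    return lines


for line in text_bar_graph(counts):
    print(line)

# Step 4: the longest bar marks the most common category.
print("Most common:", counts.most_common(1)[0])
```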

Try making and interpreting a bar graph for a sample categorical variable!

Q4. How do you represent categorical data using frequency or relative frequency tables?

Background

Topic: Summarizing Categorical Data

This question is about organizing categorical data into tables for easier analysis.

Key Terms:

  • Frequency Table: Lists each category and the number of individuals in each.

  • Relative Frequency Table: Lists each category and the proportion or percentage of individuals in each.

Step-by-Step Guidance

  1. List all possible categories in one column.

  2. Count the number of individuals in each category and record in the next column (frequency).

  3. Calculate the relative frequency for each category by dividing the frequency by the total number of observations.

  4. Express relative frequency as a decimal or percentage.
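
The table-building steps above might look like the following sketch. The response data and the function name `frequency_table` are hypothetical.

```python
from collections import Counter


def frequency_table(data):
    """Map each category to (frequency, relative frequency)."""
    counts = Counter(data)
    total = len(data)
    return {cat: (freq, freq / total) for cat, freq in counts.items()}


# Hypothetical survey responses from 10 individuals.
responses = ["A", "B", "A", "C", "A", "B", "A", "A", "B", "C"]
table = frequency_table(responses)

print(f"{'Category':<10}{'Freq':>6}{'Rel. freq':>12}")
for cat, (freq, rel) in sorted(table.items()):
    # Relative frequency shown as a percentage.
    print(f"{cat:<10}{freq:>6}{rel:>12.0%}")
```

The relative frequencies should sum to 1 (or 100%), which is a quick check that no observation was missed.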

Try creating a frequency and relative frequency table for a sample data set!

Q5. How do you make and interpret dotplots, stemplots, histograms, and boxplots for quantitative data?

Background

Topic: Displaying Quantitative Data

This question covers several graphical methods for displaying the distribution of quantitative variables.

Key Terms and Concepts:

  • Dotplot: Dots represent individual data points along a number line.

  • Stemplot: Data are split into stems (leading digits) and leaves (trailing digits).

  • Histogram: Bars represent the frequency of data within intervals (bins).

  • Boxplot: Visualizes the five-number summary (min, Q1, median, Q3, max).

Step-by-Step Guidance

  1. Choose the appropriate graph based on the size and range of your data set.

  2. For a dotplot, place one dot above the number line for each observation. For a stemplot, split each value into a stem and a leaf, then list the leaves in order next to their stems.

  3. For histograms, divide the data into intervals and count the number of data points in each interval.

  4. For boxplots, calculate the five-number summary and draw the box and whiskers accordingly.

  5. Interpret the graphs by looking for patterns, clusters, gaps, and outliers.
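
To make the stemplot and histogram steps concrete, here is a small sketch. It assumes two-digit whole-number data (tens digit as stem, ones digit as leaf) and equal-width bins that include their left endpoint. The scores and both function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical test scores.
scores = [62, 67, 71, 74, 74, 78, 80, 83, 85, 91]


def stemplot(values):
    """Group two-digit values by tens digit (stem) with ones digits as leaves."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(sorted(stems.items()))


def histogram_counts(values, start, width, n_bins):
    """Count values in bins [start, start + width), [start + width, ...), etc."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


for stem, leaves in stemplot(scores).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))

print(histogram_counts(scores, start=60, width=10, n_bins=4))
```

Note that this simple stemplot omits stems with no leaves. When drawing one by hand, keep empty stems so that gaps in the data stay visible.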

Try constructing and interpreting these graphs for a sample data set!

Q6. How do you describe the shape, outliers, center, and spread (SOCS) of a distribution?

Background

Topic: Describing Distributions

This question is about summarizing the main features of a distribution using SOCS: Shape, Outliers, Center, and Spread.

Key Terms:

  • Shape: Symmetric, skewed right, skewed left, unimodal, bimodal, uniform.

  • Center: Mean or median.

  • Spread: Range, interquartile range (IQR), standard deviation.

  • Outliers: Unusually high or low values, identified visually or by rules (e.g., the 1.5 × IQR rule).

Step-by-Step Guidance

  1. Examine the graph to determine the shape (symmetry, skewness, modality).

  2. Calculate or identify the center (mean or median) of the distribution.

  3. Calculate or identify the spread (range, IQR, or standard deviation).

  4. Look for outliers visually or use the 1.5 × IQR rule to check for outliers.

  5. Summarize your findings in context, using comparative language if comparing distributions.
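
A numerical summary can support what the graph shows. The sketch below uses Python's standard `statistics` module to report center and spread, and adds a rough shape hint based on the common rule of thumb that the mean is pulled toward the longer tail. The function name `describe` and the data are hypothetical, and the hint is not a substitute for looking at the graph.

```python
import statistics


def describe(values):
    """Summarize center and spread, with a rough hint about skewness."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    # Rule of thumb only: the mean tends to be pulled toward the long tail.
    if mean > median:
        hint = "possibly skewed right"
    elif mean < median:
        hint = "possibly skewed left"
    else:
        hint = "roughly symmetric"
    return {
        "mean": mean,
        "median": median,
        "range": max(values) - min(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "shape_hint": hint,
    }


# Hypothetical data with one unusually large value.
summary = describe([1, 2, 2, 3, 3, 3, 4, 4, 20])
for name, value in summary.items():
    print(name, "=", value)
```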

Try describing SOCS for a sample distribution!

Q7. How do you find the mean and median of a set of observations?

Background

Topic: Measures of Center

This question is about calculating the mean and median, and understanding their properties.

Key Formulas:

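  • Mean: $\bar{x} = \dfrac{x_1 + x_2 + \cdots + x_n}{n} = \dfrac{\sum x_i}{n}$, where $n$ is the number of observations.

  • Median: The middle value when the data are ordered; if there is an even number of observations, average the two middle values.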

Step-by-Step Guidance

  1. Order the data from smallest to largest.

  2. For the mean, add all the values and divide by the number of observations.

  3. For the median, find the middle value (or average the two middle values if the number of observations is even).

  4. Compare the mean and median, especially if the data are skewed or have outliers.
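
The steps above can be sketched directly in code. The function names and data are illustrative only. The last example shows how a single extreme value pulls the mean much more than the median.

```python
def mean(values):
    """Add all the values and divide by the number of observations."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(mean([3, 1, 4, 1, 5]), median([3, 1, 4, 1, 5]))      # odd n
print(median([3, 1, 4, 1]))                                # even n
print(mean([1, 2, 3, 4, 100]), median([1, 2, 3, 4, 100]))  # with an outlier
```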

Try calculating the mean and median for a sample data set!

Q8. How do you find the five-number summary and draw a boxplot?

Background

Topic: Measures of Spread and Graphical Representation

This question is about summarizing data using the five-number summary and visualizing it with a boxplot.

Key Terms and Formulas:

  • Five-number summary: Minimum, Q1 (first quartile), Median, Q3 (third quartile), Maximum.

  • Boxplot: A graphical display of the five-number summary.

Step-by-Step Guidance

  1. Order the data from smallest to largest.

  2. Identify the minimum and maximum values.

  3. Find the median (middle value).

  4. Find Q1 (median of the lower half) and Q3 (median of the upper half).

  5. Draw a box from Q1 to Q3, with a line at the median, and whiskers to the min and max.
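
A minimal sketch of the five-number summary, assuming the common textbook convention that the median itself is left out of both halves when the number of observations is odd. Calculators and software may use slightly different quartile rules, so results can differ a little. The function name and data are hypothetical.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n, skip the middle value so it belongs to neither half.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (
        ordered[0],
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
        ordered[-1],
    )


print(five_number_summary([13, 1, 7, 3, 11, 5, 9]))  # odd number of values
print(five_number_summary([2, 4, 6, 8]))             # even number of values
```

The box of the boxplot then runs from the second value (Q1) to the fourth (Q3), with a line at the third (the median).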

Try finding the five-number summary and drawing a boxplot for a sample data set!

Q9. How do you calculate the interquartile range (IQR) and use it to identify outliers?

Background

Topic: Spread and Outlier Detection

This question is about measuring spread with the IQR and using it to detect outliers.

Key Formulas:

  • IQR: $\text{IQR} = Q_3 - Q_1$

  • Outlier Rule: An observation is an outlier if it is below $Q_1 - 1.5 \times \text{IQR}$ or above $Q_3 + 1.5 \times \text{IQR}$.

Step-by-Step Guidance

  1. Find Q1 and Q3 (see previous question).

  2. Calculate the IQR by subtracting Q1 from Q3.

  3. Compute $Q_1 - 1.5 \times \text{IQR}$ and $Q_3 + 1.5 \times \text{IQR}$.

  4. Check which data points, if any, fall outside these bounds.
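
As a sketch of the 1.5 × IQR rule, the helpers below take Q1 and Q3 as inputs, compute the lower and upper bounds, and list any values outside them. The function names and the data set are made up, and the quartiles in the example are assumed to have been found already.

```python
def outlier_fences(q1, q3):
    """Return the lower and upper bounds from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Return the values that fall below the lower bound or above the upper bound."""
    low, high = outlier_fences(q1, q3)
    return [v for v in values if v < low or v > high]


# Hypothetical data with Q1 = 3 and Q3 = 11, so IQR = 8.
data = [1, 3, 5, 7, 9, 11, 40]
print(outlier_fences(3, 11))       # bounds: 3 - 12 and 11 + 12
print(find_outliers(data, 3, 11))  # only 40 lies outside the bounds
```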

Try calculating the IQR and identifying outliers for a sample data set!

Q10. How do you calculate and interpret a z-score?

Background

Topic: Standardization and Normal Distributions

This question is about finding the standardized score (z-score) for a value and interpreting its meaning.

Key Formula:

  • $z = \dfrac{x - \mu}{\sigma}$, where $x$ is the value, $\mu$ is the mean, and $\sigma$ is the standard deviation.

Step-by-Step Guidance

  1. Identify the value $x$, the mean $\mu$, and the standard deviation $\sigma$.

  2. Subtract the mean from the value: $x - \mu$.

  3. Divide the result by the standard deviation: $z = \dfrac{x - \mu}{\sigma}$.

  4. Interpret the z-score: positive means above the mean, negative means below the mean; the magnitude tells you how many standard deviations away from the mean the value is.
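
A short sketch of the calculation and its interpretation. The function names, the test score of 85, and the mean of 70 with standard deviation 10 are all made-up values for illustration.

```python
def z_score(x, mean, sd):
    """Standardize x: how many standard deviations it lies from the mean."""
    return (x - mean) / sd


def interpret(z):
    """Describe a z-score in words."""
    if z == 0:
        return "exactly at the mean"
    direction = "above" if z > 0 else "below"
    return f"{abs(z)} standard deviations {direction} the mean"


# Hypothetical example: scores with mean 70 and standard deviation 10.
z_high = z_score(85, 70, 10)
z_low = z_score(60, 70, 10)
print(z_high, "->", interpret(z_high))
print(z_low, "->", interpret(z_low))
```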

Try calculating and interpreting a z-score for a sample value!
