Statistics Midterm Study Guidance: Regression, Data Visualization, Probability, and More

Q1. Multiple Regression with Dummy Variables: Employee Salary Analysis

Background

Topic: Multiple Linear Regression with Dummy Variables

This question tests your understanding of how to build and interpret a multiple regression equation using both a quantitative variable (years of experience) and a categorical variable (gender, coded as a dummy variable).

Employee Salary Analysis Table

Key Terms and Formulas

  • Multiple Regression Equation: $\hat{y} = b_0 + b_1 x_1 + b_2 x_2$

  • Dummy Variable: A variable coded as 0 or 1 to represent categories (here, gender: 0 = female, 1 = male)

  • Response Variable: The variable being predicted (Salary, $y$; its predicted value is $\hat{y}$)

  • Predictor Variables: Experience ($x_1$), Gender ($x_2$)

Step-by-Step Guidance

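  1. Write the general form of the multiple regression equation using the variables provided: $\hat{y} = b_0 + b_1 x_1 + b_2 x_2$, where $x_1$ is years of experience and $x_2$ is the gender dummy.

  2. Identify the values for a female employee with 10 years of experience: $x_1 = 10$, $x_2 = 0$ (since female = 0).

  3. Identify the values for a male employee with 10 years of experience: $x_1 = 10$, $x_2 = 1$ (since male = 1).

  4. Substitute the values into the regression equation for each scenario, but do not compute the final salary yet.

  5. To determine if gender has a significant effect, examine the coefficient on the gender variable ($b_2$). Its sign and size give the predicted salary difference between male and female employees with the same experience, and its t-statistic or p-value (if provided) indicates whether that difference is statistically significant.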
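
The substitution in steps 2 to 4 can be sketched in a few lines of Python. The coefficient values below are hypothetical placeholders chosen only for illustration, not the values from the question's table, and the function name `predicted_salary` is likewise an assumption.

```python
def predicted_salary(b0, b1, b2, experience_years, is_male):
    """Evaluate y_hat = b0 + b1*x1 + b2*x2 for one employee."""
    x1 = experience_years
    x2 = 1 if is_male else 0  # dummy coding from the guide: 0 = female, 1 = male
    return b0 + b1 * x1 + b2 * x2


# Hypothetical coefficients (assumption); replace with the values from your table.
B0, B1, B2 = 30000.0, 2000.0, 5000.0

female_10 = predicted_salary(B0, B1, B2, experience_years=10, is_male=False)
male_10 = predicted_salary(B0, B1, B2, experience_years=10, is_male=True)

# With experience held equal, the predicted gap is exactly the gender coefficient b2.
gap = male_10 - female_10
print(female_10, male_10, gap)
```

Because both employees have the same $x_1$, the experience term cancels in the difference, which is why the gender coefficient alone measures the predicted gap.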

Q2. Census in Data Collection

Background

Topic: Types of Data Collection Methods

This question tests your understanding of what a census is in statistics and how it differs from sampling, which collects data from only part of the population.

Key Terms

  • Census: Collecting data from every individual in the population.

  • Sample: Collecting data from a subset of the population.

Step-by-Step Guidance

  1. Review the definition of a census and compare it to the definitions of sampling methods.

  2. Identify which option describes collecting data from the entire population.

Q3. Sampling Technique and Bias: Campus Dining Survey

Background

Topic: Sampling Methods and Bias

This question tests your ability to identify sampling techniques and recognize potential sources of bias in survey data collection.

Key Terms

  • Convenience Sampling: Selecting individuals who are easiest to reach.

  • Selection Bias: Systematic error that arises when the way individuals are chosen makes the sample unrepresentative of the population.

Step-by-Step Guidance

  1. Consider where and how the students are being selected (outside the cafeteria at lunchtime).

  2. Determine which sampling method this describes (random, stratified, cluster, or convenience).

  3. Think about whether this method could introduce bias and what kind.
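
To see why the selection method matters, here is a minimal simulation sketch. The population, the group sizes, and the ratings are all invented for illustration (assumptions, not data from the question); the point is only to compare a convenience sample with a simple random sample of the same size.

```python
import random

# Hypothetical population (assumption): 300 regular cafeteria users rate campus
# dining 8/10, and 700 students who rarely eat there rate it 4/10.
population = [("cafeteria", 8)] * 300 + [("elsewhere", 4)] * 700


def mean_rating(students):
    """Average rating over a list of (group, rating) pairs."""
    return sum(rating for _, rating in students) / len(students)


population_mean = mean_rating(population)

# Convenience sample: only students found outside the cafeteria at lunchtime.
convenience_sample = [s for s in population if s[0] == "cafeteria"][:100]
convenience_mean = mean_rating(convenience_sample)

# Simple random sample of the same size, drawn from the whole population.
rng = random.Random(0)  # fixed seed so the run is repeatable
random_sample = rng.sample(population, 100)
random_mean = mean_rating(random_sample)

print(population_mean, convenience_mean, random_mean)
```

In this made-up setup the convenience sample contains only cafeteria users, so its mean overstates satisfaction, while the random sample lands much closer to the population mean.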

Q4. Visualizing Annual Rainfall Data

Background

Topic: Choosing Appropriate Graphs for Quantitative Data

This question tests your ability to select the most suitable graph for visualizing time series data (annual rainfall over 100 years).

Key Terms

  • Line Graph: Best for showing trends over time for quantitative data.

  • Bar Graph, Pie Chart, Scatterplot: Bar graphs and pie charts suit categorical data; scatterplots show the relationship between two quantitative variables.

Step-by-Step Guidance

  1. Identify the type of data (quantitative, measured over time).

  2. Recall which graph type is best for visualizing trends in time series data.

Q5. Constructing a Frequency Distribution and Assessing Normality

Background

Topic: Frequency Distributions and Data Normality

This question asks you to organize data into a frequency distribution and consider whether the distribution appears normal.

Key Terms and Steps

  • Frequency Distribution: Table showing how data are distributed across intervals (classes).

  • Normal Distribution: Symmetrical, bell-shaped distribution.

  • Class Width: The difference between the lower limits of consecutive classes.

Step-by-Step Guidance

  1. Start with the lowest value (45 kg) and create intervals of width 10 kg (e.g., 45–54, 55–64, etc.).

  2. Count how many data points fall into each interval and fill out the frequency distribution.

  3. Consider the shape of the distribution (is it symmetric, skewed, or unclear without a histogram?).
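
Steps 1 and 2 can be sketched as a short Python routine. The weights listed here are hypothetical stand-ins for the question's data set, and the code assumes whole-number values so that classes such as 45-54 and 55-64 do not leave gaps.

```python
def frequency_distribution(data, start, width):
    """Count whole-number values in classes start..start+width-1, and so on."""
    n_classes = (max(data) - start) // width + 1
    table = []
    for i in range(n_classes):
        lower = start + i * width
        upper = lower + width - 1  # upper class limit for whole-number data
        count = sum(1 for value in data if lower <= value <= upper)
        table.append((f"{lower}-{upper}", count))
    return table


# Hypothetical weights in kg (assumption); substitute the data from the question.
weights_kg = [45, 52, 58, 61, 63, 66, 68, 70, 71, 74, 77, 83, 85, 94]

table = frequency_distribution(weights_kg, start=45, width=10)
for label, count in table:
    print(label, count)
```

For this invented data the counts rise to a single peak in the middle class and fall off on both sides, which is the kind of pattern to look for when judging whether a distribution is roughly bell-shaped.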

Q6. Interpreting Histograms: Delivery Times

Background

Topic: Histograms and Distribution Shapes

This question tests your ability to match a frequency distribution to its histogram and describe the distribution's shape (normal, uniform, skewed left, skewed right).

Delivery Time Frequency Table

Histogram: Normal

Histogram: Uniform

Histogram: Skewed Right

Histogram: Skewed Left

Key Terms

  • Normal Distribution: Symmetrical, bell-shaped.

  • Uniform Distribution: All bars are about the same height.

  • Skewed Right: Tail on the right side is longer.

  • Skewed Left: Tail on the left side is longer.

Step-by-Step Guidance

  1. Compare the frequencies in the table to the heights of the bars in each histogram.

  2. Identify which histogram matches the frequency distribution provided.

  3. Describe the shape of the distribution based on the histogram you selected.

Q18. Interpreting Percentiles from an Ogive

Background

Topic: Percentiles and Ogives

This question asks you to interpret an ogive (cumulative frequency graph) to determine the score corresponding to a given percentile and to explain what that percentile means in context.

Ogive: Goals Scored Percentile

Key Terms

  • Percentile: The value below which a given percentage of observations fall.

  • Ogive: A graph that shows cumulative frequencies or percentiles.

Step-by-Step Guidance

  1. Locate the percentile of interest on the y-axis of the ogive (e.g., 60th percentile).

  2. Draw a horizontal line from the percentile to the curve, then drop a vertical line to the x-axis to find the corresponding score (number of goals).

  3. Interpret what it means for a team to be at or below this percentile in terms of goals scored.
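
Reading a value off an ogive amounts to linear interpolation between the plotted points. The sketch below assumes the ogive is given as (goals, cumulative percent) pairs; the points themselves are hypothetical and do not come from the question's graph, and `value_at_percentile` is an illustrative helper name.

```python
def value_at_percentile(points, percentile):
    """Interpolate the x-value where the ogive reaches the given cumulative percent.

    points: list of (x, cumulative_percent) pairs sorted by x.
    """
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if p0 <= percentile <= p1:
            if p1 == p0:
                return x0  # flat segment: no interpolation needed
            return x0 + (percentile - p0) * (x1 - x0) / (p1 - p0)
    raise ValueError("percentile lies outside the plotted range")


# Hypothetical ogive points (assumption): goals scored vs. cumulative percent of teams.
ogive_points = [(0, 0), (10, 10), (20, 30), (30, 50), (40, 70), (50, 90), (60, 100)]

goals_at_60th = value_at_percentile(ogive_points, 60)
print(goals_at_60th)
```

With these made-up points the 60th percentile falls halfway between 30 and 40 goals, so about 60% of teams scored at or below that number.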

Q20. Interpreting Boxplots: Water Usage Fees

Background

Topic: Boxplots (Box-and-Whisker Plots) and Comparing Distributions

This question tests your ability to interpret boxplots to compare medians and variability across groups (here, neighborhoods).

Boxplot: Water Usage Fees in Four Neighborhoods

Key Terms

  • Median: The line inside the box, representing the middle value.

  • Interquartile Range (IQR): The length of the box (Q3 − Q1), showing the spread of the middle 50% of data.

Step-by-Step Guidance

  1. Identify the median for each neighborhood by looking at the line inside each box.

  2. Compare the medians to determine which neighborhood has the highest typical water usage fee.

  3. Consider the spread and position of the boxes to support your conclusion.
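
The comparison in these steps can also be checked numerically when the raw data are available. The fee values and neighborhood labels below are hypothetical (assumptions for illustration), and the quartiles come from Python's `statistics.quantiles`, whose default method may differ slightly from the convention your textbook uses.

```python
import statistics

# Hypothetical monthly water usage fees (assumption), seven households per neighborhood.
fees = {
    "A": [20, 22, 25, 27, 30, 33, 35],
    "B": [30, 34, 36, 40, 41, 45, 60],
    "C": [15, 18, 19, 21, 22, 24, 26],
    "D": [25, 26, 31, 35, 44, 52, 58],
}

summary = {}
for name, values in fees.items():
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    summary[name] = {"median": statistics.median(values), "iqr": q3 - q1}

# The highest median marks the highest typical fee; the widest IQR marks the most
# variable middle 50% of fees.
highest_median = max(summary, key=lambda n: summary[n]["median"])
widest_iqr = max(summary, key=lambda n: summary[n]["iqr"])
print(highest_median, widest_iqr)
```

On a boxplot the same comparison is done by eye: the median line that sits highest identifies the neighborhood with the highest typical fee, and the longest box identifies the one with the most spread.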
