MA 113 Alternate Midterm 1: Statistics Study Guidance
Study Guide - Smart Notes
Q1(a). What is the level of measurement of the variable of interest?
Background
Topic: Levels of Measurement
This question tests your understanding of the different levels of measurement (nominal, ordinal, interval, ratio) and how to classify a variable based on its properties.
Key Terms:
Nominal: Categories with no inherent order (e.g., blood type).
Ordinal: Categories with a meaningful order, but differences between categories are not meaningful (e.g., letter grades).
Interval: Ordered numeric values with meaningful differences, but no true zero (e.g., temperature in Celsius).
Ratio: Like interval, but with a true zero (e.g., height, weight).
Step-by-Step Guidance
Identify the variable of interest in the data (the letter grades: A, B, C, D).
Consider whether the grades have a natural order (A is better than B, etc.).
Think about whether the differences between grades are meaningful and if there is a true zero point.
Q1(b). Fill in the frequency and relative frequency table for the data above:
Background
Topic: Frequency and Relative Frequency Tables
This question tests your ability to summarize categorical data using frequency (counts) and relative frequency (proportions).
Key Terms and Formulas:
Frequency: The number of times a category appears in the data.
Relative Frequency: The proportion of the total that each category represents.
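Relative Frequency Formula: $\text{relative frequency} = \dfrac{\text{frequency of the category}}{\text{total number of observations}}$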
Step-by-Step Guidance
Count how many times each grade (A, B, C, D) appears in the list.
Write these counts in the 'Frequency' column of the table.
Calculate the total number of students (should match the sum of frequencies).
For each grade, divide its frequency by the total number of students to get the relative frequency.
Fill in the 'Relative Frequency' column with these values.
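The counting and dividing steps above can be sketched in Python. This is only an illustration: the `grades` list is made-up data (15 hypothetical grades, not the grades from the exam), and `frequency_table` is a helper name chosen here.

```python
from collections import Counter

# Hypothetical data: 15 made-up letter grades, not the grades from the exam.
grades = ["A"] * 3 + ["B"] * 5 + ["C"] * 4 + ["D"] * 3


def frequency_table(data, categories):
    """Return {category: (frequency, relative frequency)} for the given data."""
    counts = Counter(data)
    total = len(data)
    return {c: (counts[c], counts[c] / total) for c in categories}


table = frequency_table(grades, ["A", "B", "C", "D"])
for grade, (freq, rel_freq) in table.items():
    print(f"{grade}: frequency = {freq}, relative frequency = {rel_freq:.3f}")
```

A quick check on any such table: the frequencies should sum to the number of students, and the relative frequencies should sum to 1.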
Q1(c). What is the population of this study?
Background
Topic: Populations and Samples
This question tests your understanding of what constitutes the population in a statistical study.
Key Terms:
Population: The entire group of individuals or items that is the subject of the study.
Sample: A subset of the population.
Step-by-Step Guidance
Identify who or what is being studied (students in the course).
Determine whether the data includes all individuals of interest or just a subset.
Q1(d). Is the relative frequency of As in the teacher’s course a statistic or parameter? Why?
Background
Topic: Statistics vs. Parameters
This question tests your ability to distinguish between a statistic (sample measure) and a parameter (population measure).
Key Terms:
Statistic: A numerical summary of a sample.
Parameter: A numerical summary of a population.
Step-by-Step Guidance
Recall the definitions of statistic and parameter.
Decide whether the data represents the entire population or just a sample.
Based on your answer, determine if the relative frequency is a statistic or parameter, and explain why.
Q1(e). Is there any sampling error in this study?
Background
Topic: Sampling Error
This question tests your understanding of what sampling error is and when it occurs.
Key Terms:
Sampling Error: The difference between a sample statistic and the actual population parameter, due to the fact that the sample is not the entire population.
Step-by-Step Guidance
Consider whether the data includes the entire population or just a sample.
Recall that sampling error only occurs when a sample is used to estimate a population parameter.
Q1(f). Find the median of this dataset.
Background
Topic: Median of Qualitative Data
This question tests your ability to find the median in a dataset of ordered categorical (qualitative) data.
Key Terms and Steps:
Median: The middle value when the data is ordered.
For an odd number of data points $n$, the median is the value at position $(n+1)/2$.
Step-by-Step Guidance
List all the grades in order from lowest to highest (D, C, B, A) or highest to lowest, depending on convention.
Count the total number of grades (should be 15).
Find the position of the median using $(n+1)/2$, where $n$ is the number of data points.
Identify the grade at that position in your ordered list.
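As a rough sketch of these steps, the snippet below orders a small set of grades and picks the middle one. The seven grades are hypothetical (not the exam data), and `ordinal_median` and `GRADE_ORDER` are names introduced here for illustration.

```python
# Hypothetical data: 7 made-up letter grades, not the grades from the exam.
grades = ["B", "D", "A", "C", "C", "B", "A"]
GRADE_ORDER = ["D", "C", "B", "A"]  # lowest to highest


def ordinal_median(data, order):
    """Median category of an odd-sized ordinal dataset."""
    n = len(data)
    if n % 2 == 0:
        raise ValueError("this sketch only handles an odd number of data points")
    rank = {category: i for i, category in enumerate(order)}
    ordered = sorted(data, key=lambda g: rank[g])  # sort by grade rank
    position = (n + 1) // 2       # 1-based position of the median
    return ordered[position - 1]  # Python lists are 0-based


median_grade = ordinal_median(grades, GRADE_ORDER)
print(median_grade)
```

Sorting by rank rather than alphabetically matters here: the order of the categories comes from their meaning, not from the letters themselves.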
Q2(a). Is this an observational study or a designed experiment?
Background
Topic: Types of Studies
This question tests your ability to distinguish between observational studies and designed experiments.
Key Terms:
Observational Study: Researchers observe outcomes without assigning treatments.
Designed Experiment: Researchers assign treatments to subjects and observe the effects.
Step-by-Step Guidance
Consider whether the researchers assigned any treatments or simply recorded data as it occurred.
Decide which type of study this is based on the description.
Q2(b). What are the explanatory and response variables?
Background
Topic: Variables in Studies
This question tests your ability to identify explanatory (independent) and response (dependent) variables in a study.
Key Terms:
Explanatory Variable: The variable that is manipulated or categorized to see its effect.
Response Variable: The outcome measured in the study.
Step-by-Step Guidance
Review the variables collected: cases of Lyme disease and drowning deaths.
Determine which variable is being used to explain or predict the other.
Q2(c). Explain in two sentences why month is a confounding variable.
Background
Topic: Confounding Variables
This question tests your understanding of confounding variables and their impact on studies.
Key Terms:
Confounding Variable: A variable that influences both the explanatory and response variables, potentially distorting the observed relationship.
Step-by-Step Guidance
Think about how the month of the year could affect both Lyme disease cases and drowning deaths.
Formulate two sentences explaining how month could be related to both variables and why this is a problem for interpreting the results.
Q2(d). Calculate the population median number of cases of Lyme disease.
Background
Topic: Median Calculation
This question tests your ability to find the median in a quantitative dataset.
Key Steps:
Median: The middle value when the data is ordered from smallest to largest.
For an even number of data points, the median is the average of the two middle values.
Step-by-Step Guidance
List all the Lyme disease case counts in order from smallest to largest.
Count the total number of months (should be 12).
Find the two middle positions: $n/2$ and $n/2 + 1$ (the 6th and 7th values when $n = 12$).
Identify the values at those positions and calculate their average.
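The even-sized case can be sketched as follows. The twelve monthly counts are made up for illustration and are not the Lyme disease data from the exam.

```python
import statistics

# Hypothetical monthly counts (12 made-up values), not the Lyme disease data.
cases = [4, 1, 7, 3, 9, 12, 15, 11, 8, 6, 2, 6]

ordered = sorted(cases)
n = len(ordered)
lower_middle = ordered[n // 2 - 1]  # value at 1-based position n/2
upper_middle = ordered[n // 2]      # value at 1-based position n/2 + 1
median = (lower_middle + upper_middle) / 2

print(median)
print(statistics.median(cases))  # the standard library gives the same value
```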
Q2(e). Calculate the population mean number of cases of Lyme disease.
Background
Topic: Mean Calculation
This question tests your ability to calculate the mean (average) of a dataset.
Key Formula:
$\mu = \dfrac{\sum x_i}{N}$
$x_i$ = value for each month
$N$ = number of months
Step-by-Step Guidance
Add up all the Lyme disease case counts for each month to get $\sum x_i$.
Count the total number of months ($N$).
Divide the total sum by the number of months to get the mean.
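A minimal sketch of the same calculation, again using made-up monthly counts rather than the exam data:

```python
# Hypothetical monthly counts (12 made-up values), not the Lyme disease data.
cases = [4, 1, 7, 3, 9, 12, 15, 11, 8, 6, 2, 6]

total = sum(cases)         # sum of all x_i
num_months = len(cases)    # N
mean = total / num_months  # mu = (sum of x_i) / N

print(total, num_months, mean)
```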
Q2(f). Calculate the population standard deviation of the number of cases of Lyme disease.
Background
Topic: Population Standard Deviation
This question tests your ability to calculate the standard deviation for a population dataset.
Key Formula:
$\sigma = \sqrt{\dfrac{\sum (x_i - \mu)^2}{N}}$
$x_i$ = value for each month
$\mu$ = population mean
$N$ = number of months
Step-by-Step Guidance
Calculate the mean ($\mu$) of the Lyme disease cases (from part e).
For each month, subtract the mean from the case count to get the deviation: $x_i - \mu$.
Square each deviation to get $(x_i - \mu)^2$.
Add up all the squared deviations to get $\sum (x_i - \mu)^2$.
Divide this sum by $N$ (the number of months), then take the square root to find the standard deviation.
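The steps above can be followed one by one in code. This sketch uses hypothetical monthly counts (not the exam data) and compares the result with `statistics.pstdev`, the standard library's population standard deviation.

```python
import math
import statistics

# Hypothetical monthly counts (12 made-up values), not the Lyme disease data.
cases = [4, 1, 7, 3, 9, 12, 15, 11, 8, 6, 2, 6]

N = len(cases)
mu = sum(cases) / N                           # population mean
deviations = [x - mu for x in cases]          # x_i - mu
squared = [d ** 2 for d in deviations]        # (x_i - mu)^2
sum_of_squares = sum(squared)                 # sum of squared deviations
sigma = math.sqrt(sum_of_squares / N)         # divide by N, not N - 1

print(sum_of_squares, sigma)
print(statistics.pstdev(cases))  # should agree with sigma
```

Dividing by $N$ is what makes this a population standard deviation; a sample standard deviation would divide by $N - 1$ instead.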
Q2(g). Given $\bar{y}$, $s_y$, and $r$, find the equation of the least squares regression line.
Background
Topic: Least Squares Regression Line
This question tests your ability to construct the equation of the regression line using summary statistics.
Key Formula:
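The regression line is $\hat{y} = b_0 + b_1 x$, with slope $b_1 = r \dfrac{s_y}{s_x}$ and intercept $b_0 = \bar{y} - b_1 \bar{x}$.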
Step-by-Step Guidance
Identify the given values: $\bar{y}$ (mean of y), $s_y$ (standard deviation of y), $r$ (correlation coefficient).
Find $\bar{x}$ (mean of x) and $s_x$ (standard deviation of x) from the data or table (if provided).
Calculate the slope using $b_1 = r \dfrac{s_y}{s_x}$.
Calculate the intercept using $b_0 = \bar{y} - b_1 \bar{x}$.
Write the regression equation $\hat{y} = b_0 + b_1 x$.
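The slope and intercept steps can be sketched as a small function. The summary statistics passed in below are invented for illustration (they are not the values given in the exam), and `regression_line` is a hypothetical helper name.

```python
def regression_line(x_bar, s_x, y_bar, s_y, r):
    """Return (intercept, slope) of the least squares line from summary statistics."""
    slope = r * s_y / s_x               # b1 = r * s_y / s_x
    intercept = y_bar - slope * x_bar   # b0 = y_bar - b1 * x_bar
    return intercept, slope


# Hypothetical summary statistics, chosen only to make the arithmetic easy.
b0, b1 = regression_line(x_bar=7.0, s_x=4.0, y_bar=10.0, s_y=2.0, r=0.5)
print(f"y_hat = {b0} + {b1} * x")

# The least squares line always passes through the point (x_bar, y_bar).
print(b0 + b1 * 7.0)
```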
Q2(h). Why might it be acceptable to use the regression line from the previous part to estimate the expected number of drownings in December, but not July?
Background
Topic: Extrapolation and Interpolation in Regression
This question tests your understanding of when it is appropriate to use a regression line for prediction.
Key Terms:
Interpolation: Predicting within the range of observed data.
Extrapolation: Predicting outside the range of observed data, which can be unreliable.
Step-by-Step Guidance
Consider the range of Lyme disease cases observed in the original data for July and December.
Think about whether 40 cases (July) and 2 cases (December) fall within or outside the original data range.
Explain why using the regression line for values within the observed range (interpolation) is more reliable than for values outside the range (extrapolation).
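One way to make the range check concrete is a small helper that tests whether a new x-value lies inside the observed x-values. The observed counts below are hypothetical (not the data used to fit the exam's regression line), and `is_interpolation` is a name introduced here.

```python
def is_interpolation(x_new, x_observed):
    """True if x_new lies within the range of the observed x-values."""
    return min(x_observed) <= x_new <= max(x_observed)


# Hypothetical x-values (Lyme disease case counts) used to fit a line.
observed_cases = [4, 1, 7, 3, 9, 12, 15, 11, 8, 6, 2, 6]

for x_new in (2, 40):
    kind = "interpolation" if is_interpolation(x_new, observed_cases) else "extrapolation"
    print(f"x = {x_new}: {kind}")
```

With these made-up values, 2 falls inside the observed range and 40 falls outside it, so only the first prediction would count as interpolation.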