Statistics Test #1 Review: Step-by-Step Study Guidance

Q1. What are some benefits of representing data sets using frequency distributions? What are some benefits of using graphs of frequency distributions?

Background

Topic: Frequency Distributions and Data Visualization

This question tests your understanding of why we organize data into frequency distributions and use graphical representations.

Key Terms:

Frequency Distribution: A table that displays the frequency of various outcomes in a sample.
Graph of Frequency Distribution: A visual representation (like a histogram or bar graph) showing how often each value or range of values occurs.

Step-by-Step Guidance

Consider how a frequency distribution organizes raw data into a more readable format, making it easier to spot patterns or trends.
Think about how graphs can make it even easier to see these patterns visually, such as identifying the most common values or the shape of the data distribution.
Reflect on how both methods help in summarizing large data sets and making comparisons between groups or categories.

Try explaining these benefits in your own words before checking the answer!

Q2. What is the difference between class limits and class boundaries?

Background

Topic: Frequency Distribution Construction

This question tests your understanding of how classes are defined in frequency distributions, especially for grouped data.

Key Terms:

Class Limits: The smallest and largest data values that can belong to a class.
Class Boundaries: The values that separate classes without leaving gaps, often adjusted by 0.5 for integer data.

Step-by-Step Guidance

Identify the lower and upper class limits for a given class (e.g., 10-14: lower limit is 10, upper limit is 14).
Determine the class boundaries by adjusting the limits to avoid gaps (for integer data, subtract 0.5 from the lower limit and add 0.5 to the upper limit).
Compare how class limits and class boundaries are used in constructing histograms and other graphs.
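The adjustment described in the steps above can be sketched in Python. This is a minimal illustration, assuming whole-number data so that the gap between one class's upper limit and the next class's lower limit is 1; the function name `class_boundaries` and the `gap` parameter are hypothetical.

```python
def class_boundaries(lower_limit, upper_limit, gap=1):
    """Return (lower_boundary, upper_boundary) for one class.

    `gap` is the distance between the upper limit of one class and the
    lower limit of the next (assumed to be 1 for whole-number data).
    """
    half_gap = gap / 2
    return lower_limit - half_gap, upper_limit + half_gap


# Class from the example above: limits 10-14
lower, upper = class_boundaries(10, 14)
print(lower, upper)
```

With boundaries, the upper boundary of one class equals the lower boundary of the next, which is why histogram bars touch.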

Try defining both terms and their differences before checking the answer!

Q3. What is the difference between a frequency polygon and an ogive?

Background

Topic: Graphical Representation of Data

This question tests your knowledge of different types of graphs used to display frequency data.

Key Terms:

Frequency Polygon: A line graph that shows the frequencies of classes using midpoints.
Ogive: A line graph that displays cumulative frequencies.

Step-by-Step Guidance

Recall how a frequency polygon is constructed by plotting class midpoints against class frequencies and connecting the points with straight lines.
Remember that an ogive plots cumulative frequencies against the upper class boundaries, showing how totals accumulate across classes.
Think about what each graph is best used for: frequency polygons for comparing distributions, ogives for determining medians and percentiles.
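To make the contrast concrete, here is a small Python sketch that builds the points each graph would plot. The three classes and their frequencies are hypothetical, and the 0.5 boundary adjustment assumes whole-number data.

```python
from itertools import accumulate

# Hypothetical grouped data: (lower limit, upper limit, frequency)
classes = [(10, 14, 3), (15, 19, 7), (20, 24, 5)]

# Frequency polygon: plot (class midpoint, class frequency)
polygon_points = [((lo + hi) / 2, f) for lo, hi, f in classes]

# Ogive: plot (upper class boundary, cumulative frequency)
cumulative = list(accumulate(f for _, _, f in classes))
ogive_points = [(hi + 0.5, cf) for (_, hi, _), cf in zip(classes, cumulative)]

print(polygon_points)
print(ogive_points)
```

The polygon's heights rise and fall with the class frequencies, while the ogive's heights are running totals that end at the sample size.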

Try describing the differences in your own words before checking the answer!

Q4. Determine whether the statement is true or false. If it is false, rewrite it as a true statement. "When each data class has the same frequency, the distribution is symmetric."

Background

Topic: Symmetry in Distributions

This question tests your understanding of what it means for a distribution to be symmetric.

Key Terms:

Symmetric Distribution: A distribution where the left and right sides are mirror images.
Frequency: The number of data points in each class.

Step-by-Step Guidance

Consider what it means for all classes to have the same frequency (a uniform distribution).
Think about whether a uniform distribution is always symmetric, and if the statement as written is accurate.
If the statement is false, try to rewrite it so it is true.

Try determining the truth of the statement before checking the answer!

Q5. Construct the described data set. The entries in the data set cannot all be the same. The median and the mode are the same. Data set should have at least 5 data points.

Background

Topic: Measures of Central Tendency

This question tests your ability to construct a data set with specific properties for the median and mode.

Key Terms:

Median: The middle value when data are ordered.
Mode: The value that appears most frequently.

Step-by-Step Guidance

Start by choosing a value to be the mode (it must appear more often than any other value).
Arrange at least 5 data points so that the mode is also the median (the middle value in the ordered list).
Ensure not all values are the same, and check that the median and mode match.
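One way to check a candidate data set against these conditions is a short Python helper. This is only a sketch: the function name `median_equals_mode` is hypothetical, and it assumes the data set should have a single mode.

```python
from statistics import median, multimode


def median_equals_mode(data):
    """True if there are at least 5 entries, they are not all equal,
    there is exactly one mode, and that mode equals the median."""
    if len(data) < 5 or len(set(data)) == 1:
        return False
    modes = multimode(data)
    return len(modes) == 1 and modes[0] == median(data)


# A candidate that fails: every value appears once, so no single mode exists.
print(median_equals_mode([1, 2, 3, 4, 5]))
```

Try passing your own data set to the function once you have constructed one.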

Try constructing such a data set before checking the answer!

Q6. What is the difference between relative frequency and cumulative frequency?

Background

Topic: Frequency Distributions

This question tests your understanding of how to summarize data using different types of frequencies.

Key Terms:

Relative Frequency: The proportion or percentage of data values in a class.
Cumulative Frequency: The sum of frequencies for a class and all previous classes.

Step-by-Step Guidance

Recall how to calculate relative frequency: $\text{relative frequency} = \frac{\text{class frequency}}{\text{sample size}} = \frac{f}{n}$
Recall how to calculate cumulative frequency: add the frequency of the current class to the sum of all previous class frequencies.
Think about what each measure tells you about the data set.
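Both calculations can be sketched side by side in Python. The class frequencies below are hypothetical; relative frequency divides each class frequency by the sample size, and cumulative frequency keeps a running total.

```python
from itertools import accumulate

# Hypothetical class frequencies, listed in class order
frequencies = [4, 9, 5, 2]
n = sum(frequencies)  # sample size

relative = [f / n for f in frequencies]      # proportion in each class
cumulative = list(accumulate(frequencies))   # running total through each class

for f, rel, cum in zip(frequencies, relative, cumulative):
    print(f"{f:>3}  {rel:.2f}  {cum:>3}")
```

Note that the last cumulative frequency always equals the sample size.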

Try explaining the difference in your own words before checking the answer!

Q7. The number of credits being taken by a sample of 13 full-time college students are listed below. Find the mean, median, and mode of the data, if possible. If any measure cannot be found or does not represent the center of the data, explain why.

Background

Topic: Measures of Central Tendency

This question tests your ability to compute and interpret the mean, median, and mode for a data set.

Key Terms and Formulas:

Mean: $\bar{x} = \frac{\sum x}{n}$
Median: The middle value when data are ordered.
Mode: The value that appears most frequently.

Step-by-Step Guidance

Order the data from smallest to largest.
Calculate the mean by summing all values and dividing by the number of data points.
Find the median by locating the middle value (since there are 13 data points, the 7th value in the ordered list).
Identify the mode by finding the value(s) that appear most frequently.
Consider whether each measure represents the center of the data and explain why or why not.
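The steps above can be sketched with Python's standard `statistics` module. The credit values below are hypothetical stand-ins, since the actual data for this question are not reproduced here.

```python
from statistics import mean, median, multimode

# Hypothetical credit loads for 13 students (not the data from the question)
credits = [12, 12, 12, 13, 14, 14, 15, 15, 15, 15, 16, 17, 18]

ordered = sorted(credits)
print("mean:  ", round(mean(ordered), 2))
print("median:", median(ordered))      # the 7th of 13 ordered values
print("mode:  ", multimode(ordered))   # a list, in case several values tie
```

`multimode` returns every value tied for the highest count, which makes it easy to see when a data set has more than one mode or no repeated values at all.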

Try calculating each measure before checking the answer!

Q8. What must be known about a data set before the Empirical Rule can be used?

Background

Topic: Empirical Rule (68-95-99.7 Rule)

This question tests your understanding of the conditions required to apply the Empirical Rule to a data set.

Key Terms:

Empirical Rule: Describes the spread of data in a normal (bell-shaped) distribution.
Normal Distribution: A symmetric, bell-shaped distribution.

Step-by-Step Guidance

Recall that the Empirical Rule applies only to distributions that are approximately normal (symmetric and bell-shaped).
Think about how to check if a data set meets these criteria (e.g., by looking at a histogram or using statistical tests).
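As a rough numerical companion to a histogram, you could compare the share of values within 1, 2, and 3 standard deviations of the mean against 68%, 95%, and 99.7%. The sketch below is only an informal check, not a formal normality test; the sample and the function name `share_within` are hypothetical.

```python
from statistics import mean, stdev


def share_within(data, k):
    """Fraction of values within k sample standard deviations of the mean."""
    m, s = mean(data), stdev(data)
    return sum(m - k * s <= x <= m + k * s for x in data) / len(data)


# Hypothetical, roughly bell-shaped sample
sample = [61, 64, 65, 66, 67, 67, 68, 68, 68, 69, 69, 70, 71, 72, 75]
for k in (1, 2, 3):
    print(k, round(share_within(sample, k), 2))
```

With a sample this small the shares will only loosely match the rule, so treat large departures, not small ones, as a warning sign.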

Try stating the necessary condition before checking the answer!

Q9. The length of a guest lecturer's talk represents the third quartile for talks in a guest lecture series. Make an observation about the length of the talk.

Background

Topic: Quartiles and Data Interpretation

This question tests your understanding of what it means for a value to be at the third quartile (Q3).

Key Terms:

Third Quartile (Q3): The value below which 75% of the data fall.

Step-by-Step Guidance

Recall the definition of Q3 and what it represents in a data set.
Think about how the length of the talk compares to the rest of the talks in the series.

Try making an observation before checking the answer!

Q10. Explain how the interquartile range of a data set can be used to identify outliers.

Background

Topic: Outlier Detection Using IQR

This question tests your understanding of how to use the interquartile range (IQR) to find outliers.

Key Terms and Formulas:

Interquartile Range (IQR): $\text{IQR} = Q_3 - Q_1$
Outlier: A data value that is much higher or lower than most other values.

Step-by-Step Guidance

Recall the formula for IQR: $\text{IQR} = Q_3 - Q_1$
Remember the rule for identifying outliers: any value greater than $Q_3 + 1.5(\text{IQR})$ or less than $Q_1 - 1.5(\text{IQR})$ is considered an outlier.
Think about how to apply this rule to a data set to check for outliers.
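The usual 1.5 × IQR rule can be written as a short Python sketch: values below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR) are flagged. The function names, quartiles, and data below are hypothetical.

```python
def outlier_fences(q1, q3):
    """Return (lower_fence, upper_fence) using the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(data, q1, q3):
    """Return the values that fall strictly outside the fences."""
    low, high = outlier_fences(q1, q3)
    return [x for x in data if x < low or x > high]


# Hypothetical quartiles and data
print(outlier_fences(20, 30))
print(find_outliers([5, 18, 22, 25, 31, 50], 20, 30))
```

A value that sits exactly on a fence is not flagged, because the rule uses strict inequalities.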

Try explaining the process before checking the answer!

Q11. On a box-and-whisker plot, one quarter of a data set lies on the left whisker. True or False? If false, explain.

Background

Topic: Box-and-Whisker Plots

This question tests your understanding of how data are distributed in a boxplot.

Key Terms:

Box-and-Whisker Plot: A graphical summary of a data set showing quartiles and extremes.
Whiskers: The lines extending from the box to the minimum and maximum values (excluding outliers).

Step-by-Step Guidance

Recall how the data are divided into quartiles in a boxplot.
Think about what portion of the data each whisker represents.
Decide if the statement is true or false, and if false, explain why.

Try reasoning through the statement before checking the answer!

Q12. Use the accompanying data set to complete the following actions. a. Find the quartiles. b. Find the interquartile range. c. Identify any outliers. Data: 64, 60, 57, 57, 57, 64, 61, 62, 58, 54, 60, 61, 57, 63, 78

Background

Topic: Quartiles, IQR, and Outlier Detection

This question tests your ability to compute quartiles, the interquartile range, and identify outliers in a data set.

Key Terms and Formulas:

Quartiles: $Q_1$, $Q_2$ (median), $Q_3$
Interquartile Range (IQR): $\text{IQR} = Q_3 - Q_1$
Outlier Rule: Outliers are values less than $Q_1 - 1.5(\text{IQR})$ or greater than $Q_3 + 1.5(\text{IQR})$

Step-by-Step Guidance

Order the data from smallest to largest.
Find $Q_1$, $Q_2$, and $Q_3$ using the appropriate methods for quartiles.
Calculate the IQR: $\text{IQR} = Q_3 - Q_1$
Apply the outlier rule to check for any values outside the acceptable range.
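If you want to check your work afterward, the whole procedure can be sketched in Python using the data from the question. The sketch assumes the median-of-halves convention for quartiles (the median is left out of both halves when the count is odd); other conventions can give slightly different quartiles.

```python
from statistics import median

data = [64, 60, 57, 57, 57, 64, 61, 62, 58, 54, 60, 61, 57, 63, 78]

ordered = sorted(data)
half = len(ordered) // 2

q2 = median(ordered)
# Median-of-halves: with an odd count, skip the median itself.
lower_half = ordered[:half]
upper_half = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
q1, q3 = median(lower_half), median(upper_half)

iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in ordered if x < low_fence or x > high_fence]

print(q1, q2, q3, iqr, outliers)
```

Work the problem by hand first, then run the sketch to compare.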

Try working through each part before checking the answer!

Q13. Why should the number of classes in a frequency distribution be between 5 and 20?

Background

Topic: Frequency Distribution Construction

This question tests your understanding of how the number of classes affects the usefulness of a frequency distribution.

Key Terms:

Class: A grouping of data values in a frequency distribution.

Step-by-Step Guidance

Consider what happens if there are too few classes (data are oversimplified).
Consider what happens if there are too many classes (data are overcomplicated).
Think about why 5 to 20 is a recommended range for most data sets.

Try explaining the reasoning before checking the answer!

Q14. What is the difference between class limits and class boundaries?

Background

Topic: Frequency Distribution Construction

This is a repeat of Q2, so review your understanding of class limits and boundaries.

Key Terms:

Class Limits and Class Boundaries (see Q2 for definitions).

Step-by-Step Guidance

Review the definitions and think about how they are used in constructing frequency tables and histograms.

Try defining both terms before checking the answer!

Q15. After constructing an expanded frequency distribution, what should the sum of the relative frequencies be?

Background

Topic: Relative Frequency

This question tests your understanding of how relative frequencies relate to the total data set.

Key Terms and Formulas:

Relative Frequency: $\text{relative frequency} = \frac{\text{class frequency}}{\text{sample size}} = \frac{f}{n}$

Step-by-Step Guidance

Recall that relative frequencies are proportions or percentages of the total.
Think about what the sum of all relative frequencies should be for a complete data set.

Try stating the sum before checking the answer!

Q16. Determine whether the statement is true or false. If it is false, rewrite it as a true statement. "In a frequency distribution, the class width is the distance between the lower and upper limits of a class."

Background

Topic: Class Width in Frequency Distributions

This question tests your understanding of how class width is defined.

Key Terms and Formulas:

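Class Width: $\text{class width} = \text{lower limit of next class} - \text{lower limit of current class}$ (equivalently, the difference between consecutive upper limits)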

Step-by-Step Guidance

Recall the correct formula for class width.
Compare the statement to the correct definition and decide if it is true or false.
If false, rewrite the statement to make it true.
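A quick Python sketch can help you compare the two quantities. It assumes the common textbook definition of class width as the distance between the lower limits of consecutive classes; the function name `class_width` and the classes 10-14, 15-19, 20-24 are hypothetical.

```python
def class_width(lower_limits):
    """Class width from the lower limits of consecutive classes."""
    widths = {b - a for a, b in zip(lower_limits, lower_limits[1:])}
    if len(widths) != 1:
        raise ValueError("classes do not share a common width")
    return widths.pop()


# Hypothetical classes 10-14, 15-19, 20-24
print(class_width([10, 15, 20]))  # distance between consecutive lower limits
print(14 - 10)                    # upper limit minus lower limit of one class
```

Comparing the two printed values shows whether the statement in the question matches the definition.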

Try determining the truth of the statement before checking the answer!

Q17. Determine whether the statement is true or false. If it is false, rewrite it as a true statement. "A graph of the cumulative frequencies can decrease from left to right."

Background

Topic: Cumulative Frequency Graphs (Ogives)

This question tests your understanding of how cumulative frequencies behave.

Key Terms:

Cumulative Frequency: The running total of frequencies up to a certain class.

Step-by-Step Guidance

Recall how cumulative frequencies are calculated and plotted.
Think about whether it is possible for the cumulative frequency to decrease as you move to higher classes.
If the statement is false, rewrite it to be true.

Try reasoning through the statement before checking the answer!

Q18. The mean is the measure of central tendency most likely to be affected by an outlier. True or False?

Background

Topic: Measures of Central Tendency and Outliers

This question tests your understanding of how outliers affect different measures of center.

Key Terms:

Mean: The arithmetic average.
Outlier: An extreme value in the data set.

Step-by-Step Guidance

Recall how the mean is calculated and how a very large or small value can affect it.
Compare this to how the median and mode are affected by outliers.
Decide if the statement is true or false.

Try reasoning through the statement before checking the answer!

Q19. When each data class has the same frequency, the distribution is symmetric. True or False?

Background

Topic: Symmetry in Distributions

This is a repeat of Q4, so review your understanding of symmetry and uniform distributions.

Key Terms:

Symmetric Distribution and Uniform Distribution (see Q4 for definitions).

Step-by-Step Guidance

Review the definitions and think about whether the statement is always true.

Try reasoning through the statement before checking the answer!

Q20. Construct the described data set. The entries in the data set cannot all be the same. The mean and median are the same and the data set is bimodal.

Background

Topic: Measures of Central Tendency and Bimodal Data

This question tests your ability to construct a data set with specific properties for the mean, median, and mode.

Key Terms:

Mean: The arithmetic average.
Median: The middle value.
Bimodal: A data set with two modes (values that appear most frequently).

Step-by-Step Guidance

Choose two values to be the modes (each must appear the same number of times, and more often than any other value).
Arrange the data so that the mean and median are equal (one easy way is to make the data symmetric around the center).
Ensure the entries in the data set are not all the same value.
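To test a candidate data set, a small Python checker along these lines may help. It is a sketch under simple assumptions: the function name is hypothetical, and the mean and median are compared exactly, which works best with whole-number data.

```python
from statistics import mean, median, multimode


def is_bimodal_with_equal_mean_median(data):
    """True if entries are not all equal, there are exactly two modes,
    and the mean equals the median."""
    return (
        len(set(data)) > 1
        and len(multimode(data)) == 2
        and mean(data) == median(data)
    )


# A candidate that fails: 2 is the only mode.
print(is_bimodal_with_equal_mean_median([1, 2, 2, 3]))
```

Pass in your own data set after you have built one by hand.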

Try constructing such a data set before checking the answer!

Q21. The number of credits being taken by a sample of 13 full-time college students are listed below. Find the mean, median, and mode of the data, if possible. If any measure cannot be found or does not represent the center of the data, explain why.

Background

Topic: Measures of Central Tendency

This is a repeat of Q7, so review your process for finding mean, median, and mode.

Key Terms and Formulas:

Mean, Median, Mode (see Q7 for definitions and formulas).

Step-by-Step Guidance

Order the data and calculate each measure as in Q7.

Try calculating each measure before checking the answer!

Q22. A student receives the following grades, with an A worth 4 points, a B worth 3 points, a C worth 2 points, and a D worth 1 point. What is the student's weighted mean grade point score? B in 2 two-credit classes D in 1 four-credit class A in 1 three-credit class C in 1 three-credit class

Background

Topic: Weighted Mean

This question tests your ability to calculate a weighted mean, where each value has a different weight (credit hours).

Key Terms and Formulas:

Weighted Mean: $\bar{x} = \frac{\sum (x \cdot w)}{\sum w}$
Where $x$ is the grade point and $w$ is the number of credits for each class.

Step-by-Step Guidance

List each grade and its corresponding credit hours.
Multiply each grade point by its credit hours to get the weighted value.
Sum all the weighted values and all the credit hours.
Divide the total weighted value by the total credit hours to get the weighted mean.
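The setup can be sketched in Python using the grades and credits from the question. The tuple layout (grade points, credits per class, number of classes) is just one convenient way to organize the data.

```python
# (grade points, credits per class, number of classes), from the question
grades = [
    (3, 2, 2),  # B in 2 two-credit classes
    (1, 4, 1),  # D in 1 four-credit class
    (4, 3, 1),  # A in 1 three-credit class
    (2, 3, 1),  # C in 1 three-credit class
]

# Numerator: sum of grade points times credits; denominator: total credits
weighted_sum = sum(points * credits * count for points, credits, count in grades)
total_credits = sum(credits * count for _, credits, count in grades)

weighted_mean = weighted_sum / total_credits
print(round(weighted_mean, 2))
```

Notice that the two B classes contribute their credits twice, once for each class.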

Try setting up the calculation before checking the answer!

Q23. The gas mileages (in miles per gallon) for 31 cars are shown in the frequency distribution. Approximate the mean of the frequency distribution. 26-29: 10 30-33: 13 34-37: 2 38-41: 6

Background

Topic: Mean of a Frequency Distribution

This question tests your ability to estimate the mean from grouped data using class midpoints and frequencies.

Key Terms and Formulas:

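Class Midpoint: $\text{midpoint} = \frac{\text{lower class limit} + \text{upper class limit}}{2}$
Mean of Frequency Distribution: $\bar{x} = \frac{\sum (x \cdot f)}{n}$
Where $f$ is the frequency, $x$ is the class midpoint, and $n = \sum f$ is the total number of data points.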

Step-by-Step Guidance

Find the midpoint for each class interval.
Multiply each midpoint by its class frequency.
Sum all the products and divide by the total number of data points (sum of frequencies).
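These steps can be sketched in Python with the classes and frequencies from the question. Because each class is represented by its midpoint, the result is an approximation of the mean, not the exact value.

```python
# (lower limit, upper limit, frequency), from the question
classes = [(26, 29, 10), (30, 33, 13), (34, 37, 2), (38, 41, 6)]

n = sum(f for _, _, f in classes)                        # total number of cars
total = sum((lo + hi) / 2 * f for lo, hi, f in classes)  # sum of midpoint * frequency

approx_mean = total / n
print(round(approx_mean, 1))
```

The sum of the frequencies should match the 31 cars stated in the question, which is a useful sanity check before dividing.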

Try setting up the calculation before checking the answer!

Q24. What must be known about a data set before the Empirical Rule can be used?

Background

Topic: Empirical Rule

This is a repeat of Q8, so review your understanding of the conditions for using the Empirical Rule.

Key Terms:

Empirical Rule (see Q8 for definition).

Step-by-Step Guidance

Recall the necessary condition for applying the Empirical Rule.

Try stating the condition before checking the answer!

Q25. You are applying for a job at two companies. Company A offers starting salaries with \sigma = \mu = $25,000 and $\sigma = $3,000. From which company are you more likely to get an offer of $27,000 or more? Explain.

Background

Topic: Standard Deviation and Probability

This question tests your understanding of how standard deviation affects the likelihood of extreme values in a normal distribution.

Key Terms and Formulas:

Mean ($\mu$): The average value.
Standard Deviation ($\sigma$): A measure of spread.
Z-score: $z = \frac{x - \mu}{\sigma}$

Step-by-Step Guidance

Calculate the z-score of a \$27,000 offer at each company: $z = \frac{27{,}000 - 25{,}000}{\sigma}$.
Compare the z-scores to see which company makes \$27,000 less unusual (closer to the mean).
Interpret which company is more likely to offer \$27,000 or more based on the z-scores.
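The effect of the standard deviation on the z-score can be sketched in Python. The mean of 25,000 and the standard deviation of 3,000 come from the question; the other two standard deviations are hypothetical values added only to show the trend, and the function name `z_score` is illustrative.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma


# sigma = 3_000 appears in the question; 1_000 and 5_000 are hypothetical.
for sigma in (1_000, 3_000, 5_000):
    print(sigma, round(z_score(27_000, 25_000, sigma), 2))
```

For the same offer and the same mean, a larger standard deviation gives a smaller z-score, so the offer is less unusual.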

Try calculating the z-scores before checking the answer!

Q26. A student's grade on the Fundamentals of Engineering exam has a z-score of -0.5. Make an observation about the student's grade.

Background

Topic: Z-scores and Data Interpretation

This question tests your understanding of what a z-score tells you about a data value's position relative to the mean.

Key Terms and Formulas:

Z-score: $z = \frac{x - \mu}{\sigma}$

Step-by-Step Guidance

Recall that a z-score indicates how many standard deviations a value is from the mean.
Interpret what a negative z-score means (the value is below the mean).
Think about how far below the mean a z-score of -0.5 is.