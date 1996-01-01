- 1. Intro to Stats and Collecting Data
- 2. Describing Data with Tables and Graphs
- 3. Describing Data Numerically
- 4. Probability
- 5. Binomial Distribution & Discrete Random Variables
- 6. Normal Distribution and Continuous Random Variables
- 7. Sampling Distributions & Confidence Intervals: Mean
- 8. Sampling Distributions & Confidence Intervals: Proportion
- 9. Hypothesis Testing for One Sample
- 10. Hypothesis Testing for Two Samples
- 11. Correlation
- 12. Regression
- 13. Chi-Square Tests & Goodness of Fit
- 14. ANOVA
Intro to Stats: Videos & Practice Problems
Intro to Stats Practice Problems
Identify the level of measurement of the listed newborn heights: nominal, ordinal, interval, or ratio.
Given that the listed newborn heights are part of a larger collection of newborn height measurements, do the data constitute a sample or a population?
Listed below are measured weights (g) of apples in a supermarket labeled as weighing 150 g each.
49.8, 150.3, 151.2, 148.7, 149.5, 150.1
Are these weights a sample or a population, considering that they come from apples chosen from a much larger population?
For a study, a statistics instructor collects hair color data from her students. What is the name of this type of sample?
The number of students enrolled in a mathematics course at different high schools is listed below:
35, 42, 28, 50, 47, 39, 31, 45
i. Determine the level of measurement of the data: nominal, ordinal, interval, or ratio.
ii. Are the data discrete or continuous?
The following are recorded temperatures (in °F) at noon in different cities on a particular day:
Are the temperature values discrete or continuous data?
The following data represent the battery life (in hours) of randomly selected laptop models from a consumer electronics survey. Use these values to answer the questions below.
Battery Life (hours):
i. Determine the level of measurement of these data (nominal, ordinal, interval, or ratio).
ii. Are the original, unrounded battery life durations classified as continuous or discrete data?
The data below represent the number of reported shark attacks worldwide per year over a sequence of recent consecutive years. Analyze the data to answer the following questions.
Shark Attacks (per year):
i. What significant aspect of the data is not captured by statistical calculations alone?
ii. What tool or method would be useful in identifying this aspect?
iii. What observations can be made from a quick visual inspection of the data?
A study was conducted where the number of hours students spent studying was recorded and paired with their scores on a mathematics test. In this sample of paired data, what do r and ρ (rho) represent?
A researcher conducted an experiment in which he recorded the number of daily exercise minutes for a group of students and matched these values with their scores on a physical fitness test.
Without performing any calculations or research, estimate the value of r (the correlation coefficient).
A researcher conducted a study in which the daily calorie intake of students was recorded and matched with their scores on a physical endurance test. If the calorie intake values are converted from kilocalories to joules, does the correlation coefficient r change? Explain your reasoning.
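One way to explore the unit-conversion question numerically is sketched below. The data values are invented purely for illustration (they are not from the study), and `pearson_r` is a hypothetical helper name. The only outside fact used is the standard conversion 1 kcal = 4184 J.

```python
import math

def pearson_r(xs, ys):
    """Pearson linear correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, made up for illustration only.
kcal = [1800, 2100, 2400, 2000, 2600, 2200]
endurance = [62, 70, 78, 65, 85, 74]

# Rescale the predictor: 1 kcal = 4184 J.
joules = [k * 4184 for k in kcal]

r_kcal = pearson_r(kcal, endurance)
r_joules = pearson_r(joules, endurance)
print(r_kcal, r_joules)  # the two values agree up to rounding error
```

Multiplying every x value by the same positive constant scales both the numerator and the denominator of r by that constant, so the ratio is unchanged.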
A researcher recorded the number of hours students spent practicing a musical instrument and matched these values with their scores on a music proficiency test. If we find that the correlation coefficient r = 0, what does this indicate in this situation?
Listed below are marathon completion times (in minutes) for elite male and female runners in recent international marathons, matched by the same event year.
Find the best predicted completion time for an elite female runner given that an elite male runner completes a marathon in 140 minutes. How does the predicted time compare to the actual female runner’s time of 155 minutes?
The table below shows the number of hours students studied for a final exam and their corresponding exam scores (out of 100). Find the best-predicted exam score for a student who studied for 9 hours.
Listed below are the annual numbers of pastry shops opened (in hundreds) and the number of goals scored in the World Cup final that same year. What is the best-predicted number of goals in a year with 84 pastry shops opened? How close is the predicted value to the actual result of 3 goals?
Find the regression equation, letting the first variable be the predictor (x) variable. Then determine the indicated predicted value. Use the data from 10 hair salon visits listed below. Suppose the appointment time is 75 minutes. How does the predicted tip compare to the actual tip of $12.50?
The table below lists the dinner durations (in minutes) and the corresponding tips (in dollars) received by a restaurant over evenings. Find the best-predicted tip when the dinner duration is minutes.
According to the least-squares principle, the regression line is the one that makes the sum of squared residuals as small as possible. Refer to the dinner duration data in the table below and use the regression equation.
Identify the 10 residuals by comparing the actual tip to the predicted tip for each observation.
According to the least-squares principle, the regression line is the one that makes the sum of squared residuals as small as possible. Using the dinner duration (x) and tip (y) data shown below, along with the regression equation, find the sum of the squares of the residuals.
Recall that the residual for each observation is:
residual = observed y − predicted y = y − ŷ
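The residual and sum-of-squares calculation can be sketched as follows. Because the data table and regression equation are not reproduced here, the durations and tips below are hypothetical placeholders, and the function names are illustrative rather than part of any library.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of (observed y - predicted y)^2 for the given line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, made up for illustration only.
durations = [45, 60, 75, 90, 50, 120, 80, 65, 100, 55]   # minutes
tips = [6.0, 8.5, 10.0, 13.0, 7.5, 16.0, 11.0, 9.0, 14.5, 7.0]  # dollars

b0, b1 = least_squares_line(durations, tips)

# One residual per observation: observed tip minus predicted tip.
residuals = [y - (b0 + b1 * x) for x, y in zip(durations, tips)]
sse = sum(r ** 2 for r in residuals)

# Any other line, here the same slope with a shifted intercept, gives a larger sum.
sse_other = sum_squared_residuals(durations, tips, b0 + 0.5, b1)
print(sse, sse_other)
```

The comparison at the end illustrates the least-squares principle: moving the line away from the fitted one can only increase the sum of squared residuals.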
A study examines the relationship between engine size (in liters) and fuel efficiency (in miles per gallon) for different car models. A small sample of 6 vehicles provides the following data:
A regression model is used to predict fuel efficiency based on engine size, resulting in a standard error of estimate, s_e = 3.2 mpg. In your own words, describe what that value of s_e represents.
For the multiple regression equation developed to predict fuel efficiency (miles per gallon, mpg) of cars based on their engine size (liters), vehicle weight (kg), and horsepower, the linear correlation coefficient (r) was found to be -0.803.
(i) What is the value of the coefficient of determination (R²)?
(ii) What practical information does the coefficient of determination give in this context?
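The arithmetic behind part (i) can be sketched in a couple of lines, assuming the coefficient of determination is obtained by squaring the stated correlation coefficient. The variable names are illustrative.

```python
r = -0.803            # correlation coefficient given in the problem
r_squared = r ** 2    # squaring removes the sign

print(f"R^2 = {r_squared:.3f}")
print(f"Share of variation in mpg explained by the model: {r_squared:.1%}")
```

Read as a proportion, the result is the fraction of the variation in fuel efficiency that the predictors account for; the remainder is left unexplained by the model.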
A multiple regression model was developed to predict the fuel efficiency (miles per gallon, mpg) of cars based on three factors: engine size (liters), vehicle weight (kg), and horsepower. We get this multiple regression equation:
MPG = 45.6 − 3.12 × Engine Size − 0.015 × Weight + 0.032 × Horsepower
where:
MPG is the fuel efficiency (miles per gallon).
Engine Size is the engine displacement (liters).
Weight is the vehicle's weight (kg).
Horsepower is the engine's power output.
Identify the response and predictor variables.
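As a rough illustration of how the equation above is used, the sketch below evaluates it for one car. The function name `predict_mpg` and the example car (2.0 L, 1400 kg, 150 hp) are hypothetical; only the coefficients come from the equation.

```python
def predict_mpg(engine_size_l, weight_kg, horsepower):
    """Evaluate the multiple regression equation for one vehicle.

    MPG is the response; the three arguments are the predictors.
    """
    return 45.6 - 3.12 * engine_size_l - 0.015 * weight_kg + 0.032 * horsepower

# Hypothetical car, chosen only to show the calculation.
example = predict_mpg(engine_size_l=2.0, weight_kg=1400, horsepower=150)
print(f"Predicted fuel efficiency: {example:.2f} mpg")
```

The quantity on the left of the equation is the response variable, and each quantity multiplied by a coefficient on the right is a predictor variable.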
For the multiple regression equation developed to predict fuel efficiency (miles per gallon, mpg) of cars based on engine size (liters), vehicle weight (kg), and horsepower, the coefficient of determination is found to be:
R² = 0.785
Question: What does this value of R² tell us about the regression model?
A study was conducted to analyze the correlation between engine displacement (liters) and vehicle weight (kg) with fuel efficiency (miles per gallon, mpg). The regression model generated the following results:
ANOVA Table:
Model Fit Summary:
Root MSE = 4.29
R² = 0.347
Adjusted R² = 0.332
Should the multiple regression equation be used for predicting fuel efficiency (mpg) based on engine displacement and vehicle weight? Why or why not?
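The relationship between R² and adjusted R² can be sketched with the standard adjustment formula, 1 − (1 − R²)(n − 1)/(n − k − 1). The sample size is not stated in the text above, so `n = 90` below is purely an assumption, picked because it happens to be consistent with the reported values.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 for a model with k predictors fitted to n observations."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

r_squared = 0.347   # reported in the model fit summary
k = 2               # predictors: engine displacement and vehicle weight
n = 90              # ASSUMPTION: the sample size is not given in the text

adj = adjusted_r_squared(r_squared, n, k)
print(f"Adjusted R^2 = {adj:.3f}")
```

The adjustment penalizes extra predictors, so adjusted R² is never larger than R²; both values here indicate that only about a third of the variation in mpg is explained.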
Determine the equation of the regression line using the provided data. Identify a feature of the data that the regression line does not account for.
Refer to the dataset "Student Study Hours vs. Exam Scores" given below:
Determine the regression equation using the pairs of values for all 8 data points.
Refer to the dataset "Employee Experience vs. Salary" given below:
Determine the regression equation using the pairs of values for all 9 data points.
Refer to the dataset "Employee Salary Analysis" given below. The dataset includes employee gender, years of experience, and annual salary. For gender, let 0 represent female and 1 represent male.
i. Using salary as the response variable, determine the multiple regression equation using the variable experience and the dummy variable for gender.
Then, use the equation to predict the salary for an employee with the characteristics given below:
Female employee with 10 years of experience
Male employee with 10 years of experience
ii. Does gender appear to have a significant effect on salary?
The following data set represents the population (in thousands) of a city recorded at five-year intervals from 1980 to 2020. Construct a scatterplot and determine which mathematical model best fits the data. Assume that the model is to be used only for the scope of the given data. Consider only linear, quadratic, logarithmic, exponential, or power models.
A group of researchers collected data on taxi fares (in dollars) and the corresponding tips (in dollars) left by passengers. The data were gathered from multiple taxi rides in a city. If the city implements a policy requiring a fixed tip of 15% of the fare, what would be the value of the linear correlation coefficient for the paired taxi fare/tip amounts?