Skip to main content
Ch. 11 - Goodness-of-Fit and Contingency Tables
Triola - Elementary Statistics 14th Edition
Triola14th EditionElementary StatisticsISBN: 9780137366446Not the one you use?Change textbook
Chapter 11, Problem 11.1.21

Benford’s Law
According to Benford’s law, a variety of different data sets include numbers with leading (first) digits that follow the distribution shown in the table below. In Exercises 21–24, test for goodness-of-fit with the distribution described by Benford’s law.





Detecting Fraud When working for the Brooklyn district attorney, investigator Robert Burton analyzed the leading digits of the amounts from 784 checks issued by seven suspect companies. The frequencies were found to be 0, 15, 0, 76, 479, 183, 8, 23, and 0, and those digits correspond to the leading digits of 1, 2, 3, 4, 5, 6, 7, 8, and 9, respectively. If the observed frequencies are substantially different from the frequencies expected with Benford’s law, the check amounts appear to result from fraud. Use a 0.01 significance level to test for goodness-of-fit with Benford’s law. Does it appear that the checks are the result of fraud?

Verified step by step guidance
1
Step 1: State the null and alternative hypotheses. The null hypothesis (H0) is that the observed frequencies of the leading digits follow Benford's Law distribution. The alternative hypothesis (H1) is that the observed frequencies do not follow Benford's Law distribution.
Step 2: Calculate the expected frequencies for each leading digit based on Benford's Law. Multiply the total number of checks (784) by the percentages given in the table for each leading digit. For example, the expected frequency for the leading digit '1' is 784 × 0.301.
Step 3: Use the chi-square goodness-of-fit test formula: χ² = Σ((Oᵢ - Eᵢ)² / Eᵢ), where Oᵢ represents the observed frequency and Eᵢ represents the expected frequency for each leading digit. Compute this value for all leading digits (1 through 9).
Step 4: Determine the degrees of freedom (df) for the chi-square test. The degrees of freedom are calculated as (number of categories - 1). In this case, there are 9 leading digits, so df = 9 - 1 = 8.
Step 5: Compare the calculated chi-square statistic to the critical value from the chi-square distribution table at a 0.01 significance level and 8 degrees of freedom. If the calculated value exceeds the critical value, reject the null hypothesis and conclude that the checks are likely the result of fraud. Otherwise, fail to reject the null hypothesis.

Verified video answer for a similar problem:

This video solution was recommended by our tutors as helpful for the problem above.
Video duration:
5m
Was this helpful?

Key Concepts

Here are the essential concepts you must grasp in order to answer the question correctly.

Benford's Law

Benford's Law states that in many naturally occurring datasets, the leading digits are not uniformly distributed. Instead, smaller digits appear more frequently as the leading digit. For example, the digit '1' appears as the leading digit about 30.1% of the time, while larger digits like '9' appear only about 4.6% of the time. This phenomenon is used in various fields, including fraud detection, to identify anomalies in data.
Recommended video:
Guided course
05:24
Goodness of Fit Test: Unequal Probabilities

Goodness-of-Fit Test

A goodness-of-fit test is a statistical hypothesis test used to determine how well a set of observed data matches a specific distribution. In this context, it assesses whether the observed frequencies of leading digits from the checks align with the expected frequencies according to Benford's Law. The Chi-square test is commonly used for this purpose, comparing the observed counts to the expected counts to evaluate the fit.
Recommended video:
Guided course
10:17
Goodness of Fit Test

Significance Level

The significance level, often denoted as alpha (α), is the threshold used to determine whether to reject the null hypothesis in hypothesis testing. In this case, a significance level of 0.01 indicates a 1% risk of concluding that a difference exists when there is none. If the p-value from the goodness-of-fit test is less than 0.01, it suggests that the observed data significantly deviates from what Benford's Law would predict, potentially indicating fraudulent activity.
Recommended video:
03:33
Finding Binomial Probabilities Using TI-84 Example 1
Related Practice
Textbook Question

In Exercises 5–20, conduct the hypothesis test and provide the test statistic and the P-value and/or critical value, and state the conclusion.


Heights Measured or Reported? Repeat the preceding exercise using the frequencies in the following table, which summarizes all of the 2784 male heights listed in Data Set 4 “Measured and Reported.” Does the larger data set have much of an effect on the results from Exercise 5?

" style="" width="600">

149
views
Textbook Question

In Exercises 5–20, conduct the hypothesis test and provide the test statistic and the P-value and/or critical value, and state the conclusion.


Bias in Clinical Trials? Researchers investigated the issue of race and equality of access to clinical trials. The following table shows the population distribution and the numbers of participants in clinical trials involving lung cancer (based on data from “Participation in Cancer Clinical Trials,” by Murthy, Krumholz, and Gross, Journal of the American Medical Association, Vol. 291, No. 22). Use a 0.01 significance level to test the claim that the distribution of clinical trial participants fits well with the population distribution. Is there a race/ethnic group that appears to be very underrepresented?


112
views
Textbook Question

Dogs Detecting Malaria The following table lists results from an experiment designed to test the ability of dogs to use their extraordinary sense of smell to detect malaria in samples of children’s socks (based on data presented at an annual meeting of the American Society of Tropical Medicine, by principal investigator Steve Lindsay). Assuming that the dog being correct is independent of whether malaria is present, find the expected value for the observed frequency of 123.


145
views
Textbook Question

Equivalent Tests A x^2 test involving a 2 x 2 table is equivalent to the test for the difference between two proportions, as described in Section 9-1. Using Table 11-1 from the Chapter Problem, verify that the x^2 test statistic and the z test statistic (found from the test of equality of two proportions) are related as follows: z^2 = x^2 Also show that the critical values have that same relationship.

157
views
Textbook Question

Clinical Trial of Echinacea In a clinical trial of the effectiveness of echinacea for preventing colds, the results in the table below were obtained (based on data from “An Evaluation of Echinacea Angustifolia in Experimental Rhinovirus Infections,” by Turner et al., New England Journal of Medicine, Vol. 353, No. 4). Use a 0.05 significance level to test the claim that getting a cold is independent of the treatment group. What do the results suggest about the effectiveness of echinacea as a prevention against colds?

162
views
Textbook Question

In Exercises 5–20, conduct the hypothesis test and provide the test statistic and the P-value and/or critical value, and state the conclusion.


Flat Tire and Missed Class A classic story involves four carpooling students who missed a test and gave as an excuse a flat tire. On the makeup test, the instructor asked the students to identify the particular tire that went flat. If they really didn’t have a flat tire, would they be able to identify the same tire? The author asked 41 other students to identify the tire they would select. The results are listed in the following table (except for one student who selected the spare). Use a 0.05 significance level to test the author’s claim that the results fit a uniform distribution. What does the result suggest about the likelihood of four students identifying the same tire when they really didn’t have a flat?


" style="" width="600">

109
views