Benford’s Law. According to Benford’s law, a variety of different data sets include numbers with leading (first) digits that follow the distribution shown in the table below. In this exercise, test for goodness-of-fit with Benford’s law.

Benford’s Law: Distribution of Leading Digits
Leading Digit:   1       2       3       4      5      6      7      8      9
Benford’s Law:   30.1%   17.6%   12.5%   9.7%   7.9%   6.7%   5.8%   5.1%   4.6%

Author’s Check Amounts. Exercise 1 lists the observed frequencies of leading digits from amounts on checks from seven suspect companies. Here are the observed frequencies of the leading digits from the amounts on checks written by the author: 68, 40, 18, 19, 8, 20, 6, 9, 12. (Those observed frequencies correspond to the leading digits of 1, 2, 3, 4, 5, 6, 7, 8, and 9, respectively.) Using a 0.05 significance level, test the claim that these leading digits are from a population of leading digits that conform to Benford’s law. Do the author’s check amounts appear to be legitimate?

Exercise 1 Detecting Fraud. When working for the Brooklyn district attorney, investigator Robert Burton analyzed the leading digits of the amounts from 784 checks issued by seven suspect companies. The frequencies were found to be 0, 15, 0, 76, 479, 183, 8, 23, and 0, and those frequencies correspond to the leading digits of 1, 2, 3, 4, 5, 6, 7, 8, and 9, respectively. If the observed frequencies are substantially different from the frequencies expected with Benford’s law, the check amounts appear to result from fraud. Use a 0.01 significance level to test for goodness-of-fit with Benford’s law. Does it appear that the checks are the result of fraud?

Solution 22BSC

Step 1

The observed frequencies of the leading digits from the amounts on checks written by the author are 68, 40, 18, 19, 8, 20, 6, 9, 12, corresponding to the leading digits 1, 2, 3, 4, 5, 6, 7, 8, and 9, respectively. Using a 0.05 significance level, we test the claim that these leading digits are from a population of leading digits that conform to Benford’s law.

The hypotheses can be expressed as

$H_0$: $p_1 = 0.301$, $p_2 = 0.176$, $p_3 = 0.125$, $p_4 = 0.097$, $p_5 = 0.079$, $p_6 = 0.067$, $p_7 = 0.058$, $p_8 = 0.051$, $p_9 = 0.046$
$H_1$: At least one of the proportions is not equal to its claimed value.

The test statistic is

$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$

where
$O_i$ = observed frequency
$E_i$ = expected frequency
$k$ = number of different categories of outcome
$n$ = total number of observed sample values

Here there are $k = 9$ categories and $n = 200$ observed values, so each expected frequency is calculated as $E_i = n p_i$.

Leading digit   Observed O   Expected E = np   O - E   (O - E)^2   (O - E)^2 / E
1               68           60.2               7.8    60.84       1.0106
2               40           35.2               4.8    23.04       0.6545
3               18           25.0              -7.0    49.00       1.9600
4               19           19.4              -0.4     0.16       0.0082
5                8           15.8              -7.8    60.84       3.8506
6               20           13.4               6.6    43.56       3.2507
7                6           11.6              -5.6    31.36       2.7034
8                9           10.2              -1.2     1.44       0.1412
9               12            9.2               2.8     7.84       0.8522

Summing the last column (using unrounded terms) gives

$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = 14.4316$
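The arithmetic in Step 1 can be checked with a short script. This is only a minimal sketch in plain Python: the function name `chi_square_gof` is hypothetical, the Benford proportions are the three-decimal values used in the hypotheses, and the critical value 15.507 (chi-square, 8 degrees of freedom, right-tail area 0.05) is an assumption taken from a standard chi-square table rather than from the solution text.

```python
# Observed leading-digit counts for digits 1..9 (from the exercise).
observed = [68, 40, 18, 19, 8, 20, 6, 9, 12]

# Benford's law proportions for digits 1..9, rounded to three decimals.
benford = [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046]


def chi_square_gof(observed, proportions):
    """Return expected counts, per-category terms, and the chi-square statistic."""
    n = sum(observed)                       # total number of observations
    expected = [n * p for p in proportions]  # E_i = n * p_i
    terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    return expected, terms, sum(terms)


expected, terms, chi2 = chi_square_gof(observed, benford)

for digit, (o, e, t) in enumerate(zip(observed, expected, terms), start=1):
    print(f"digit {digit}: O={o:3d}  E={e:5.1f}  (O-E)^2/E={t:.4f}")
print(f"chi-square = {chi2:.4f}")

# Assumed critical value from a standard chi-square table:
# df = k - 1 = 8, significance level 0.05.
CRITICAL_VALUE = 15.507
print("reject H0" if chi2 > CRITICAL_VALUE else "fail to reject H0")
```

Under these assumptions the script reproduces the test statistic of about 14.4316, which is below the tabulated critical value of 15.507, so the sketch reports "fail to reject H0" at the 0.05 level.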