Thursday, December 18, 2008

Benford's Law Catches Madoff (error!)

Benford's law, also called the first-digit law, states that in lists of numbers from many real-life sources of data, the leading digit is distributed in a specific, non-uniform way. According to this law, the first digit is 1 almost one third of the time, and larger digits occur as the leading digit with lower and lower frequency, to the point where 9 as a first digit occurs less than one time in twenty. Hal Varian noted this could be used to detect accounting fraud way back in 1972. Here are the proportions of the leading digit according to Benford's law compared to Madoff's numbers (sample size, 171):

Lead digit   Benford   Std. err.   Madoff
1            30.10%    3.43%       40.46%
2            17.60%    2.85%       13.87%
3            12.50%    2.47%        8.67%
4             9.70%    2.21%        7.51%
5             7.90%    2.02%        5.78%
6             6.70%    1.87%        5.78%
7             5.80%    1.75%        4.62%
8             5.10%    1.64%        6.94%
9             4.60%    1.57%        5.20%
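For anyone who wants to recompute the Benford column, here is a minimal sketch. It assumes the standard formula log10(1 + 1/d) for leading digit d and a plain binomial standard error sqrt(p(1 - p)/n). The function names and the choice of n are illustrative and not taken from the original spreadsheet.

```python
import math

def benford_proportion(digit: int) -> float:
    """Expected share of numbers whose leading digit is `digit` (1..9)."""
    return math.log10(1 + 1 / digit)

def binomial_std_err(p: float, n: int) -> float:
    """Standard error of a sample proportion, assuming n independent draws."""
    return math.sqrt(p * (1 - p) / n)

def benford_table(n: int):
    """Rows of (digit, Benford proportion, standard error) for sample size n."""
    rows = []
    for digit in range(1, 10):
        p = benford_proportion(digit)
        rows.append((digit, p, binomial_std_err(p, n)))
    return rows

# n is an assumption here; pass whatever sample size you are working with.
for digit, p, se in benford_table(171):
    print(f"{digit}  {p:7.2%}  {se:6.2%}")
```

The nine proportions sum to one because the logarithms telescope to log10(10).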

Earlier I reported there were errors in digits 2 and 3. Someone noted that Paul Kedrosky did a similar analysis and got different results, so I redid the analysis, and got different results! My bad. I do note that the number of 1's is now more than two standard errors above the Benford proportion. Out of 173 data points (12 years plus 5 months minus two zeros), I have 70 observations with leading 1's. But if the mean return is really 0.96 with minimal volatility, the returns cluster in a narrow range instead of spanning several orders of magnitude, so one could say the series naturally violates Benford's law.
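A rough way to check the "more than two standard errors" statement is a z-score for the share of leading 1's. This is only a sketch: it takes the 70 and 173 quoted above as inputs, assumes independent draws, and uses a hypothetical helper name.

```python
import math

def benford_z_score(count: int, n: int, digit: int = 1) -> float:
    """Distance, in standard errors, between the observed share of a
    leading digit and its Benford proportion (binomial approximation)."""
    p = math.log10(1 + 1 / digit)          # Benford proportion for the digit
    std_err = math.sqrt(p * (1 - p) / n)   # assumes independent observations
    return (count / n - p) / std_err

# 70 leading 1's out of 173 data points, as quoted in the post.
z = benford_z_score(70, 173)
print(f"z = {z:.2f}")
```

Monthly fund returns are unlikely to be independent, so the z-score is a rough guide and not a formal test.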

I wasn't even drinking last night.

9 comments:

jsalvati said...

Why isn't this sort of thing automated? Why don't investment banks and the SEC regularly do this sort of thing for all the data they have?

zarkov01 said...

The SEC could automate the test, but eventually the cheaters would catch on and fake the frequency of their bogus digits. Benford's law can be extended to pairs of digits and higher orders to give even more tests. I once proposed such an approach for detecting scientific cheating and I got resistance even from scientists.

The problem with selling this approach is that it seems too esoteric. The SEC consists mostly of lawyers who are notoriously non-quantitative.
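The extension to pairs of digits mentioned above can be sketched as follows. It assumes the usual generalization in which the first two digits, read as a number n from 10 to 99, occur with probability log10(1 + 1/n). The function name is made up for illustration.

```python
import math

def benford_first_two(n: int) -> float:
    """Expected share of numbers whose first two digits form n (10..99)."""
    if not 10 <= n <= 99:
        raise ValueError("n must be a two-digit number from 10 to 99")
    return math.log10(1 + 1 / n)

# Full two-digit distribution: 90 cells instead of 9.
probs = {n: benford_first_two(n) for n in range(10, 100)}

# Summing the cells 10..19 recovers the first-digit share for 1.
share_leading_one = sum(probs[n] for n in range(10, 20))
print(f"{share_leading_one:.4f}")
```

With 90 cells the expected count per cell is small for a sample of a couple of hundred observations, so the two-digit test needs far more data than the first-digit one.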

Brent Buckner said...

Is that such strong evidence?

Naively, you have a bunch of draws, so the odds aren't terrible that one will be an outlier. The draws are related, so once you have an outlier, the odds of another outlier are higher than with independent draws.
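One common way to handle the "a bunch of draws" concern is to test all nine digits jointly with a chi-square goodness-of-fit statistic instead of looking at each digit separately. The sketch below makes several assumptions: the counts are back-computed from the Madoff percentages in the table using a denominator of 173, the observations are treated as independent (which the comment above questions), and 15.507 is the standard 5% critical value for 8 degrees of freedom.

```python
import math

# 5% critical value of the chi-square distribution with 8 degrees of freedom.
CHI2_CRIT_5PCT_DF8 = 15.507

def benford_chi_square(counts):
    """Chi-square statistic comparing nine leading-digit counts
    (digits 1..9, in order) with the Benford expected counts."""
    if len(counts) != 9:
        raise ValueError("expected nine counts, one per leading digit")
    n = sum(counts)
    stat = 0.0
    for digit, observed in enumerate(counts, start=1):
        expected = n * math.log10(1 + 1 / digit)
        stat += (observed - expected) ** 2 / expected
    return stat

# Assumption: counts recovered by multiplying the table's percentages by 173.
counts = [70, 24, 15, 13, 10, 10, 8, 12, 9]
stat = benford_chi_square(counts)
print(f"chi-square = {stat:.2f}, 5% critical value = {CHI2_CRIT_5PCT_DF8}")
```

A joint statistic below the critical value would mean the nine digits taken together do not reject Benford's law at the 5% level, even if one digit looks extreme on its own.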

J said...

One with time on his hands could write a Benford computer program using log10(1 + 1/n) and run it on other fund results (but it could never catch numbers people like James Simons).
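A program along those lines might start like the sketch below. The helper names are hypothetical, and the return series is left as an input because no fund data is given here. Zero returns are skipped since they have no leading digit.

```python
import math
from collections import Counter
from decimal import Decimal

def leading_digit(x):
    """First nonzero digit of x, or None for zero."""
    if x == 0:
        return None
    # Going through str() avoids binary float artifacts (0.3 stays "0.3").
    return Decimal(str(abs(x))).as_tuple().digits[0]

def leading_digit_shares(values):
    """Observed share of each leading digit 1..9 among nonzero values."""
    digits = [d for d in map(leading_digit, values) if d is not None]
    if not digits:
        raise ValueError("no nonzero values to analyse")
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}

def benford_gaps(values):
    """Observed share minus log10(1 + 1/n) for each leading digit n."""
    shares = leading_digit_shares(values)
    return {n: shares[n] - math.log10(1 + 1 / n) for n in range(1, 10)}
```

Calling `benford_gaps(returns)` on a list of monthly returns gives the per-digit deviation from Benford's law, which can then be compared with the standard errors.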

Anonymous said...

don't think benford's law could be used to prosecute ppl. waiting on taleb to crush it anyway.

Anonymous said...

You really should stretch a bit and master HTML tables for presenting data like this.

Anonymous said...

Where did you get the data from? Paul Kedrosky has an analysis saying that this isn't the case.

http://paul.kedrosky.com/archives/2008/12/19/bernie_vs_benfo.html

He used 196 data points, you used 179.

Anonymous said...

I performed the same analysis and calculated that there was a less than .002 probability that 1 would appear as often as it did in the Sentry data series.

Also Paul Kedrosky claims there are 196 observations in the set. That's not possible as there are 174 months of returns data. Do you know what gives?
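A tail probability of that kind can be sketched with an exact binomial sum. The inputs below (70 leading 1's out of 173) come from the post and not from this commenter's own calculation, which is not shown, and the sum assumes independent observations. It needs Python 3.8 or later for `math.comb`.

```python
import math

def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), summed exactly term by term."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i)
        for i in range(k, n + 1)
    )

p_one = math.log10(2)  # Benford proportion for a leading 1
tail = binomial_upper_tail(70, 173, p_one)
print(f"P(at least 70 leading 1's in 173 draws) = {tail:.4f}")
```

Changing n to 174 or 196 shows how sensitive the answer is to the disputed number of observations.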

Anonymous said...

Supposedly the gang at Enron knew very well about Benford's Law and managed their in-and-out transactions so they would not trigger any Benford filters.