A WSJ article notes that people are intrigued by those peculiar people who have HIV but not AIDS. In cancer research, five Nobel prizes have been won by researchers who studied tumor viruses (3 for a chicken virus! the Rous virus). This started in 1911, when Peyton Rous discovered a virus that causes tumors in chickens. The hope is that these special cases highlight the essence of the puzzle and that, the cause being a virus, it is easy to isolate. Unfortunately, this thread has not proven very successful. Viral cancers are rare and not very relevant for human cancer, which makes sense when you consider that cancer is not contagious. With a few exceptions, cancer survival rates remain much what they were in the 1950s. The Mayo Clinic reports that for cancers diagnosed from 1974 to 1976 the five-year survival rate was 50 percent, and it is now 65 percent. This is mainly because of early detection, as opposed to any great new drugs.
A common idea is that outliers are more important than averages; this is the theme of Taleb's Black Swan and Gladwell's Outliers. Predicting one great outlier is worth predicting many ordinary outcomes, so on one hand it seems like an optimal focus. Also, the outliers should highlight the essence of something. A stock that has risen 10-fold, or a great athlete, supposedly lays bare the essence of greatness.
But I think we forget how biased our view is on exceptional events and people. We watch sports and learn about Usain Bolt, a most unusual man. Or my kids read the Guinness Book of World Records, containing stories about 1200 lb men and giant frogs. News is biased towards the exceptional, so it takes no effort to emphasize it. In fact, it takes effort to see the ordinary. It's too bad people think of heroes as those who, for a brief moment, offered their life in some battle or harrowing situation, compared to the much more common heroism of providing for one's family, not complaining, and being charitable to friends and neighbors, for decades.
A problem is that any extremum is caused both by its unique intrinsic characteristics and by random chance. We hope that we can more clearly see the intrinsic qualities associated with, say, a rising stock by looking at those stocks that went up 10-fold last year. Yet any extremum is probably a large random error. We simply can't predict the future very well, say with an R2 of only 10%. Thus, a 100% stock return is, on average, 90% error. And conditional upon rising this much, that is only the mean error; its standard error is at least as large, meaning any singular 100% return is probably all randomness. In this case, more analysis is worthless, like trying to explain a lottery number. The explanatory characteristics are irrelevant for these cases.
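To make the arithmetic concrete, here is a rough simulation sketch. Everything in it is an assumption for illustration: outcomes are a normally distributed predictable part plus independent normal noise, scaled so the predictable part explains 10% of the variance, and the function name and parameters are made up. It asks what share of the very best outcomes was actually predictable.

```python
import random

def predictable_share_of_extremes(n=100_000, r2=0.10, top_fraction=0.01, seed=0):
    """Share of the average top outcome that comes from the predictable part.

    Assumes outcome = signal + noise, both normal and independent,
    with variances chosen so that the signal explains `r2` of the total.
    """
    rng = random.Random(seed)
    signal_sd = r2 ** 0.5
    noise_sd = (1.0 - r2) ** 0.5

    rows = []
    for _ in range(n):
        signal = rng.gauss(0.0, signal_sd)   # what a good model could have known
        noise = rng.gauss(0.0, noise_sd)     # pure luck
        rows.append((signal + noise, signal))

    # Keep only the extreme winners, the cases that make for good stories.
    rows.sort(reverse=True)
    top = rows[: int(n * top_fraction)]

    mean_outcome = sum(outcome for outcome, _ in top) / len(top)
    mean_signal = sum(signal for _, signal in top) / len(top)
    return mean_signal / mean_outcome

share = predictable_share_of_extremes()
print(f"predictable share of the top 1% of outcomes: {share:.2f}")
```

Under these assumptions the share should come out close to the R2 itself, so roughly nine-tenths of what you see in the winners is luck. Real returns are not normal, so treat this as an illustration of the logic rather than an estimate.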
I remember working on default models at Moody's. The CFA types liked to do case studies of famous financial catastrophes, and worked through all the sub-accounts of, say, Boston Chicken or Enron. Unfortunately, these were really poor archetypes, and we found that complementing these analyses with our default model made both look less relevant. The cases that made for a great narrative usually involved a good degree of fraud, which is difficult to detect in real time. In the end, the group that sold training seminars on credit analysis continued their program of exceptional case studies, our models simply ground out boring probabilities independent of these examples, and the two remain independent areas of interest.
I find it much more fruitful to look at averages, mainly across groupings of interesting explanatory variables, and their correlation with the desiderata, not the reverse. That is, instead of looking at the top 10 stocks from last year, look at things like the top decile of p/e ratios, which predict stock returns at a much more modest level. We know p/e ratios are inversely correlated with returns (the value effect), so then the question is, how can I take advantage of this? If I combine this with cashflow/assets, or momentum, how does this work? Sure, it may at best add a couple percent to your annualized return, but at least it's feasible. Needless to say, the current focus on outliers leads to an excess focus on highly volatile stocks, which is why I argue that these risky stocks have lower returns on average than their more boring counterparts.
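The approach described above, sorting on an explanatory variable first and then comparing average outcomes across groups, can be sketched as follows. The data are synthetic and the numbers are invented: `pe_score` stands in for a standardized p/e ratio, and the small negative slope is an assumption chosen to mimic a modest value effect, not an estimate of it. `decile_means` is a hypothetical helper, not a library function.

```python
import random

def decile_means(pairs, buckets=10):
    """Sort (characteristic, outcome) pairs by characteristic,
    split them into equal-sized buckets, and average the outcome in each."""
    ordered = sorted(pairs, key=lambda p: p[0])
    size = len(ordered) // buckets
    means = []
    for b in range(buckets):
        start = b * size
        # The last bucket absorbs any remainder.
        end = len(ordered) if b == buckets - 1 else start + size
        chunk = ordered[start:end]
        means.append(sum(outcome for _, outcome in chunk) / len(chunk))
    return means

rng = random.Random(1)
pairs = []
for _ in range(50_000):
    pe_score = rng.gauss(0.0, 1.0)                  # made-up standardized p/e
    ret = -0.02 * pe_score + rng.gauss(0.0, 0.30)   # weak signal, lots of noise
    pairs.append((pe_score, ret))

means = decile_means(pairs)
spread = means[0] - means[-1]   # lowest-p/e decile minus highest-p/e decile
for i, m in enumerate(means, start=1):
    print(f"decile {i:2d}: mean return {m:+.3f}")
print(f"low-minus-high spread: {spread:+.3f}")
```

No single stock in this toy data is predictable, yet the decile averages line up, which is the point: the relationship only shows up when averages are compared to averages.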
Extremes can be informative, but they tend to dominate our information set anyway because they lend themselves to interesting narratives. Sure, it would be best to know the big events if you had a time machine, but living in the present, if you really want to predict, focus on those things that are potentially predictive, which generally means looking at how averages relate to averages, as opposed to how outliers relate to averages. The latter is mainly selection bias and random error.