- www.nytimes.com/.../nate-silvers-signal-and-the-noise-examin... (23 Oct 2012) – "The Signal and the Noise," by the statistician and blogger Nate Silver, examines the complex and often rudimentary science of prediction.
Mining Truth From Data Babel
Nate Silver’s ‘Signal and the Noise’ Examines Predictions
By LEONARD MLODINOW
Published: October 23, 2012

A friend who was a pioneer in the computer games business used to marvel at how her company handled its projections of costs and revenue. "We performed exhaustive calculations, analyses and revisions," she would tell me. "And we somehow always ended with numbers that justified our hiring the people and producing the games we had wanted to all along." Those forecasts rarely proved accurate, but as long as the games were reasonably profitable, she said, you'd keep your job and get to create more unfounded projections for the next endeavor.
Alessandra Montalto/The New York Times

THE SIGNAL AND THE NOISE
Why So Many Predictions Fail — but Some Don't
By Nate Silver
Illustrated. 534 pages. The Penguin Press. $27.95.
This doesn't seem like any way to run a business — or a country. Yet, as Nate Silver, a blogger for The New York Times, points out in his book, "The Signal and the Noise," studies show that from the stock pickers on Wall Street to the political pundits on our news channels, predictions offered with great certainty and voluminous justification prove, when evaluated later, to have had no predictive power at all. They are the equivalent of monkeys tossing darts.

As one who has both taught and written about such phenomena, I have long felt like leaning out my window to shout, "Network"-style, "I'm as mad as hell and I'm not going to take this anymore!" Judging by Mr. Silver's lively prose — from energetic to outraged — I think he feels the same way.

The book's title comes from electrical engineering, where a signal is something that conveys information, while noise is a meaningless or random addition to the signal. Problems arise when the noise is as strong as, or stronger than, the signal. How do you recognize which is which?

Today the data available for making predictions have grown almost unimaginably large: we generate 2.5 quintillion bytes of data each day, Mr. Silver tells us, enough zeros and ones to fill a billion books of 10 million pages each. Our ability to tease the signal from the noise has not grown nearly as fast. As a result, we have plenty of data but lack the ability to extract truth from it and to build models that accurately predict the future that data portends.

Mr. Silver, just 34, is an expert at finding signal in noise. He is modest about his accomplishments, but he achieved a high profile when he created a brilliant and innovative computer program for forecasting the performance of baseball players, and later a system for predicting the outcome of political races.
His political work had such success in the 2008 presidential election that it brought him extensive media coverage as well as a home at The Times for his blog, FiveThirtyEight.com, though some conservatives have been critical of his methods during this election cycle.

His knack wasn't lost on book publishers, who, as he puts it, approached him "to capitalize on the success of books such as 'Moneyball' and 'Freakonomics.' " Publishers are notorious for pronouncing that Book A will sell just a thousand copies, while Book B will sell a million, and then proving to have gotten everything right except for which was A and which was B. In this case, to judge by early sales, they forecast Mr. Silver's potential correctly, and to judge by the friendly tone of the book, it couldn't have happened to a nicer guy.

Healthily peppered throughout the book are answers to its subtitle, "Why So Many Predictions Fail — but Some Don't": we are fooled into thinking that random patterns are meaningful; we build models that are far more sensitive to our initial assumptions than we realize; we make approximations that are cruder than we realize; we focus on what is easiest to measure rather than on what is important; we are overconfident; we build models that rely too heavily on statistics, without enough theoretical understanding; and we unconsciously let biases based on expectation or self-interest affect our analysis.

Regarding why models do succeed, Mr. Silver provides just bits of advice (other than to avoid the failings listed above). Mostly he stresses an approach to statistics named after the British mathematician Thomas Bayes, who created a theory of how to adjust a subjective degree of belief rationally when new evidence presents itself.

Suppose that after reading a review, you initially believe that there is a 75 percent chance that you will like a certain book. Then, in a bookstore, you read the book's first 10 pages.
What, then, are the chances that you will like the book, given the additional information that you liked (or did not like) what you read? Bayes's theory tells you how to update your initial guess in light of that new data. This may sound like an exercise that only a character in "The Big Bang Theory" would engage in, but neuroscientists have found that, on an unconscious level, our brains do naturally use Bayesian prediction.

Mr. Silver illustrates his dos and don'ts through a series of interesting essays that examine how predictions are made in fields including chess, baseball, weather forecasting, earthquake analysis and politics. A chapter on poker reveals a strange world in which a small number of inept but big-spending "fish" feed a much larger community of highly skilled sharks competing to make their living off the fish; a chapter on global warming is one of the most objective and honest analyses I've seen. (Mr. Silver concludes that the greenhouse effect almost certainly exists and will be exacerbated by man-made CO2 emissions.)

So with all this going for the book, as my mother would say, what's not to like?

The main problem emerges immediately, in the introduction, where I found my innately Bayesian brain wondering: Where is this going? The same question came to mind in later essays: I wondered how what I was reading related to the larger thesis. At times Mr. Silver reports in depth on a topic of lesser importance, or he skates over an important topic only to return to it in a later chapter, where it is again discussed only briefly.

As a result, I found myself losing the signal for the noise. Fortunately, you will not be tested on whether you have properly grasped the signal, and even the noise makes for a good read.
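The book-buying illustration of Bayes's theorem above can be made concrete. A minimal sketch in Python: the 75 percent prior is the review's; the two likelihoods (how often someone who would or would not enjoy the book also enjoys its first 10 pages) are illustrative numbers of my own, not Mr. Silver's.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior P(hypothesis | evidence) via Bayes's theorem."""
    numerator = p_evidence_if_true * prior
    denominator = numerator + p_evidence_if_false * (1 - prior)
    return numerator / denominator

# Prior from the review: a 75% chance you will like the book.
prior = 0.75
# Hypothetical likelihoods: readers who would like the book enjoy
# its first 10 pages 90% of the time; readers who wouldn't, 30%.
posterior = bayes_update(prior, 0.90, 0.30)
print(round(posterior, 3))  # 0.9
```

Enjoying the opening pages lifts the estimate from 75 percent to 90 percent; disliking them would push it down instead, which is exactly the updating the review describes.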
- online.wsj.com/.../SB10000872396390444554704577... (24 Sep 2012) – Burton Malkiel reviews "The Signal and the Noise: Why So Many Predictions Fail — But Some Don't" by Nate Silver.
Telling Lies From Statistics
Forecasters must avoid overconfidence—and recognize the degree of uncertainty that attends even the most careful predictions.
Mr. Silver reminds us that we live in an era of "Big Data," with "2.5 quintillion bytes" generated each day. But he strongly disagrees with the view that the sheer volume of data will make predicting easier. "Numbers don't speak for themselves," he notes. In fact, we imbue numbers with meaning, depending on our approach. We often find patterns that are simply random noise, and many of our predictions fail: "Unless we become aware of the biases we introduce, the returns to additional information may be minimal—or diminishing." The trick is to extract the correct signal from the noisy data. "The signal is the truth," Mr. Silver writes. "The noise is the distraction."
The first half of Mr. Silver's analysis looks closely at the success and failure of predictions in clusters of fields ranging from baseball to politics, poker to chess, epidemiology to stock markets, and hurricanes to earthquakes. We do well, for example, with weather forecasts and political predictions but very badly with earthquakes. Part of the problem is that earthquakes, unlike hurricanes, often occur without warning. Half of major earthquakes are preceded by no discernible foreshocks, and periods of increased seismic activity often fail to culminate in a major tremor—a classic example of "noise." Mr. Silver observes that we can make helpful forecasts of the future performance of baseball's position players—relying principally on "on-base percentage" and "wins above replacement player"—but we completely missed the 2008 financial crisis. And we have made egregious errors in predicting the spread of infectious diseases such as the flu.
In the second half of his analysis, Mr. Silver suggests a number of methods by which we can improve our ability. The key, for him, is less a particular mathematical model than a temperament or "framing" idea. First, he says, it is important to avoid overconfidence, to recognize the degree of uncertainty that attends even the most careful forecasts. The best forecasts don't contain specific numerical expectations but define the future in terms of ranges (the hurricane should pass somewhere between Tampa and 350 miles west) and probabilities (there is a 70% chance of rain this evening).
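The premium that Mr. Silver places on probabilistic forecasts over confident point predictions can be scored. One standard yardstick for this (my addition, not named in the review) is the Brier score, which penalizes a forecaster by the squared gap between each stated probability and the 0/1 outcome; a chronically overconfident forecaster does worse than a calibrated one even when both lean the right way.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between stated probabilities and 0/1 outcomes.
    Lower is better; always saying 50% earns exactly 0.25."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# Hypothetical record: ten evenings, rain on 7 of them.
outcomes = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
calibrated = [0.7] * 10     # "70% chance of rain," as in the review's example
overconfident = [1.0] * 10  # always certain of rain

print(round(brier_score(calibrated, outcomes), 2))     # 0.21
print(round(brier_score(overconfident, outcomes), 2))  # 0.3
```

The calibrated 70 percent forecaster beats the certain one on the same outcomes, which is the sense in which hedged ranges and probabilities are "better" forecasts, not merely more cautious ones.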
The Signal and the Noise
By Nate Silver
(The Penguin Press, 534 pages, $27.95)
This example and many others are neatly presented in "The Signal and the Noise." Mr. Silver's breezy style makes even the most difficult statistical material accessible. What is more, his arguments and examples are painstakingly researched—the book has 56 pages of densely printed footnotes. That is not to say that one must always agree with Mr. Silver's conclusions, however.
As someone interested in financial markets, I found myself unconvinced by Mr. Silver's view that it should not be "all that challenging" to identify financial bubbles "before they burst." He suggests that the dot-com bubble that deflated in early 2000 was identifiable in advance: the market's price-earnings multiple was enormously elevated, at 44. He adduces considerable empirical work showing that long-run (10- or 20-year) rates of return from stocks have generally been poor or negative when investors entered the market at such lofty valuations.
The problem is that Mr. Silver has ignored all the false positives. Earnings multiples were elevated in the early 1990s, suggesting poor stock returns. But the 1990s produced extraordinarily generous equity returns. Earnings multiples were even higher in December 1996, suggesting negative long-run rates of return. This analysis influenced Alan Greenspan's famous "irrational exuberance" speech that month. The stock market rallied sharply until March 2000. Yes, the valuation model gave an accurate bubble prediction in March 2000 but a devastatingly inaccurate one throughout much of the 1990s. Stock prices were wildly inflated in early 2000. But the efficient-market hypothesis doesn't imply that prices are always correct, as Mr. Silver asserts. Prices are always wrong. What the hypothesis asserts is that one never knows for sure if they are too high or too low.
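Mr. Malkiel's false-positive objection is at bottom a base-rate argument, and it can be sketched with Bayes's rule. The numbers below are hypothetical illustrations of mine, not his: if genuine bubbles are rare and the valuation signal also fires in ordinary years (as elevated multiples did through much of the 1990s), then most of its alarms are false even when it catches every real bubble it sees.

```python
def precision(base_rate, sensitivity, false_positive_rate):
    """P(bubble | signal fires), via Bayes's rule."""
    true_alarms = sensitivity * base_rate
    false_alarms = false_positive_rate * (1 - base_rate)
    return true_alarms / (true_alarms + false_alarms)

# Hypothetical: a 5% chance in any year that a bubble is about to
# burst; an elevated P/E flags 90% of real bubbles but also fires
# in 30% of normal years.
print(round(precision(0.05, 0.90, 0.30), 2))  # 0.14
```

Under these assumed rates, an alarm is right only about one time in seven, which is why a model that "called" March 2000 can still have been a poor guide throughout the preceding decade.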
Mr. Malkiel is the author of "A Random Walk Down Wall Street."