I recently finished reading Nate Silver’s excellent 2012 book, The Signal and the Noise: Why So Many Predictions Fail — but Some Don’t. The book is a must-read for aspiring statisticians and for anyone who wants to make better use of the information constantly bombarding us in this age of Big Data. The central premise of the book is that despite the perceived wealth of available data, big, consequential predictions are failing badly. Economists, pundits, and climate scientists are missing the mark; they could learn much from gamblers, chess computers, weathermen, and the author’s pragmatic approaches. As I read along, I tried to find parallels between the case studies and my engineering practice. Certainly a data-rich field like engineering could benefit from better application of statistical analysis. Like the forecasters profiled in the book, I began to perceive that the profession’s use of data is hit and miss. Silver identifies a number of pitfalls to avoid and strategies to employ in order to make better predictions. I’ve highlighted a few below and tagged each with an engineering parallel.
Hedgehogs vs. Foxes – The McLaughlin Group panelists make predictions about as accurately as a coin flip. The most outlandish and outspoken, a.k.a. the hedgehogs, make for better TV with their occasionally correct bold predictions than the complexity-appreciating foxes, who issue only qualified opinions. Engineers as a personality type generally fall into the category of foxes, tending to be cautious and empirical in forming opinions. However, in any legal fight surrounding the construction industry, it’s not hard to find experts on both sides of the case. Unlike the Law & Order whodunits popularized on TV, apportioning fault in engineering cases can be very hazy. In court, before a non-technical jury, who gets the edge – the cautious fox or the confident hedgehog?
Think probabilistically – Surprisingly, weathermen are touted as a rare example of good forecasting in Silver’s book. In recent decades, as meteorological science has improved and computers have gotten faster, weather forecasts have improved markedly. The profession is careful, however, to state its precipitation forecasts in probabilistic terms, e.g., “there’s a 40% chance of rain tomorrow.” If you plot the success rate over time, you’ll generally find that on days with a 40% chance of rain, it actually rains 40% of the time. However, local meteorologists do tend to exaggerate when the chance of rain is low; they’d rather hedge their bets than take the blame for a ruined outing. On the other side of the coin, the book notes that earthquake predictions have not improved. Here, engineers take a page from the weatherman’s playbook and think probabilistically about seismic events. The ASCE 7 load standard, and by extension most of the building codes in the country, requires engineers to design for a maximum considered earthquake with a 2% chance of exceedance in 50 years. We can’t know exactly when the next big earthquake will hit or how strong it will be, but the profession has agreed on the 2%-in-50 standard. This theoretical earthquake is also tuned to location, based on past observations and geological knowledge. By thinking probabilistically, we’re neither paralyzed with indecision nor at risk of grossly underestimating the threat.
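The arithmetic behind the 2%-in-50 standard is worth a quick sketch. If we make the simplifying assumption that exceedances in successive years are independent, with a constant annual probability of 1/T for a mean return period of T years, then the 2%-in-50 criterion implies a return period of roughly 2,500 years:

```python
# Back-of-the-envelope arithmetic behind the "2% in 50 years" standard.
# Simplifying assumption: exceedances in successive years are independent,
# each with annual probability 1/T, so the chance of at least one
# exceedance in n years is P = 1 - (1 - 1/T)**n.

def return_period(p_exceedance, years):
    """Mean return period T implied by probability p_exceedance over `years`."""
    return 1.0 / (1.0 - (1.0 - p_exceedance) ** (1.0 / years))

T = return_period(0.02, 50)
print(round(T))  # 2475 -- the familiar "2,500-year" design event
```

This is why you sometimes see the same design earthquake described as a “2,500-year event” rather than in 2%-in-50 terms.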
Perfect Correlation – Much has been written about the role of mortgage-backed securities in the 2008 financial meltdown. It was thought that dividing up mortgages and packaging a diverse collection of obligations would reduce risk; surely the failure of one mortgage in the tranche could not affect the others. Unfortunately, there was correlation between the performance of individual mortgages, both within neighborhoods and across the country. Silver computes that under perfect correlation, the risk on the safest tranche of mortgage derivatives could be 160,000 times more severe than optimistically assumed. The same considerations apply to structural design. In some circles the term reliability relates to the relative importance of a single member within a structure: a highly reliable structure might still stand even if several individual members fail. If you can remember back to your college statics class, you might recall this as a reason engineers would want to solve the much more analytically complex statically indeterminate structure. Alas, we often design simply supported structures that place a great deal of load on a select few members, namely columns. That’s why good seismic design follows the strong-column, weak-beam philosophy: better that less influential beams fail first and dissipate seismic energy than the columns supporting the entire building.
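A number like 160,000 falls out of very simple arithmetic. As a toy version of Silver’s calculation (the five-mortgage setup and 5% default rate here are illustrative), suppose a senior tranche defaults only if all five of its underlying mortgages default:

```python
# Toy illustration of how the correlation assumption swings tranche risk.
# A senior tranche backed by five mortgages defaults only if all five
# mortgages default; each mortgage defaults with probability 5%.
# (Illustrative numbers in the spirit of Silver's example.)

p = 0.05  # default probability of a single mortgage
n = 5     # mortgages backing the tranche

p_independent = p ** n  # uncorrelated: all five must fail independently
p_correlated = p        # perfectly correlated: they all fail together

print(p_independent)  # ~3.1e-07, about 1 in 3,200,000
print(round(p_correlated / p_independent))  # 160000x riskier when correlated
```

The risk model isn’t wrong because the probabilities are miscomputed; it’s wrong because independence was assumed where none existed.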
The book was a trove of best practices for analytical thinking. Here are a few more lessons for which I have not yet developed suitable civil engineering allegories. Please use the comment section to suggest your own parallels. Thanks.
- Things not well understood should not just be ignored.
- Overfitting – The Journal of Zoology carried a paper correlating the spawning patterns of toads with a major earthquake in L’Aquila… but just that one time.
- Extrapolation – Models of the spread of infectious disease tend to explode (well, not literally). It is a challenge to extrapolate final mortality rates with limited knowledge of boundary conditions.
- Bayesian Thinking – Teaches forecasters to factor in preconceived notions and test hypotheses. It could even help you determine whether your spouse is cheating or how to gamble on sports, in addition to more widely productive applications.
- Heuristics – To compete with a chess-playing computer, people must apply superior strategy based on experience-driven techniques for problem solving. We cannot out-analyze the machine.
- The learning curve – Bad poker players disproportionately subsidize good ones, so stay away from online gambling sites. If you can’t spot the sucker in the first half hour at the table, then you’re the sucker.
- The efficient market – How to beat the market (spoiler: you can’t).
- Uncertainty – Climate models contain three types of uncertainty: initial condition uncertainty (exponentially declining), scenario uncertainty (linearly increasing), structural uncertainty (constant).
- Known knowns, known unknowns, and unknown unknowns – Rumsfeld on WMD, Pearl Harbor, 9/11. Is the unexpected unpredictable?
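To give a flavor of the Bayesian-thinking lesson in the list above, here is the kind of update Silver walks through with his cheating-spouse example. The specific numbers below (a 4% prior, and the evidence being ten times likelier under the guilty hypothesis) are illustrative:

```python
# A minimal sketch of a Bayesian update, in the spirit of Silver's
# cheating-spouse example. Illustrative numbers: a 4% prior belief,
# evidence that is 50% likely if the hypothesis is true and 5% likely
# if it is false (innocent explanations exist).

def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior probability of the hypothesis after seeing the evidence."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1 - prior) * p_evidence_if_false
    return numerator / denominator

posterior = bayes_update(0.04, 0.50, 0.05)
print(round(posterior, 2))  # 0.29
```

The striking part is how a low prior tempers even strong evidence: the posterior lands near 29%, not near certainty. That discipline of stating priors out loud is the habit Silver wants forecasters to adopt.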