Last night's "Dewey Defeats Truman" results show the current limits of Big Data and predictive analytics.
It is an unquestioned maxim that personal data is the new oil of the digital economy. Given enough information about any individual or community, the thinking goes, Big Data can guide or predict their behavior or response in given circumstances. This near-universal belief underlies most of current privacy law: from the EU’s Privacy Shield concerns to protecting children from online data collection to concerns about health records, email surveillance and hacking tools.
The theory is that even if it is not possible for now to predict individual human behavior, it is possible to predict aggregate behavior: it may not be possible to predict how any particular person will respond to a home run at Wrigley Field, but it is a safe bet the crowd will erupt.
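The aggregate-versus-individual point can be illustrated with a small simulation. This is only a sketch under invented assumptions: the crowd size, the 80% chance that any one fan cheers, and the function names are all hypothetical, and each fan is assumed to react independently of the others.

```python
import random

CROWD_SIZE = 40_000   # hypothetical stadium crowd
P_CHEER = 0.8         # assumed chance that any one fan cheers
N_GAMES = 20          # number of simulated home runs


def simulate_crowd(n_people, p_cheer, rng):
    """Return one True/False reaction per fan, each drawn independently."""
    return [rng.random() < p_cheer for _ in range(n_people)]


def cheer_fraction(crowd):
    """Share of the crowd that cheered."""
    return sum(crowd) / len(crowd)


rng = random.Random(0)  # fixed seed so reruns give the same numbers

# Aggregate view: the share of the crowd cheering barely moves between games.
fractions = [
    cheer_fraction(simulate_crowd(CROWD_SIZE, P_CHEER, rng))
    for _ in range(N_GAMES)
]
spread = max(fractions) - min(fractions)

# Individual view: the best available guess for any one fan is "cheers",
# and that guess is wrong for every fan who stays silent.
one_game = simulate_crowd(CROWD_SIZE, P_CHEER, rng)
individual_error = 1 - cheer_fraction(one_game)

print(f"crowd share cheering ranged over {spread:.4f} across {N_GAMES} games")
print(f"best guess about a single fan was wrong {individual_error:.1%} of the time")
```

Under these assumptions the crowd-level share stays within a narrow band from game to game, while the best guess about any single fan is wrong roughly one time in five. The independence assumption carries the whole result: if many fans' reactions shift together, the aggregate is no longer so stable.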
Insurance and marketing are based on this belief. Data warehousing and analytics continually refine their algorithms on the premise that more data points lead to better decision making. Big Data could predict tournament winners before the first player stepped on the field. Big Data could use your writing style and choice of dinner to predict your driving safety record. Privacy advocates wrung their hands over increasing data collection. Given enough data, their fears ran, you could predict a baby’s life trajectory from the moment it emerged from the womb to its final eulogy.
That belief crashed against reality last night, vividly illustrating the limits of Big Data. American presidential campaigns are famously long and brutal – it has been said that there is a reason the British “stand” for office and Americans “run” for it.
Millions of dollars were spent on every aspect of data analysis, from technology to polling to get-out-the-vote (GOTV) efforts to testing. The best minds across America and the world were deployed to a single objective: electing their designated candidate. Eventually all their predictions and evaluations coalesced around a near-consensus assessment. Virtually all experts agreed on it. And virtually all were wrong. In the event, the candidate less reliant on Big Data romped home to victory. In a high-profile crunch, Big Data has been weighed in the balance and apparently found wanting.
This does not mean that Big Data is dead or doomed. It does reveal that, for now at least, it has its limits. Critics such as Nassim Nicholas Taleb have long warned that a fundamental flaw in these neat data processing models is their inherent inability to factor in “Black Swans” – extremely consequential but unexpected developments such as the collapse of Lehman Brothers. Put another way, events take unexpected turns because life and people are far more unpredictable than we appreciate.
Last night’s election results are a dramatic reminder that Big Data is far from a perfect predictor because it cannot accommodate this inherent randomness and instability. And since Big Data affects our lives in everything from insurance rates to job opportunities, privacy law must evolve to account for that imperfection.
Saad Gul and Mike Slipsky, editors of NC Privacy Law Blog, are partners with Poyner Spruill LLP. They advise clients on a wide range of privacy, data security, and cyber liability issues, including risk management plans, regulatory compliance, cloud computing implications, and breach obligations. Saad (@NC_Cyberlaw) may be reached at 919.783.1170 or firstname.lastname@example.org. Mike may be reached at 919.783.2851 or email@example.com.