Every time the human race has been faced with an explosion of information and data, a period of decline has followed. The introduction of the printing press ushered in a couple of hundred years of religious wars, because of the widespread availability of books and documents that promoted this or that religious belief. This explosion of information was greeted with the behavior that always accompanies such events – people reinforced their biases and became more defensive and aggressive (does it sound familiar?). When computing power became much more widely available in the 1970s we saw a decade and a half of economic decline. Computing power was used to support more and more ridiculous models that were supposed to predict the future – from earthquakes to economic performance. They all failed miserably, and although the trashing of the worldwide currency exchange regime by President Nixon didn’t help, there is good reason to believe that the unskilled use of computers was a contributing factor. And so we come to today, when data is everywhere and we really haven’t got a clue what to do with it. We’ve labelled it: it’s Big Data. But this really doesn’t help. In fact what we are doing with data is very reminiscent of what we did with computers in the 1970s. We took what we had been doing before (building models with lots of equations) and did more of it, although now we take a petabyte instead of a gigabyte and apply the same regime.
The problems of big data are not only not solved, they are not understood. To illustrate the point, I recently saw an interview with a member of the data science team working for a large UK insurance company. She proudly boasted that the predictive algorithms they were using called upon over 400 variables (or attributes). Algorithms that look for patterns generally create combinations of these variables, and the number of possible combinations grows exponentially – 2^n for n variables, to be precise. With 400 variables the number of combinations is 2^400 (roughly 10^120) – far more than the number of atoms in the universe. This allows the algorithm to find all sorts of nonsense and present it as meaningful. In other words, it finds patterns in random data. Even with exhaustive validation and testing methods, some of this nonsense will creep through. This is before we consider the fact that human beings always try to reinforce their biases, and will structure analysis to try and achieve this – quite unconsciously. So this insurance company will be using patterns that are nothing more than randomness dressed up to look respectable – and this will be causing harm – or destroying value. I’m not even going to deal with the current fad for visual analytics, otherwise known as finding patterns in random noise – it would be too wearying for all of us.
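To make the arithmetic concrete, here is a minimal Python sketch of the effect described above. It is an illustration under stated assumptions, not a reconstruction of anything the insurance company does: the 50 rows, the 0.4 correlation threshold, the function names, and the commonly quoted rough figure of 10^80 atoms in the observable universe are all my own choices. It fills 400 columns with pure random noise and counts how many pairs of columns nonetheless look "correlated".

```python
import random
from itertools import combinations
from math import sqrt


def standardize(values):
    """Rescale a column to zero mean and unit length, so that the dot
    product of two standardized columns is their Pearson correlation."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    norm = sqrt(sum(c * c for c in centered))
    return [c / norm for c in centered]


def count_spurious_pairs(n_vars=400, n_rows=50, threshold=0.4, seed=0):
    """Generate n_vars columns of independent Gaussian noise and count
    the pairs whose absolute correlation reaches the threshold.
    All parameter values here are illustrative assumptions."""
    rng = random.Random(seed)
    columns = [
        standardize([rng.gauss(0.0, 1.0) for _ in range(n_rows)])
        for _ in range(n_vars)
    ]
    hits = 0
    for a, b in combinations(range(n_vars), 2):
        r = sum(x * y for x, y in zip(columns[a], columns[b]))
        if abs(r) >= threshold:
            hits += 1
    total_pairs = n_vars * (n_vars - 1) // 2
    return hits, total_pairs


if __name__ == "__main__":
    # Subsets of 400 variables versus a rough estimate of atoms in the universe.
    print("2^400 exceeds 10^80:", 2 ** 400 > 10 ** 80)
    hits, total = count_spurious_pairs()
    print(f"{hits} of {total} pairs of pure-noise columns look correlated")
```

Every one of those "hits" is meaningless by construction, since the columns are independent noise. Note that the sketch only looks at pairs, which is a tiny corner of the 2^400 possible combinations.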
In the long term we will discover how to use large amounts of diverse data in a productive way. For now though it’s fraught with danger. And another thing we can be sure of is that big data will deliver advantages not currently imagined, and that the approach which simply does more of what we did before will be seen to have been highly damaging. Since much of big data is concerned with trying to get customers to spend more, it is highly likely that this will prove to be unproductive and even destructive. The escalating asymmetry in wealth distribution doesn’t help here (fewer people can afford to buy stuff), and we are all becoming just a tad weary of targeted promotions and adverts.
Don’t expect change any time soon: the big data bandwagon just has too much invested in it. Consultants, business managers and technicians all want big data on their résumés, and the technology suppliers are hardly likely to ask us to pause for thought. Business and technology magazines take endless Big Data advertising, and we all want to read about the next technology fad. So it’s good business for everyone, except perhaps the businesses that are using the technology.
Finally, just to prove that I really am being the skunk at the big data party, it might be worth considering the words of Gerd Gigerenzer, the author of Risk Savvy and a Director at the Max Planck Institute for Human Development. He provides plenty of evidence that our gut instincts often outperform rigorous analysis – not least because the rigorous analysis is often just plain wrong. So don’t be too keen to throw the baby out with the bathwater, and at least consider your gut instincts, giving evidence-based decisioning a little less credibility than the pompous name might suggest.