Big data supports a wide variety of analytics activities, but in the business environment the technology is applied predominantly to interactions with customers. These applications range from estimating the likelihood that a patient will be readmitted to hospital, to the likelihood that supermarket customers will buy milk when they buy bread. The most mature environment for ‘big data’ is found in the financial services industries, where activities such as fraud detection, customer defection, and delinquency have benefited from predictive models and business rules management for many years.
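The milk-with-bread example is a classic association rule. As a minimal sketch (the basket data is purely illustrative), the rule "bread → milk" can be scored by its confidence and lift:

```python
# Illustrative supermarket baskets; each set is one transaction.
transactions = [
    {"bread", "milk", "eggs"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "milk"},
    {"bread", "milk", "butter"},
]

n = len(transactions)
bread = sum(1 for t in transactions if "bread" in t)
milk = sum(1 for t in transactions if "milk" in t)
both = sum(1 for t in transactions if {"bread", "milk"} <= t)

confidence = both / bread       # P(milk | bread)
lift = confidence / (milk / n)  # lift > 1 means bread buyers favor milk

print(f"confidence={confidence:.2f}, lift={lift:.2f}")
```

On this toy data the confidence is 0.75 but the lift is below 1, illustrating why both measures are usually examined together.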
The swelling volumes of big data are mostly customer focused. Data generated within the business does not grow as quickly; it requires, after all, people to create it. However, data created by the customers themselves is exploding in volume – social data, web site data, data from devices, and various other mechanisms for collecting low-cost, diverse data from customers. While we tend to focus on data volumes, and the term big data encourages this focus, it is the diversity of data that tends to add most value in customer-focused big analytics. This means we have to embrace data other than the familiar record-oriented data held in transaction databases. Text data, which might contain customer feedback, interviews with call center operatives, social data, comments on web sites, and other sources, adds a dimension that cannot be gained from traditional sources. Text analytics can be used to assess sentiment, so that defection, fraud, and other behaviors can be detected earlier and with more accuracy.
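In its simplest form, sentiment assessment can be lexicon based. The sketch below (the word lists and the `score_sentiment` helper are illustrative, not a production method) scores a piece of customer feedback by counting positive and negative terms:

```python
# Illustrative sentiment lexicons; real systems use far larger,
# domain-tuned word lists and statistical models.
POSITIVE = {"great", "helpful", "happy", "resolved"}
NEGATIVE = {"cancel", "frustrated", "slow", "complaint"}

def score_sentiment(text: str) -> int:
    """Positive score suggests favorable sentiment; a strongly
    negative score might flag defection risk for follow-up."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

feedback = "frustrated with slow service, may cancel my account"
print(score_sentiment(feedback))  # -3
```

A score like this would feed a decision model rather than trigger action directly, e.g. routing the customer to a retention offer.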
With the emerging Internet of Things (IoT) we are increasingly able to process very high-volume streams of data, which often need a real-time response. To this end, stream processing is becoming a major form of big analytics. Typical applications include responding to customers in real time as they move around a web site on their mobile (or other) devices. This might also include the use of location data – another data type that is well accommodated by big data technologies and analytical methods. As the scope of the IoT broadens, many businesses will need to apply analytics to real-time data streams, and it is here particularly that a mature decision management environment will be absolutely essential. Real-time processing is far less forgiving of errors than batch processing, which remains the dominant use of big data technologies at present.
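The essence of this kind of stream processing is maintaining state over a moving window of events. As a minimal sketch (the names and the 60-second window are assumptions for illustration), a real-time clickstream responder might keep a rolling per-user event count like this:

```python
from collections import defaultdict, deque

WINDOW = 60.0  # seconds of history to keep per user (illustrative)

events = defaultdict(deque)  # user_id -> timestamps within the window

def on_event(user_id: str, ts: float) -> int:
    """Record one click and return the user's count in the last WINDOW seconds."""
    q = events[user_id]
    q.append(ts)
    while q and ts - q[0] > WINDOW:  # evict events older than the window
        q.popleft()
    return len(q)

# Simulated clickstream: (user, timestamp in seconds)
for user, ts in [("u1", 0), ("u1", 30), ("u1", 95), ("u2", 96)]:
    count = on_event(user, ts)  # a real system would act on this in real time
```

The same windowed-state pattern underlies production stream processors; the difference is that they partition this state across many machines and make it fault tolerant.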
Increasing numbers of decision-automating predictive models will be applied to real-time scenarios. Optimizing decisions in the ‘now’ will increasingly become the determinant of success or failure. The applications range from monitoring complex processing plants, where tens of thousands of sensors might be installed, through to real-time processing of credit card usage to detect fraud. All industries will be affected – healthcare, manufacturing, telecoms, financial services, retail, and most others.
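For sensor monitoring, one simple real-time technique is to flag readings that deviate sharply from a running baseline. The sketch below (thresholds and readings are illustrative) uses Welford's online algorithm to maintain a running mean and variance, and flags any reading whose z-score exceeds a threshold:

```python
import math

class SensorMonitor:
    """Flags readings whose z-score against a running mean/variance
    exceeds a threshold. Uses Welford's online update, so it never
    needs to store the full history of the stream."""

    def __init__(self, threshold: float = 3.0):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0
        self.threshold = threshold

    def update(self, x: float) -> bool:
        """Return True if x looks anomalous, then fold it into the stats."""
        anomaly = False
        if self.n >= 2:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) / std > self.threshold:
                anomaly = True
        # Welford's online update of mean and sum of squared deviations
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return anomaly

m = SensorMonitor()
readings = [20.1, 20.3, 19.9, 20.2, 20.0, 35.0]  # last reading spikes
flags = [m.update(r) for r in readings]
```

In a plant with tens of thousands of sensors, each would carry a few numbers of state like this, which is what makes such monitoring feasible at scale.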
And so the complexity and scope of big analytics will grow considerably over the coming decade. We have seen many times that complexity is the greatest threat to the successful use of information technologies. Big analytics cannot be treated as a boutique application that bolts onto existing applications. It needs its own methods, infrastructure, and tools, and we predict that it will very soon become the tail that wags the dog. To this end, it cannot be stressed enough that an extensive, mature decision management infrastructure needs to be in place so that decision models can be monitored, modified, developed, verified, and deployed in a manner that is safe, and where management retains control of the automated decisions that big analytics enables.
The previous article in this series is Big Analytics Methods