Understanding Data Mining and Business Intelligence
To get started we need to define the two terms. Data mining is the act of trawling through historical data with the aim of finding patterns that might be useful in the future. Business intelligence is concerned with looking at historical and current data to diagnose and describe. Diagnosis, for example, might mean drilling down into last month’s sales data to see why certain products didn’t perform as expected. Description, on the other hand, is simply a report on status, such as can be seen on a key performance indicator (KPI) dashboard. So business intelligence is largely concerned with looking in the rear-view mirror, whereas data mining is used to predict future behaviors.
The Holy Grail is to unite data mining and business intelligence into one. There are many applications. A sales rep might look at the order history of a customer – this is business intelligence. But she or he might also want to look at the likelihood of the customer buying a new product offering – this would come from data mining, and specifically trying to predict customers for a product based on who has purchased it previously.
Data Mining
Data mining takes several forms. Classification assigns each case to one of a fixed set of categories. In the above example the question is whether a customer will be interested in a product – the classes would be ‘interested’ or ‘not interested’. We might also want to predict the best bundle of products to sell to a customer (as Amazon does) – this would also come from data mining, by identifying the profiles of customers who have purchased a given bundle before.
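As a rough illustration of classification, the sketch below labels a customer as ‘interested’ or ‘not interested’ by majority vote among the most similar past customers (a k-nearest-neighbour classifier, chosen here only because it is short). The customer profiles, the two attributes and the function name classify are all invented for the example.

```python
import math
from collections import Counter

# Hypothetical training data: (age, purchases in the last year) -> label.
TRAINING = [
    ((25, 1), "not interested"),
    ((30, 2), "not interested"),
    ((28, 0), "not interested"),
    ((45, 8), "interested"),
    ((50, 10), "interested"),
    ((48, 7), "interested"),
]


def classify(profile, training=TRAINING, k=3):
    """Return the majority label among the k nearest training profiles."""
    nearest = sorted(training, key=lambda row: math.dist(row[0], profile))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(classify((47, 9)))  # interested
print(classify((27, 1)))  # not interested
```

A real project would use far more attributes and records, and would test the model on customers it has not seen before.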
The next type of data mining is called regression. Here we try to predict a number rather than a category, as in classification. If we give customers credit, for example, we may want to predict the best credit limit to give a customer – an amount specified as a number. As before we mine our databases to establish customer profiles and the credit limits that have been successful.
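The credit limit example can be sketched with the simplest form of regression, a straight line fitted by least squares. The incomes and credit limits below are made up, and using income as the only predictor is an assumption for the sake of brevity.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: annual income (thousands) and the credit limit
# that worked out well for that customer.
incomes = [20, 30, 40, 50, 60]
limits = [1000, 1600, 1900, 2600, 2900]

slope, intercept = fit_line(incomes, limits)


def predict_limit(income):
    """Predicted credit limit for a given income (in thousands)."""
    return intercept + slope * income


print(predict_limit(45))  # 2240.0
```

The output is a number on a continuous scale, which is what separates regression from classification.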
Clustering is the next type of data mining, and here we just let the data mining algorithms loose to find groups that share common properties. This is often used in marketing when deciding who to target in a new promotion. Targeting only the most promising groups saves money, because fewer people are contacted, reduces the nuisance factor, and increases lift – how much more often the targeted group responds positively than a randomly chosen group would.
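To show what letting the algorithm loose looks like, here is a minimal k-means sketch, one common clustering method among many. The customer figures are invented, and the starting centroids are picked by hand so that the result is repeatable.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of its points, and repeat."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster))
            if cluster else centroids[i]  # keep an empty cluster's centroid
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical customers: (annual spend, store visits per year).
customers = [(200, 2), (250, 3), (220, 2), (1500, 20), (1600, 25), (1550, 22)]

# Start from one low-spend and one high-spend customer (an assumption).
centroids, clusters = kmeans(customers, [customers[0], customers[3]])
for group in clusters:
    print(group)
```

No labels are supplied; the two groups emerge from the data alone. Because spend is measured on a much larger scale than visits, it dominates the distance here, so real data would normally be rescaled first.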
Finally there is association mining where we look for things that are associated with each other. The classic application of this is market basket analysis, where we look for items that customers tend to purchase together. If people usually buy bread when they buy milk, it might make sense to place these items near to each other.
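Market basket analysis can be sketched by counting how often pairs of items appear together. The code below computes three standard measures for each rule "customers who buy X also buy Y": support, confidence and lift. The baskets are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets.
BASKETS = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"eggs", "jam"},
    {"butter", "jam"},
    {"milk", "eggs"},
]


def pair_rules(baskets):
    """Return {(x, y): (support, confidence, lift)} for each rule x -> y."""
    n = len(baskets)
    item_counts = Counter(item for b in baskets for item in b)
    pair_counts = Counter(
        frozenset(p) for b in baskets for p in combinations(sorted(b), 2)
    )
    rules = {}
    for pair, count in pair_counts.items():
        a, b = sorted(pair)
        for x, y in ((a, b), (b, a)):
            support = count / n                       # share of baskets with both
            confidence = count / item_counts[x]       # share of x-baskets with y
            lift = confidence / (item_counts[y] / n)  # versus y's overall rate
            rules[(x, y)] = (support, confidence, lift)
    return rules


rules = pair_rules(BASKETS)
print(rules[("milk", "bread")])  # (0.5, 0.75, 1.5)
```

A lift above 1 means the two items appear together more often than chance would suggest, which is the signal a retailer would act on.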
Business Intelligence
The tools of business intelligence have traditionally been the data warehouse and/or online analytical processing (OLAP) databases. These are used to make reporting and querying historical data easier and more flexible. However there is now a new generation of business intelligence tools that use highly compressed in-memory databases. A modestly powered laptop can now perform the sort of analysis that once needed a large data warehouse.
The output from business intelligence has traditionally been reports of various sorts, but self-service business intelligence means users can create their own reports, charts and dashboards with ease. Products such as Tableau, Spotfire and Qlik Sense are typical of this new technology.
Combining Data Mining and Business Intelligence
Combining data mining and business intelligence means adding new types of information to business dashboards and reports. Data mining creates predictive models – models which predict customer behavior, for example. If a report listing customer purchases could also include a new field which showed the best next product to offer a customer, this would be of immense value. To achieve this the business intelligence application has to be able to talk to the predictive model. Various mechanisms are available, one of the most common being an application programming interface (API). However, some business intelligence platforms now embrace data mining technology and allow users to combine the two. Tableau, for example, allows skilled users to data mine using the R language. Spotfire goes one step further with inbuilt, easy-to-use data mining tools, which can be used to build predictive models and then include them in reports and dashboards.
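The idea of a report gaining a new field from a predictive model can be sketched as follows. Everything here is hypothetical: the field name next_best_offer, the function names, and the stub that stands in for the model. In a real deployment the stub would be replaced by a call to whatever API the predictive model exposes.

```python
def add_next_best_offer(report_rows, recommend):
    """Return copies of the report rows with a 'next_best_offer' field added.

    'recommend' is any callable that takes a report row and returns a product.
    """
    return [{**row, "next_best_offer": recommend(row)} for row in report_rows]


def stub_recommend(row):
    # Stand-in for a real predictive model (an assumption for this sketch).
    return "butter" if "bread" in row["purchases"] else "bread"


# Hypothetical business intelligence report: one row per customer.
report = [
    {"customer": "A", "purchases": ["milk", "bread"]},
    {"customer": "B", "purchases": ["milk"]},
]

for row in add_next_best_offer(report, stub_recommend):
    print(row["customer"], row["next_best_offer"])
```

Keeping the model behind a single callable means the report code does not change when the model is retrained or swapped for a bought-in one.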
Eventually all business intelligence platforms will embrace data mining and the creation of predictive models. The alternative, of course, is to buy the predictive models ready-made and interface them with the business intelligence applications.
Two business intelligence products that are particularly advanced in this respect are Alteryx and Spotfire from TIBCO. Others will follow.