Many data mining tasks can be accomplished within Excel, given a suitable add-in. The main benefit is that Excel is a familiar environment and is ideally suited to trying things out. The five data mining add-ins listed here differ considerably in sophistication and user-friendliness. 11Ants Model Builder hides as much of the back-room activity as possible and automatically selects the most appropriate mining methods. Alyuda ForecasterXL offers self-tuning neural networks. DataMinerXL is a tool for people already familiar with data mining techniques, while Predixion Enterprise Insight may be the only solution that many organisations need. Finally, XLMiner provides a full data mining environment for people with the relevant knowledge. All of these tools can be used for predictive analytics, where discovered patterns are used to score new data.
If your organisation uses Microsoft SQL Server, and specifically SQL Server Analysis Services (SSAS), a further option is the Data Mining Add-in for Excel. It supports the creation of data mining models from within Excel and is very powerful too.
11Ants Model Builder from 11Ants Analytics
This is a user-friendly Microsoft Excel add-in that can be used with a minimum of training and will quickly identify predictive patterns in data. Most of the action is behind the scenes, and the software automatically homes in on the most productive data mining methods. In larger organisations these models can be deployed in enterprise databases using 11Ants Predictor, which supports very high throughput scoring on Oracle, Microsoft SQL Server and Teradata databases.
11Ants Model Builder supports decision trees, Gaussian processes, logistic regression, Naïve Bayes, nearest neighbour, random forests and support vector machines, amongst others.
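To make the idea of scoring new data concrete, here is a minimal Python sketch of nearest neighbour, one of the methods listed above. It is only an illustration and makes no claim about how 11Ants Model Builder works internally; the customer fields, values and the function name `nearest_neighbour_score` are invented for the example.

```python
import math

def nearest_neighbour_score(training_rows, new_row):
    """Return the label of the training row closest to new_row.

    training_rows is a list of (features, label) pairs, where features
    is a tuple of numbers with the same length as new_row.
    """
    best_label, best_distance = None, math.inf
    for features, label in training_rows:
        # Euclidean distance between the known record and the new one
        distance = math.dist(features, new_row)
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label

# Hypothetical historical records: (monthly_spend, support_calls) -> outcome
training_rows = [
    ((20.0, 5.0), "churned"),
    ((25.0, 4.0), "churned"),
    ((80.0, 0.0), "stayed"),
    ((75.0, 1.0), "stayed"),
]

# Score two new customers against the historical records
print(nearest_neighbour_score(training_rows, (22.0, 6.0)))
print(nearest_neighbour_score(training_rows, (78.0, 1.0)))
```

In practice the columns would usually be rescaled first so that no single field dominates the distance, and several neighbours would be consulted rather than one.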
Marketing solutions are offered for customer churn and customer response predictive analysis. Again the primary model development is accomplished in a Microsoft Excel environment, and models can then be deployed to enterprise databases using 11Ants Predictor.
Alyuda ForecasterXL
Alyuda ForecasterXL implements neural networks within Excel. It emphasises ease of use, with automatic selection of neural network parameters and architecture. Various graphical and analytical displays are provided, and partitioning data into training and test sets is straightforward.
Moderately priced at US$197 for a single user (US$997 for unlimited site use), it is a low-cost way of exploring neural networks within an Excel setting.
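For readers unfamiliar with the training/test partition mentioned above, the following Python sketch shows the general idea: shuffle the rows, then hold a fraction back for testing. It is an assumption-laden illustration, not ForecasterXL's own procedure; the function name `split_rows`, the 25% test fraction and the fixed seed are arbitrary choices.

```python
import random

def split_rows(rows, test_fraction=0.25, seed=0):
    """Shuffle rows and split them into (training, test) lists."""
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a repeatable split
    test_size = int(round(len(shuffled) * test_fraction))
    return shuffled[test_size:], shuffled[:test_size]

# Twenty hypothetical spreadsheet rows, identified here by row number
rows = list(range(1, 21))
train, test = split_rows(rows, test_fraction=0.25)
print(len(train), len(test))
```

The model is fitted on the training rows only; the held-back test rows then give an honest estimate of how well it predicts data it has not seen.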
DataMinerXL
If you have some familiarity with data mining techniques and want a low-risk route, then DataMinerXL is a good option. This is an Excel add-in which supports the creation of predictive models using a wide variety of techniques, including regression (linear and logistic), naive Bayes, decision trees, neural networks and support vector machines (SVM). It will even solve linear, quadratic and linear complementarity problems. Other functions are included for those with a math bent (numerical integration and matrix manipulation), along with basic statistical functions.
Clearly this is not an end-user tool. But for someone familiar with the territory it is an excellent way to build predictive models, and for all the budding information scientists out there a free version (throttled to 1,000 instances) is available. The paid licence is very reasonable too, at US$499 per year.
The creators of DataMinerXL have also published an excellent book: Foundations of Predictive Analytics by James Wu and Stephen Coggeshall.
Predixion Enterprise Insight
In Predixion Enterprise Insight, the Excel front end is the client side of a broader data mining capability. The server side supports most data and database products, including big data sources such as Hadoop and Greenplum. Collaboration is one of the main features of the product, with full integration into the Microsoft stack. A level of end-user capability is claimed, and models can be shared using SharePoint dashboards.
Predixion Enterprise Insight Developer Edition can be downloaded for free (you won’t even have to enter your details), and can be used to get a feel for the technology prior to commitment.
XLMiner
XLMiner is an add-in for Excel that provides a full-blown data mining capability with data preparation tools, support for time series analysis and visualisation tools. The techniques used by the add-in include regression (logistic and linear), Bayes classifier, association rules, neural nets, classification and regression trees, clustering, principal components and discriminant analysis.
The data sources supported include Microsoft’s PowerPivot, Microsoft/IBM/Oracle databases and of course simple spreadsheets.
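Association rules, one of the techniques listed above, are usually judged by two simple measures: support (how often a set of items appears) and confidence (how often the consequent appears when the antecedent does). The Python sketch below computes both on made-up shopping baskets. It illustrates the general definitions only and is not a description of XLMiner's implementation; the function names and data are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in items."""
    items = set(items)
    hits = sum(1 for basket in transactions if items <= set(basket))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, divided by the
    support of the antecedent alone.

    Assumes the antecedent occurs in at least one transaction;
    otherwise the division below fails.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical shopping baskets, one set of items per transaction
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread"},
    {"milk"},
]

print(support(transactions, {"bread", "butter"}))
print(confidence(transactions, {"bread"}, {"butter"}))
```

A rule such as "bread implies butter" is normally kept only if both measures exceed thresholds chosen by the analyst.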