Not surprisingly, Oracle provides a full repertoire of technologies to handle data mining and statistical analysis, with or without big data. Most of this sits under the Oracle Advanced Analytics umbrella, which encompasses predictive analytics, text mining, statistical analysis, data mining, mathematical computation and visualisation. Key to Oracle's approach is the notion of in-database processing (they were originally a database company, after all). This means that processing happens in the database environment and that data does not need to be extracted for analysis. However, it is unlikely that any organisation would want to run its data mining activities in the same environment as its transactional systems, so there is at least an implication that data is extracted to a data warehouse or other environment.
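The difference between in-database processing and extraction can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in for a database and does not use Oracle or any Oracle API. The sales table, its contents and the function names are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database (hypothetical data, SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("north", 10.0), ("north", 30.0), ("south", 5.0), ("south", 15.0)],
    )
    conn.commit()
    return conn


def mean_by_region_extracted(conn):
    """Extraction route: pull every row to the client, then compute there."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        running_sum, count = totals.get(region, (0.0, 0))
        totals[region] = (running_sum + amount, count + 1)
    return {region: s / n for region, (s, n) in totals.items()}


def mean_by_region_in_database(conn):
    """In-database route: the engine aggregates; only results leave it."""
    rows = conn.execute(
        "SELECT region, AVG(amount) FROM sales GROUP BY region"
    )
    return dict(rows.fetchall())


if __name__ == "__main__":
    conn = build_demo_db()
    print(mean_by_region_extracted(conn))
    print(mean_by_region_in_database(conn))
```

Both functions return the same per-region averages. The difference is where the work happens: the first moves every row out of the database, while the second returns only one row per region.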
Oracle provides two routes to analysis:
- Oracle Data Mining revolves around SQL, and the actual modelling environment, Oracle Data Miner, comes as an extension to Oracle SQL Developer. It supports most of the usual data mining algorithms with the exception, oddly enough, of neural networks. Support vector machines feature strongly, and the range of algorithms is adequate but not as extensive as some other offerings.
- Oracle R Enterprise extends the database with a library of R functions and makes database tables and views available to the R environment as native R objects. Oracle positions this as addressing statistical analysis, but in reality R encompasses many data mining algorithms – more than Oracle Data Mining.
For organisations wanting to move into big data, Oracle provides two hardware/software solutions:
- Oracle Big Data Appliance is a relatively low-cost platform for running big data software, specifically the Cloudera distribution of Hadoop and Oracle NoSQL Database Community Edition. Take your pick or use both.
- The Exalytics platform features in-memory processing for very high throughput of analysis tasks, and data visualisation and exploration tools.
Oracle also provides a Predictive Analytics add-in for Microsoft Excel. This utilises support vector machines and is something of an oddity, but an interesting end-user tool all the same.
Overall, this is a natural route for existing Oracle users, with full capability to move into big data. It is probably of interest to very large organisations only, and for those without an existing Oracle commitment there are probably more suitable solutions to data mining needs.