Most businesses simply want to buy and make things for the lowest possible cost, and sell them to as many customers as possible with the highest possible margin. And that is pretty well it. To achieve this, information technology has been used to reduce the cost of transactional activity, typically through labor displacement. More recently, however, many businesses have discovered that the technology can be used as more than a glorified filing cabinet and desktop calculator. It can also be used to automate business decisions – either completely or partially. The mechanism for doing this is analytics – predictive analytics, business rule management, optimization and business intelligence. Most excitement centers on predictive analytics, and the use of data mining technologies to find patterns in historical data which might be useful in future activities. Examples include loan approval, fraud detection, customer targeting, and predicting machine failure, hospital readmission, and even when an employee might resign.
Because finding predictive patterns is a fairly technical affair, and one littered with potholes, a new profession has arisen – that of the data scientist. A knowledge of statistics, data mining and other difficult topics means these individuals can attract high salaries, and forecasts of significant shortfalls of such people only add to the desirability of the profession. But a cursory look in the rearview mirror will show that we’ve been here before, and that the technology industry eventually commoditizes difficult technologies into products that the average organization can use without too much fuss.
Once upon a time it was necessary for businesses wanting to use computers to employ an assembly language expert – someone who could get into the guts of the computer to make it do what the organization needed. Today this is typically not the case. Prior to the emergence of Enterprise Resource Planning (ERP) applications many businesses employed people with knowledge of payroll, accounts, purchasing, sales and other systems, and others who needed to glue these disparate systems together. Again, this need has been much diminished since the emergence of ERP systems.
For the time being businesses will need data scientists to build the predictive models which help with future business decisions. However, suppliers already exist who offer some well-defined predictive solutions that require no understanding of their inner workings. There are dangers associated with this approach, but even over the short history of such offerings the level of sophistication has risen considerably.
The forecasts of dire shortages of data scientists are exaggerated. If every business needed to totally reinvent its predictive models, then yes, data scientists would be in very short supply. In reality, however, most businesses will do what they always do and simply buy a solution. Of course larger businesses will always need some people who have in-depth technical understanding, but many others may not.
To use the well-worn analogy, you don’t need to understand quantum mechanics to turn your TV on (all semiconductor devices depend on quantum mechanics). And so the average business will not need to know about data science to use predictive models, although some may choose to do so.
Many ‘data scientists’ will end up working for solutions suppliers, and probably rather fewer within end-user businesses. Business management will demand tools that allow them to manage, monitor and possibly even change the predictive models that are being used. These models will after all be increasingly determining the fate of the business.
And so the ‘Death of the Data Scientist’ is a bit of an overstatement, but the specter of hordes of people fine-tuning neural networks and models based on support vector machines probably isn’t going to materialize in many organizations. The commoditization of complex technologies is as immutable as the law of gravity, and data science will be no exception.