It’s just too easy to become enamoured with sexy data mining algorithms. Throw randomly generated data into a support vector machine or a neural network and chances are it will make some (non)sense of it.
An analyst using these algorithms correctly will run a whole battery of checks and validation to make sure that patterns are not just ghosts in the data set, but have some existence in reality. It is at this point, however, that disappointment may set in: despite churning through dozens of algorithms and permutations of various parameters, there may be a distinct lack of useful, verified patterns.
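The "ghosts in the data" problem is easy to demonstrate. In this sketch (illustrative only – pure noise, no real data set) an SVM appears to learn random labels almost perfectly on the training set, while cross-validation exposes the patterns as illusory, with accuracy near the 50% chance level:

```python
# Fit an SVM to pure noise: training accuracy looks impressive,
# cross-validated accuracy hovers around chance.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))     # 100 rows of random features
y = rng.integers(0, 2, size=100)   # random binary labels

model = SVC(kernel="rbf").fit(X, y)
train_acc = model.score(X, y)                                    # high
cv_acc = cross_val_score(SVC(kernel="rbf"), X, y, cv=5).mean()   # ~0.5

print(f"train accuracy: {train_acc:.2f}")
print(f"cross-validated accuracy: {cv_acc:.2f}")
```

The gap between the two numbers is exactly the checking and validation step doing its job.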
The secret of data mining (and by proxy predictive analytics) is feature engineering. This pompous name represents the art of understanding your data, understanding the problems you wish to solve, and understanding your data mining tools – and, as a result of that understanding, presenting data to the data mining process in the most appropriate manner. Simple examples are always best. One of the classic problems in data mining is how to represent a person’s age. If our data mining task is to determine car driver insurance risk, it might be better to transform a continuous age variable into a category – young/middle-aged/old. Beyond this, it might be necessary to combine variables into a composite – some combination of age and sex, for example, to incorporate the fact that women tend to be safer drivers than men (in the UK at least).
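Both transformations are a few lines of code. A minimal sketch of the age example – the cut-offs (25, 60) and the category names are illustrative assumptions, not figures from any insurance table:

```python
# Discretise a continuous age into bands, then cross the band with sex
# to form a single composite categorical feature for the miner.
def age_band(age):
    if age < 25:
        return "young"
    elif age < 60:
        return "middle"
    return "old"

def risk_feature(age, sex):
    # Composite feature, e.g. "young_male" or "middle_female"
    return f"{age_band(age)}_{sex}"

drivers = [(19, "male"), (42, "female"), (67, "male")]
features = [risk_feature(age, sex) for age, sex in drivers]
print(features)  # ['young_male', 'middle_female', 'old_male']
```

The point is not the code but the choice of cut-offs and combinations, which comes from understanding the domain, not from the algorithm.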
It serves no useful purpose to indulge flights of fancy here: the manipulation of attributes in your data should be based on an understanding of the data and of the tools you use. This is feature engineering, and it shows that in data mining the machinery is essentially dumb; the extraction of meaning comes down to human understanding. In my experience it is skilful feature engineering that has the biggest payback.