Each predictive model carries a certain amount of “metadata”, such as its structural type, the date range of the data sample used for its construction, its segmentation criteria and how the model is used in production. Some of these attributes relate entirely to how the model was created, while others reflect changing circumstances as time progresses. Regulators are keen to ensure that all these variables are understood, and that the processes that employ predictive models are equally well understood. This is not a trivial requirement, and eight of the major issues are outlined below.
Data Sample Preparation
Regulators require that you demonstrate that your model validation sampling techniques are complete, responsible and relevant. To avoid over-fitting, the data used to validate a model must be completely independent of the data used to develop it. The sample size should also be large enough to deliver sufficient numbers of the various outcome classes of interest (a minimum of 300 for each class is often recommended). It is also necessary to consider the economic, market and product conditions that existed when the samples were generated. Finally, data hygiene (data quality and how issues such as missing values and outliers are handled) and data bias (for example, the decision strategies that filter new applicants) must be accounted for, and regulators will expect a robust defence of practices in these areas.
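As a minimal illustration of the independence and sample-size requirements, the sketch below (in Python with pandas, using hypothetical column names `application_date` and `outcome`) carves an out-of-time validation sample from a development dataset and checks the commonly recommended minimum of 300 cases per outcome class.

```python
import pandas as pd

def split_development_validation(df: pd.DataFrame,
                                 cutoff: str = "2022-01-01",
                                 min_class_size: int = 300):
    """Out-of-time split: validation data is fully independent
    of the development sample."""
    dates = pd.to_datetime(df["application_date"])
    dev = df[dates < cutoff]
    val = df[dates >= cutoff]

    # Check that each outcome class (e.g. good/bad) is large enough;
    # 300 per class is the commonly quoted minimum.
    counts = val["outcome"].value_counts()
    thin = counts[counts < min_class_size]
    if not thin.empty:
        raise ValueError(
            f"Validation classes below {min_class_size}: {thin.to_dict()}")
    return dev, val
```

Splitting on date rather than at random is deliberate: a random split would leak the economic conditions of the development window into the validation sample.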
Segmentation Transparency
Regulators require that you clearly document how you segmented the subpopulations within your portfolio and how you determined the specific actions you took against each. They will be interested in whether a subpopulation is defined empirically or by a domain expert, and they will of course be vigilant that no discriminatory variables, such as race or sex, are used here. The key to successful segmentation is identifying the right variables to split a population into actionable segments, and automated tools and techniques now make this process significantly faster and easier.
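One common empirical approach, sketched below under the assumption of a pandas DataFrame with numeric predictors and a binary `outcome` column (all names hypothetical), is to fit a shallow decision tree and treat its split rules as candidate segment definitions for a domain expert to review. Protected attributes are excluded before the tree ever sees the data.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical protected attributes: never offered to the tree at all.
PROHIBITED = {"race", "sex"}

def suggest_segments(df: pd.DataFrame, target: str = "outcome"):
    features = [c for c in df.columns
                if c != target and c not in PROHIBITED]
    # A shallow tree yields a small number of explainable segments;
    # the leaf-size floor keeps each segment actionable.
    tree = DecisionTreeClassifier(max_depth=2, min_samples_leaf=500)
    tree.fit(df[features], df[target])
    # The printed rules are candidate segment definitions, to be
    # reviewed and documented by a domain expert before use.
    print(export_text(tree, feature_names=features))
    return tree
```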
Model Types
The driving principle here is that ‘things should be as simple as possible, but not simpler’. Models should be easy to understand and explain, both for customers and for regulators. They should also be palatable: intuitive and consistent with common sense. Some model types are less suitable in this respect, including neural networks and support vector machines; others are more suitable, notably decision trees, scorecards and some clustering methods.
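For context, a scorecard is little more than a table of points per attribute band, which is why it is so easy to explain and defend. The fragment below is purely illustrative, with made-up characteristics and point values.

```python
# Purely illustrative scorecard: made-up characteristics and points.
# Each band is (low, high, points), with low inclusive, high exclusive.
SCORECARD = {
    "age": [(18, 25, 10), (25, 40, 25), (40, 120, 35)],
    "months_at_address": [(0, 12, 5), (12, 60, 20), (60, 999, 30)],
}

def score(applicant: dict) -> int:
    """Sum the points for the band each characteristic falls into.
    Every point assignment can be read straight off the table,
    which is what makes scorecards palatable to a regulator."""
    total = 0
    for name, bands in SCORECARD.items():
        for low, high, points in bands:
            if low <= applicant[name] < high:
                total += points
                break
    return total

print(score({"age": 33, "months_at_address": 24}))  # 25 + 20 = 45
```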
Model Effectiveness
Model validation is carried out to ensure that a model performs according to business objectives. This should happen at least once a year and, depending on the volatility of markets and economies, possibly more often. Regulators want to see that you validate on a consistent and reliable basis, and that your process is repeatable and can be communicated clearly. They also want to know what threshold metrics you have put in place and what actions you take (such as more frequent reassessment, recalibration or rebuilding) when a model falls below an identified threshold. In the US, the OCC/Fed requires that your validation processes be reviewed by parties independent of those who developed the model and of those who designed and implemented the validation process. Globally, Basel puts an equally strong emphasis on governance. An independent reviewer should have the authority to challenge model developers, so that this input is considered carefully rather than summarily overruled. Your validation checklist should include standard measures (K-S, divergence, ROC area, Gini coefficient, etc.), along with metrics that confirm the model rank-orders risk by score range. You should compute and analyse these metrics at least quarterly, and ideally monthly, in order to identify changes quickly.
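A minimal sketch of the core discrimination measures follows, assuming arrays of scores and binary outcomes, and assuming that higher scores indicate lower risk. The rank-ordering check simply verifies that the observed bad rate falls as the score band rises.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

def validation_metrics(scores: np.ndarray, bad: np.ndarray) -> dict:
    """Standard discrimination measures for a binary-outcome model.
    `bad` is 1 for the negative outcome, 0 otherwise; higher scores
    are assumed to mean lower risk."""
    # Gini coefficient derived from the ROC area.
    auc = roc_auc_score(bad, -scores)
    gini = 2 * auc - 1

    # Kolmogorov-Smirnov: maximum gap between the cumulative score
    # distributions of bads and goods.
    order = np.argsort(scores)
    cum_bad = np.cumsum(bad[order]) / bad.sum()
    cum_good = np.cumsum(1 - bad[order]) / (1 - bad).sum()
    ks = float(np.max(np.abs(cum_bad - cum_good)))

    # Rank-ordering check: the bad rate should fall monotonically
    # as the score band rises.
    bands = pd.qcut(scores, 10, duplicates="drop")
    bad_rate = pd.Series(bad).groupby(bands, observed=True).mean()
    return {"gini": gini, "ks": ks,
            "rank_orders": bool(bad_rate.is_monotonic_decreasing)}
```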
Performance Tracking
Over time, many factors can impact model performance. These include shifts in population makeup or behaviour, economic changes, and changes to credit and collection policies. Regulators expect you to monitor models on a continual basis so you can recalibrate and rebuild them in a timely manner and modify your strategies accordingly. A variety of reports help in this respect, including:
- Population Stability Report. This report answers the question, “Is my population scoring differently from the development population or another baseline measure?” (A minimal sketch of the underlying stability calculation follows this list.)
- Characteristic Analysis Report. If you’re detecting shifts in score distribution, this report can help explain why. It determines which variables in a model are scoring differently at the attribute level, and how many points are being added or lost compared to the baseline for each characteristic.
- Delinquency Distribution Report. This report illustrates the scorecard’s effectiveness at rank ordering accounts by risk. It demonstrates the relationship between delinquency and score for accounts within a particular time period.
- Vintage Analysis Report. This report compares a series of current delinquency distribution reports, isolating accounts of similar “time on books.” This enables you to spot trends earlier than if you only analysed total portfolio results.
- Odds-to-Score Report. For binary outcome models, the probability of negative outcomes per score band may shift over time and threaten the business value of decisions made at each score range. This should be monitored regularly to detect shifts and rotations in this important relationship.
- Time Series of All Key Metrics. Measure key metrics for each validation period’s window. To detect new trends in model stability and performance, you should review these measures across all time periods, from baseline metrics through each subsequent validation.
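Several of these reports reduce to banded comparisons of distributions. As one example, flagged in the first item above, here is a minimal sketch of the population stability index, assuming two arrays of scores (development baseline and current period) with enough distinct values to band cleanly. The thresholds in the closing comment are a widely quoted rule of thumb, not a regulatory standard.

```python
import numpy as np

def population_stability_index(expected: np.ndarray,
                               actual: np.ndarray,
                               n_bands: int = 10) -> float:
    """PSI between the development (expected) and current (actual)
    score distributions, banded on the development deciles."""
    edges = np.percentile(expected, np.linspace(0, 100, n_bands + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # catch out-of-range scores
    exp_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    act_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Small floor avoids division by zero in sparse bands.
    exp_pct = np.clip(exp_pct, 1e-6, None)
    act_pct = np.clip(act_pct, 1e-6, None)
    return float(np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct)))

# Common rule-of-thumb reading (not a regulatory standard):
# < 0.10 stable, 0.10-0.25 monitor, > 0.25 investigate.
```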
Monitoring of Overrides
Any time you override a score, regulators will require that you document and monitor that decision carefully. Your overrides should be based on clear and consistent guidelines. Regulators will ask questions such as: What is your cutoff for an override? What authority level is required for override approval? How many overrides do you make each month?
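A simple monthly override report can be produced directly from a decision log. The sketch below assumes hypothetical columns `date`, `system_decision` and `final_decision`; breaking the results down further by approver level or override reason is a straightforward extension.

```python
import pandas as pd

def override_report(decisions: pd.DataFrame) -> pd.DataFrame:
    """Monthly override counts and rates from a decision log with
    hypothetical columns 'date', 'system_decision', 'final_decision'."""
    d = decisions.copy()
    d["override"] = d["system_decision"] != d["final_decision"]
    d["month"] = pd.to_datetime(d["date"]).dt.to_period("M")
    report = d.groupby("month").agg(
        decisions=("override", "size"),
        overrides=("override", "sum"),
    )
    report["override_rate"] = report["overrides"] / report["decisions"]
    return report
```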
Defence of Decision Strategies
No matter how complex your decision strategies, regulators will expect you to explain and defend them with empirical results. Regulators will want to know how you develop, track and implement your strategies. You must also show the results of your strategies, including the realised losses, gains and exposures arising from your decisions. Most importantly, regulators will want to know how you balance the need to increase profits with the need to contain risk.
Document Thoroughly
Regulators worldwide place tremendous importance on documentation and oversight. When a regulator asks you for proof of when you last ran a validation report, who approved the report and what action you took, you need the right tools in place to quickly retrieve the supporting evidence.
With that in mind, you should keep an inventory of every model within your operating environment, cataloguing its purpose, usage and restrictions on use.
List the types and sources of inputs. Your documentation should be detailed enough that anyone unfamiliar with the model can understand how it operates, its limitations and your key assumptions. You should also be able to retrieve documentation for any vendor-supplied models, and demonstrate that you understand how they work.
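As a sketch of what such an inventory entry might record (the field names are illustrative, and the example assumes Python 3.10+ for the type hints):

```python
from dataclasses import dataclass, field

@dataclass
class ModelRecord:
    """One inventory entry per model; field names are illustrative."""
    name: str
    purpose: str                    # e.g. "application risk scoring"
    usage: str                      # where and how it is used in production
    restrictions: list[str]         # documented restrictions on use
    input_sources: dict[str, str]   # input name -> source system
    vendor: str | None = None       # set for vendor-supplied models
    key_assumptions: list[str] = field(default_factory=list)

# A hypothetical entry, showing the level of detail expected.
inventory = [
    ModelRecord(
        name="app_score_v3",
        purpose="new-applicant credit risk",
        usage="originations decisioning, cutoff at score 620",
        restrictions=["not validated for the SME portfolio"],
        input_sources={"bureau_score": "credit bureau feed"},
    ),
]
```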
This in many ways is the price paid for extensive use of predictive models. It shifts the management overhead from individual transactions to the mechanisms (in this case predictive models) that allow transactions to be largely automated. It is a strategy with a very substantial return, provided the mechanics of automation (the models) are well developed, well understood and well managed. Regulatory requirements simply ensure that this is indeed the case, and to achieve this end organisations can choose to automate the management of models by using a suitable set of tools. In truth there is no choice: as time progresses, the lack of suitable tools, infrastructure and processes will severely inhibit the use of predictive technologies and increase the risk of falling foul of regulatory requirements.
The previous article in this series is Predictive Models – Risks and Benefits
The next article in this series is Model Management Tools