Date: 2022-08-28

Machine learning models make various decisions, whether it is rejecting loans, determining if a patient is sick, or deciding whether a suspect is guilty of a certain crime. Since these are statistical models, they carry an inherent potential for error. Recently exposed examples of such errors and biases include the Apple Card fair lending fiasco and a UK government-funded project that used AI to predict gun and knife crime.

In order to trust machine learning models, humans need explanations. For example, it makes sense for a loan to be rejected due to low income, but if a loan is rejected due to the client’s zip code, this might indicate there’s bias in the model. In addition, if a loan is rejected because the distance between the owner and their business is negative, this is an indication that there’s a bug in the distance feature.

When choosing a machine learning algorithm, there’s usually a trade-off between the algorithm’s interpretability and its accuracy. Traditional methods such as decision trees and linear regression can be directly explained, but their ability to provide accurate predictions is limited. More modern methods such as Random Forests and Neural Networks give better predictions but are more difficult to interpret.

The field of machine learning model interpretation has seen great advances in the last few years, with methods such as LIME and SHAP. Some background is required to fully understand these methods, but analyzing the underlying data can provide a simple and intuitive interpretation. For this we first need to understand: How do humans reason?

Let’s review the common example of the rooster’s crow: If you grew up in the countryside, you might know that roosters always crow before the sun rises. Can we infer that the rooster’s crow makes the sun rise? It’s clear that the answer is no. But why?

Humans have a mental model of reality. We know that if a rooster didn’t crow, the sun would still rise. This type of reasoning is called a counterfactual.

Counterfactuals

Counterfactual reasoning is the common way in which humans make sense of reality. It cannot be scientifically proven, as the concept of Descartes’ demon illustrates: if event B happens right after event A, one can never be sure that there isn’t some evil demon that causes B to happen right after A. However, since humans will reason this way anyway, it makes sense to try to formalize this process. If you are curious about this topic, read “The Book of Why” by Judea Pearl, a prominent computer science researcher and philosopher, and Dana Mackenzie.

Application in real life

At my company, we have predictive models that assess a client’s risk when they apply for a loan. The models use historical data in a tabular format, in which each client has a list of meaningful features such as payment history, income and incorporation date. Using this data, we predict the client’s level of risk and assign the client to one of 6 risk groups (buckets). We interpret the models’ predictions using both local and global explanations.

Local explanations aim to explain a single prediction. Each feature’s value is replaced, one feature at a time, with the median of the representative population, and the prediction is recomputed. The feature whose replacement causes the largest change in score is displayed in a textual explanation. For example:

A longer delay in scheduled payments deteriorated the client’s risk level.

or in its more detailed version:

The client had a 21-day delay in payments compared to a median delay of 0 in the population. This caused the client’s risk level to deteriorate from level D to E.
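The local explanation described above can be sketched in a few lines of Python. This is a minimal illustration rather than our production code: the scoring function `toy_score`, the feature names `delay_days` and `years_incorporated`, the bucket labels A to F with equal-width cutoffs, and all of the data are assumptions made up for the example.

```python
from statistics import median

BUCKETS = "ABCDEF"  # hypothetical labels for the 6 risk levels, best to worst


def toy_score(client):
    """Stand-in for a trained model (an assumption): risk score in [0, 1), higher is riskier."""
    raw = 0.62 + 0.01 * client["delay_days"] - 0.02 * client["years_incorporated"]
    return min(max(raw, 0.0), 0.999)


def to_bucket(score):
    """Map a score in [0, 1) to one of six equal-width risk levels."""
    return BUCKETS[int(score * len(BUCKETS))]


def local_explanation(score_fn, client, population):
    """Swap each feature for its population median, one at a time, and
    return details of the swap that changes the score the most."""
    actual = score_fn(client)
    best = None
    for feature, value in client.items():
        med = median(row[feature] for row in population)
        counterfactual_score = score_fn({**client, feature: med})
        change = actual - counterfactual_score  # > 0: the actual value raises risk
        if best is None or abs(change) > abs(best["change"]):
            best = {
                "feature": feature,
                "value": value,
                "median_value": med,
                "change": change,
                "counterfactual_bucket": to_bucket(counterfactual_score),
                "actual_bucket": to_bucket(actual),
            }
    return best


# Invented data, for illustration only.
population = [
    {"delay_days": 0, "years_incorporated": 1},
    {"delay_days": 0, "years_incorporated": 3},
    {"delay_days": 0, "years_incorporated": 5},
    {"delay_days": 5, "years_incorporated": 7},
    {"delay_days": 21, "years_incorporated": 9},
]
client = {"delay_days": 21, "years_incorporated": 3}

result = local_explanation(toy_score, client, population)
print(
    f"{result['feature']}: {result['value']} vs. median {result['median_value']}; "
    f"risk level {result['counterfactual_bucket']} -> {result['actual_bucket']}"
)
```

With this toy model, the payment delay is the feature whose replacement moves the score the most, and the returned fields are enough to fill in a sentence like the detailed explanation above. Passing the scoring function as an argument keeps the sketch independent of any particular model type.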

Global explanations

Global explanations aim to explain each feature’s direction of effect in the model as a whole. For every client in the sample, the value of a single feature is replaced with one extreme value. This value can be, for example, the 95th percentile, i.e., almost the largest value in the sample (95% of the values are smaller than it).

The changes in the scores’ distribution are calculated and visualized in the chart below. The figure shows how clients’ risk levels change when each feature’s value is increased to the 95th percentile.

When the first listed feature (length of delay in payments) is increased to the 95th percentile, a large portion of the clients have their risk level deteriorate by one or more levels. A person who reviews this behaviour can easily accept that a delay in payments is expected to cause a worse risk level.

The second feature, monthly balance increase, has a mixed effect: a small percentage of the clients have their risk level deteriorate, while a larger percentage have their risk level improve. This mixed effect might indicate there’s some interaction between features, although that is not something this method can expose directly.

The third feature, years since incorporation, improves the client’s risk level when it is increased to the 95th percentile. Here too, it is easy to accept that businesses that have been around for longer periods are likely to be more stable and therefore present a lower risk level.
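The global procedure described in this section can be sketched in the same spirit. Again, this is only an illustration under stated assumptions: `toy_score`, the feature names, the equal-width bucket cutoffs, the nearest-rank percentile definition and the data are all invented for the example and are not taken from our models.

```python
import math
from collections import Counter

N_BUCKETS = 6  # the 6 risk levels; index 0 is the best, 5 the worst


def toy_score(client):
    """Stand-in for a trained model (an assumption): risk score in [0, 1), higher is riskier."""
    raw = 0.62 + 0.01 * client["delay_days"] - 0.02 * client["years_incorporated"]
    return min(max(raw, 0.0), 0.999)


def bucket_index(score):
    """Map a score in [0, 1) to a risk level index using equal-width cutoffs."""
    return int(score * N_BUCKETS)


def percentile(values, q):
    """Nearest-rank percentile: the smallest sample value with at least q% of the sample at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def global_explanation(score_fn, population, feature, q=95):
    """Set one feature to its q-th percentile for every client and return
    the share of clients per risk-level shift (positive = deteriorated)."""
    extreme = percentile([row[feature] for row in population], q)
    shifts = Counter()
    for row in population:
        before = bucket_index(score_fn(row))
        after = bucket_index(score_fn({**row, feature: extreme}))
        shifts[after - before] += 1
    n = len(population)
    return {shift: count / n for shift, count in sorted(shifts.items())}


# Invented data, for illustration only.
population = [
    {"delay_days": 0, "years_incorporated": 1},
    {"delay_days": 0, "years_incorporated": 3},
    {"delay_days": 0, "years_incorporated": 5},
    {"delay_days": 5, "years_incorporated": 7},
    {"delay_days": 21, "years_incorporated": 9},
]

for feature in ("delay_days", "years_incorporated"):
    print(feature, global_explanation(toy_score, population, feature))
```

Each returned dictionary maps a shift in risk level to the share of clients that experienced it, which is the kind of distribution a stacked bar chart per feature can display. A feature with sizeable shares on both the positive and the negative side would correspond to the mixed effect discussed above.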

The counterfactual approach allows for simple and intuitive reasoning that anyone can understand, and hopefully can increase the trust humans place in machine learning models.

Written by Nathalie Hauser, Manager, Data Science at BlueVine.