As data scientists, we solve a variety of problems on a daily basis. Some of them are tasks we've completed many times before, while others are completely new to us, and we have to understand the business logic behind them before we start building models. The latter was the case when our client, a bank, approached us a couple of months ago with a problem they were tackling: whether to grant or deny loans to gamblers.
As you are probably aware, banks keep records of all of the transactions you make. From these records, they know, for example, how much you spent and where the transaction took place. This information can be very useful for bankers in the loan decision process, because it lets them estimate a client's behaviour and ability to repay a loan on time. If they spot an irregularity in a client's behaviour, they can mark them as a risky client and offer them a worse interest rate or even reject their loan application.
The odds: Gamblers = risky clients (but not always)
One of the signs of a risky client could be gambling. If a client spends any amount of money (even if it's only a few euros) on gambling or betting, they are automatically identified as a gambler. But why are banks so meticulous when it comes to identifying gamblers? The answer lies in the numbers. The default rate across all loans the bank provides is around 2%. Now let's focus only on clients identified as gamblers and check their default rate. It may or may not come as a surprise to you that gamblers' default rate is more than twice as high, which could easily cause significant financial trouble for the bank. On the other hand, rejecting all loan applicants identified as gamblers would be unfair to clients, and the bank would lose a lot of potential profit from loans.
'Good' gamblers vs. 'bad' gamblers
How do we distinguish the 'good' gamblers from the 'bad' gamblers? And how do we find a solution that benefits both the bank and its clients? Our solution was to build a machine learning classification model which discovered hidden structures in the data. With this model, we were able to tell which client was only an occasional 'gambler', spending a few euros on roulette in a local pub while hanging out with friends and enjoying a beer (i.e. a 'good gambler'), and which was a gambling addict spending a huge amount of money on gambling or betting every few days (i.e. a 'bad gambler').
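To make the distinction between occasional and heavy gambling more concrete, here is a minimal sketch of how a client's transaction history could be condensed into a few numeric inputs for a classifier. It is an illustration only, not the model we built: the `Transaction` record, the `is_gambling` flag and the feature names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Transaction:
    # Hypothetical record layout; real bank data will look different.
    day: date
    amount_eur: float
    is_gambling: bool  # assumed to be derived upstream, e.g. from the merchant type


def gambling_features(transactions):
    """Summarise how much and how often a client gambles."""
    gambling = sorted(
        (t for t in transactions if t.is_gambling), key=lambda t: t.day
    )
    count = len(gambling)
    total = sum(t.amount_eur for t in gambling)
    # Days between consecutive gambling transactions (empty if fewer than two).
    gaps = [(b.day - a.day).days for a, b in zip(gambling, gambling[1:])]
    return {
        "gambling_count": count,
        "gambling_total_eur": total,
        "gambling_avg_eur": total / count if count else 0.0,
        "avg_days_between": sum(gaps) / len(gaps) if gaps else None,
    }
```

A client with a handful of small, widely spaced gambling transactions and a client with large amounts every few days would end up with very different values here, which is the kind of signal a classification model can pick up on.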
This classification helped the bank better understand its clients and make better loan decisions. First, our model identified a small group of clients with a default rate of ~80%, whose loan applications can be rejected automatically, which will save the bank tens of thousands of euros on defaulted loans annually. Moreover, the model also identified a larger group of clients with a default rate of ~30-40%. These loan applications need to be handled manually, but if the process is set up correctly, this can lead to further loan income for the bank and extra savings on defaulted loans. And that was the goal: to achieve a win-win situation for the bank and for its clients.
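The routing described above (automatic rejection for the highest-risk group, manual handling for the middle group) could be sketched roughly as follows. The function name, the tier labels and both thresholds are assumptions chosen for illustration so that the ~80% and ~30-40% groups fall into separate tiers; they are not the values used by the bank.

```python
def triage(default_probability, reject_above=0.7, review_above=0.25):
    """Route a loan application based on a predicted default probability.

    Both thresholds are hypothetical and would need to be tuned
    against the bank's own data and risk appetite.
    """
    if not 0.0 <= default_probability <= 1.0:
        raise ValueError("default_probability must be between 0 and 1")
    if default_probability >= reject_above:
        return "auto_reject"
    if default_probability >= review_above:
        return "manual_review"
    return "standard_process"


# Example: one application from each of the groups mentioned in the text.
for p in (0.80, 0.35, 0.02):
    print(p, triage(p))
```

Keeping the thresholds as parameters makes it easy to experiment with how many applications end up in manual review versus automatic rejection.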
If you have a similar problem in your business or would like to learn more about our solution, let us know.