This post is co-authored by Jiahang Zhong, Head of Data Science at Zopa.
Zopa is a UK-based digital bank and peer-to-peer (P2P) lender. In 2005, Zopa launched the first-ever P2P lending company to give people access to simpler, better-value loans and investments. In 2020, Zopa received a full bank license to offer people more ways to feel good about money. Since 2005, it has lent out over £5 billion to almost half a million borrowers and generated over £250 million in interest for investors on the platform. Zopa’s key business objectives are to identify quality borrowers, offer competitive credit products to them, and provide a great customer experience. Technology and machine learning (ML) are at the core of its business, with applications ranging from credit risk modeling to fraud detection and customer service.
In this post, we use Zopa’s fraud detection system for loans to showcase how Amazon SageMaker Clarify can explain your ML models and improve your operational efficiency.
Every day, Zopa receives thousands of loan applications and lends out millions of pounds to their borrowers. Due to the nature of its products, Zopa is also a target for identity fraudsters. To combat this, Zopa uses advanced ML models to flag suspicious applications for human review, while leaving the majority of genuine applications to be approved by the highly automated system.
Although a primary objective of such models is to achieve great classification performance, another important concern at Zopa is the explainability of these models, for the following reasons:
- As a financial service provider, Zopa is obligated to treat customers fairly and provide reasonable visibility into its automated decisions.
- The data scientists at Zopa need to demonstrate the validity of the model and understand the impact of each input feature.
- The manual review by the underwriters can be quicker if they know why the model has considered a case as suspicious. They can also be more focused in their investigations and reduce friction in the customer experience.
The advanced ML algorithms used in Zopa’s fraud detector can learn the non-linear relationship and interactions between the input features. Instead of a constant proportional effect, an input feature can have different levels of impact on each model prediction.
To understand the impact of the input features in non-linear ML models, the data scientists at Zopa often used traditional feature importance methods such as partial dependence plots and permutation feature importance. However, these methods can only provide summary insights about the model for a specific population. For the purposes we described, Zopa needed to explain the contribution of each input feature to an individual model score. SHAP (SHapley Additive exPlanations), based on the concept of a Shapley value from the field of cooperative game theory, works well for such a scenario.
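To make the idea of a per-prediction contribution concrete, the following is a minimal sketch of exact Shapley values for one instance against a single baseline row. It is not Zopa's model or the SageMaker Clarify implementation: `toy_score`, the instance, and the baseline are hypothetical, and enumerating every feature coalition is only feasible for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of `instance` relative to one `baseline` row."""
    n = len(instance)

    def value(coalition):
        # Features in the coalition take the instance's value,
        # all other features keep the baseline's value.
        mixed = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(x):
    # Hypothetical non-linear score with two feature interactions.
    return 0.1 * x[0] + 0.2 * x[1] * x[2] + 0.05 * x[0] * x[1]


instance = [2.0, 1.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, instance, baseline)
print(phi)
```

In this toy case the three attributions come out at about 0.25, 0.35, and 0.30. They add up to the difference between the score of the instance and the score of the baseline, which is the property that lets each feature's share of an individual score be reported.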
There are multiple explainability techniques for individual inference to choose from, each with its own pros and cons. For example, Tree SHAP is only applicable to tree-based models, and Integrated Gradients is specific to deep learning models. LIME is model agnostic but not always robust, and Kernel SHAP is computationally expensive. Because Zopa uses an ensemble of models, including gradient boosted trees and neural networks, the chosen explainability technique needs to accommodate the full range of models used.
As a contrastive explainability technique, SHAP values are calculated by evaluating the model on synthetic data generated against a baseline sample. The explanations of the same case can be different depending on the choices of this baseline sample. This can be partly due to the distinct distributions of the chosen baseline population, such as their demographics. It can also be mere statistical fluctuation due to the limited size of the baseline sample constrained by the computation expense. Therefore, it’s important for the data scientists at Zopa to try out various choices of baseline samples efficiently.
After the SHAP explanations are produced at the granularity of an individual inference, the data scientists at Zopa also want to have an aggregated view over a certain inference population to understand the overall impact. This allows them to spot common patterns and outliers and adjust the model accordingly.
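As a rough illustration of such an aggregated view, the sketch below averages the absolute SHAP values per feature over a small population and ranks the features. The feature names and SHAP values are made up for the example and do not come from Zopa's models.

```python
# Hypothetical feature names, for illustration only.
feature_names = ["device_age_days", "typing_speed", "address_tenure", "email_domain_risk"]

# Hypothetical local SHAP values: one row per application, one column per feature.
local_shap = [
    [0.30, -0.05, 0.02, 0.10],
    [-0.10, 0.02, -0.01, 0.40],
    [0.20, -0.03, 0.00, -0.20],
    [-0.40, 0.06, 0.01, 0.10],
]

n_rows = len(local_shap)
n_features = len(feature_names)

# Mean absolute SHAP value: how strongly a feature moves scores, in either direction.
mean_abs = [sum(abs(row[j]) for row in local_shap) / n_rows for j in range(n_features)]

# Signed mean: positive and negative contributions can cancel out.
mean_signed = [sum(row[j] for row in local_shap) / n_rows for j in range(n_features)]

# Rank features by global impact, largest first.
ranking = sorted(zip(feature_names, mean_abs), key=lambda item: item[1], reverse=True)
for name, impact in ranking:
    print(f"{name}: {impact:.3f}")
```

Here the first feature has the largest mean absolute impact even though its signed mean is close to zero, which is why the absolute value is usually the more useful summary when looking for influential features.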
Why SageMaker Clarify
SageMaker is a fully managed service to prepare, build, train, and deploy high-quality ML models quickly by bringing together a broad set of capabilities purpose-built for ML. SageMaker Clarify provides ML developers with greater visibility into their training data and models so they can identify and limit bias, and explain predictions.
A key reason Zopa chose SageMaker Clarify is that it is a fully managed service for model explanations, with pay-as-you-go billing and integration with the training and deployment phases of SageMaker.
Zopa trains its fraud detection model on SageMaker and can use SageMaker Clarify to view a feature attributions plot in SageMaker Experiments after the model has been trained. These details may be useful for compliance requirements or can help determine if a particular feature has more influence than it should on overall model behavior.
In addition, SageMaker Clarify uses a scalable and efficient implementation of Kernel SHAP. This gives Zopa better performance and saves the cost it would incur if it ran the open-source algorithm on compute resources it managed itself.
Also, Kernel SHAP is model agnostic, and Clarify supports efficient processing of models with multiple outcomes via Spark-based parallelization. This is important to Zopa because it typically uses a combination of different frameworks like XGBoost and TensorFlow, and requires explainability for each model outcome. SHAP values of individual predictions can be computed via a SageMaker Clarify processing job and made available to the underwriting team to understand individual predictions.
SHAP explanations are contrastive and account for deviations from a baseline. Different baselines can generate different explanations, and SageMaker Clarify allows you to input a baseline of your choice. A non-informative baseline can be constructed as the average of the training dataset or as a random instance from it, whereas an informative baseline can be constructed by setting the non-actionable features to the same value as in the given instance. For more information about baseline choices and settings, see SHAP Baselines for Explainability.
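The two kinds of baseline can be sketched in a few lines. The feature names, the choice of which features count as non-actionable, and all values below are assumptions made for illustration.

```python
from statistics import mean

# Hypothetical features; which ones count as non-actionable is an assumption.
feature_names = ["loan_amount", "device_age_days", "age", "region_risk"]
non_actionable = {"age", "region_risk"}

# Hypothetical training rows and one instance to explain.
training_rows = [
    [5000.0, 400.0, 35.0, 0.10],
    [12000.0, 30.0, 52.0, 0.30],
    [7000.0, 200.0, 27.0, 0.20],
]
instance = [9000.0, 2.0, 41.0, 0.25]

# Non-informative baseline: the column-wise average of the training data.
average_baseline = [mean(column) for column in zip(*training_rows)]

# Informative baseline: start from the average baseline, then copy the
# non-actionable features from the instance being explained.
informative_baseline = [
    instance[j] if name in non_actionable else average_baseline[j]
    for j, name in enumerate(feature_names)
]

print(average_baseline)
print(informative_baseline)
```

Because the non-actionable features have the same value in the instance and in the informative baseline, they receive zero attribution, and the explanation concentrates on the features that can actually change.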
Zopa’s fraud detection models use a few dozen input features, such as application details, device information, interaction behavior, and demographics. For model building, the training dataset was extracted from their Amazon Redshift data warehouse and cleaned up before being stored into Amazon Simple Storage Service (Amazon S3). Because Zopa has its own in-house ML library for both feature engineering and ML framework support, it uses the bring your own container (BYOC) approach to leverage the SageMaker managed services and advanced functionalities such as hyperparameter optimization. The optimized models are then deployed through a Jenkins CI/CD pipeline to the existing production system and serve as a microservice for real-time fraud detection as part of Zopa’s customer-facing platform.
As previously mentioned, model explanations are carried out both during model training for model validation and after deployment for model monitoring and generating insights for underwriters. These are done in a non-customer-facing analytical environment because the computation is heavy and latency is not critical. Zopa uses the SageMaker MMS model serving stack in a similar BYOC fashion to register the models for the SageMaker Clarify processing job. SageMaker Clarify spins up an ephemeral model endpoint and invokes it for millions of predictions on synthetic contrastive data. These predictions are then used to compute SHAP values for each individual case, which are stored in Amazon S3.
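A SageMaker Clarify processing job is driven by an analysis configuration file. The sketch below builds one as a Python dictionary and prints it as JSON. The key names follow the documented analysis configuration layout as we understand it and should be checked against the current SageMaker Clarify documentation. The model name, feature headers, baseline rows, instance type, and sample count are hypothetical placeholders, not Zopa's settings.

```python
import json

# Hypothetical feature columns of the dataset to explain (no label column).
feature_headers = ["loan_amount", "device_age_days", "typing_speed", "email_domain_risk"]

# Hypothetical baseline rows, for example sampled from approved non-fraud applications.
baseline_rows = [
    [8000.0, 210.0, 4.2, 0.10],
    [6500.0, 365.0, 3.9, 0.05],
]

analysis_config = {
    "dataset_type": "text/csv",
    "headers": feature_headers,
    "methods": {
        "shap": {
            "baseline": baseline_rows,       # baseline of your choice
            "num_samples": 500,              # synthetic samples per explained record
            "agg_method": "mean_abs",        # how local values are aggregated globally
            "save_local_shap_values": True,  # keep per-record SHAP values in the output
        }
    },
    "predictor": {
        # The registered model that the ephemeral endpoint is created from.
        "model_name": "fraud-detector-byoc",  # hypothetical name
        "instance_type": "ml.m5.xlarge",
        "initial_instance_count": 1,
        "content_type": "text/csv",
        "accept_type": "text/csv",
    },
}

config_json = json.dumps(analysis_config, indent=2)
print(config_json)
```

The `predictor` section describes the temporary endpoint that serves the predictions on synthetic data, and the `shap` section carries the baseline and sampling settings discussed in this post.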
As mentioned above, an important parameter of the SHAP explainability technique is the choice of the baseline sample. For the fraud detection model, the primary concern of explanation is on those instances that are classified as suspicious. Zopa’s data scientists use an informative baseline sample from the population of past approved non-fraud applications, to explain why those flagged instances are considered suspicious by the model. With SageMaker Clarify, Zopa can also quickly experiment with baseline samples of different sizes, to determine the final baseline sample which gives low statistical uncertainty while keeping the computation cost reasonable.
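The trade-off between baseline size and statistical uncertainty can be illustrated with a simplified simulation. It assumes a linear score, for which the SHAP value of a feature is its weight times the difference between the instance value and the baseline mean. The weight, the instance value, and the synthetic population are all hypothetical.

```python
import random
from statistics import pstdev

random.seed(0)

WEIGHT = 0.8          # hypothetical coefficient of one feature in a linear score
INSTANCE_VALUE = 2.5  # hypothetical value of that feature for a flagged application

# Hypothetical population of approved, non-fraud applications (one feature only).
population = [random.gauss(1.0, 1.0) for _ in range(20000)]


def shap_for_baseline(sample):
    # Linear model: SHAP value = weight * (instance value - baseline mean).
    return WEIGHT * (INSTANCE_VALUE - sum(sample) / len(sample))


def spread(baseline_size, repeats=200):
    # Standard deviation of the SHAP value across repeated random baseline samples.
    values = [
        shap_for_baseline(random.sample(population, baseline_size))
        for _ in range(repeats)
    ]
    return pstdev(values)


spreads = {size: spread(size) for size in (10, 100, 1000)}
for size, value in spreads.items():
    print(f"baseline size {size:>4}: std of SHAP value = {value:.4f}")
```

The spread shrinks as the baseline sample grows, while the number of model invocations grows with it. The final baseline size is a compromise between the two.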
For model validation and monitoring, the global feature impact can be examined by aggregating the SHAP values over the training and monitoring data. The aggregated values are available in the SageMaker Experiments panel. To give insights to operations, the data scientists select, for each individual case, the features that contributed positively to the fraud score (indicating a likely fraudster), and report them to the underwriting team ordered by the SHAP value of each feature.
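The per-case report for underwriters can be sketched as follows. The function name, the optional cap on the number of features, the feature names, and the SHAP values are illustrative assumptions rather than Zopa's actual reporting code.

```python
def top_fraud_drivers(feature_names, shap_row, max_features=None):
    """Return features that pushed the fraud score up, largest contribution first."""
    positive = [
        (name, value)
        for name, value in zip(feature_names, shap_row)
        if value > 0
    ]
    positive.sort(key=lambda item: item[1], reverse=True)
    # `max_features` is an optional cap added for illustration.
    return positive if max_features is None else positive[:max_features]


# Hypothetical feature names and SHAP values for one flagged application.
feature_names = ["device_age_days", "typing_speed", "address_tenure", "email_domain_risk"]
flagged_case = [0.42, -0.10, 0.05, 0.18]

report = top_fraud_drivers(feature_names, flagged_case)
for name, value in report:
    print(f"{name}: +{value:.2f}")
```

Features with negative SHAP values are left out because they lowered the fraud score and therefore do not explain why the case was flagged.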
The following diagram illustrates the solution architecture.
For a regulated financial service company like Zopa, it’s important to understand how each factor contributes to its ML model’s decision. Having visibility into the reasoning of the model gives confidence to its stakeholders, both internal and external. It also helps its operations team respond faster and provide a better service to their customers. With SageMaker Clarify, Zopa can now produce model explanations more quickly and seamlessly.
To learn more about SageMaker Clarify, see What Is Fairness and Model Explainability for Machine Learning Predictions?
About the Authors
Hasan Poonawala is a Machine Learning Specialist Solution Architect at AWS, based in London, UK. Hasan helps customers design and deploy machine learning applications in production on AWS. He is passionate about the use of machine learning to solve business problems across various industries. In his spare time, Hasan loves to explore nature outdoors and spend time with friends and family.
Jiahang Zhong is the Head of Data Science at Zopa. He is responsible for data science and machine learning projects across the business, with a focus on credit risk, financial crime, operation optimization, and customer engagement.