Real-time fraud detection using AWS serverless and machine learning services

Online fraud has a widespread impact on businesses and requires an effective end-to-end strategy to detect and prevent new account fraud and account takeovers, and to stop suspicious payment transactions. Detecting fraud close to the time it occurs is key to the success of a fraud detection and prevention system. The system should detect fraud as effectively as possible and alert the end-user as quickly as possible. The user can then choose to take action to prevent further abuse.

In this post, we show a serverless approach to detect online transaction fraud in near-real time. We show how you can apply this approach to various data streaming and event-driven architectures, depending on the desired outcome and the actions to take to prevent fraud (such as alerting the user about the fraud or flagging the transaction for additional review).

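This post implements three architectures: streaming data inspection and fraud detection/prevention, streaming data enrichment for fraud detection/prevention, and event data inspection and fraud detection/prevention.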

To detect fraudulent transactions, we use Amazon Fraud Detector, a fully managed service enabling you to identify potentially fraudulent activities and catch more online fraud faster. To build an Amazon Fraud Detector model based on past data, refer to Detect online transaction fraud with new Amazon Fraud Detector features. You can also use Amazon SageMaker to train a proprietary fraud detection model. For more information, refer to Train fraudulent payment detection with Amazon SageMaker.
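As a rough illustration of the GetEventPrediction action that all three architectures rely on, the following sketch builds the request parameters and reads an outcome from the response. The transaction field names (transaction_id, timestamp, customer_id, variables), the entity type "customer", and both helper functions are assumptions made for this example; your detector defines the actual event type, entity type, and variables.

```python
def build_prediction_request(transaction, detector_id, event_type_name):
    """Build keyword arguments for the GetEventPrediction action.

    The transaction keys used here are hypothetical; adapt them to your data.
    """
    return {
        "detectorId": detector_id,
        "eventId": transaction["transaction_id"],
        "eventTypeName": event_type_name,
        # ISO 8601 timestamp in UTC, for example "2023-01-01T00:00:00Z"
        "eventTimestamp": transaction["timestamp"],
        "entities": [
            {"entityType": "customer", "entityId": transaction["customer_id"]}
        ],
        # Event variable values are passed as strings
        "eventVariables": {
            name: str(value) for name, value in transaction["variables"].items()
        },
    }


def first_outcome(response):
    """Return the first outcome of the first rule result that has one, else None."""
    for rule_result in response.get("ruleResults", []):
        if rule_result.get("outcomes"):
            return rule_result["outcomes"][0]
    return None


# Hypothetical wiring with boto3 (not executed here):
#   client = boto3.client("frauddetector")
#   response = client.get_event_prediction(
#       **build_prediction_request(transaction, "my_detector", "my_event_type")
#   )
#   outcome = first_outcome(response)
```

Keeping the request construction separate from the API call makes it possible to check the parameter mapping without AWS credentials.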

Streaming data inspection and fraud detection/prevention

This architecture uses AWS Lambda and AWS Step Functions to inspect data from an Amazon Kinesis data stream in real time and to detect and prevent fraud using Amazon Fraud Detector. The same architecture applies if you use Amazon Managed Streaming for Apache Kafka (Amazon MSK) as the data streaming service. This pattern can be useful for real-time fraud detection, notification, and potential prevention. Example use cases include payment processing and high-volume account creation. The following diagram illustrates the solution architecture.

Streaming data inspection and fraud detection/prevention architecture diagram

The flow of the process in this implementation is as follows:

  1. We ingest the financial transactions into the Kinesis data stream. The source of the data could be a system that generates these transactions—for example, ecommerce or banking.
  2. The Lambda function receives the transactions in batches.
  3. The Lambda function starts the Step Functions workflow for the batch.
  4. For each transaction, the workflow performs the following actions:
    1. Persist the transaction in an Amazon DynamoDB table.
    2. Call the Amazon Fraud Detector API using the GetEventPrediction action. The API returns one of the following results: approve, block, or investigate.
    3. Update the transaction in the DynamoDB table with fraud prediction results.
    4. Based on the results, perform one of the following actions:
      1. Send a notification using Amazon Simple Notification Service (Amazon SNS) in case of a block or investigate response from Amazon Fraud Detector.
      2. Process the transaction further in case of an approve response.
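Steps 2 and 3 of this flow could be sketched as follows: a function decodes the base64-encoded Kinesis records in the batch and starts one Step Functions execution for them. This is a minimal sketch, not the sample repository's code. The workflow input shape ({"transactions": [...]}) and the function names are assumptions; the Step Functions client is passed in as a parameter so the logic can be exercised without AWS access.

```python
import base64
import json


def decode_kinesis_batch(event):
    """Decode the base64 payloads of a Kinesis batch into transaction dicts."""
    return [
        json.loads(base64.b64decode(record["kinesis"]["data"]))
        for record in event["Records"]
    ]


def start_workflow(event, sfn_client, state_machine_arn):
    """Start one workflow execution for the whole batch."""
    transactions = decode_kinesis_batch(event)
    # The input shape is an assumption; a Map state in the workflow
    # could then iterate over the "transactions" list.
    return sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps({"transactions": transactions}),
    )


# Hypothetical Lambda entry point (not executed here):
#   sfn = boto3.client("stepfunctions")
#   def handler(event, context):
#       return start_workflow(event, sfn, os.environ["STATE_MACHINE_ARN"])
```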

This approach allows you to react to potentially fraudulent transactions in real time, because you store each transaction in a database and inspect it before processing it further. In an actual implementation, you may replace the notification step for additional review with an action that is specific to your business process, for example, inspecting the transaction with another fraud detection model or conducting a manual review.
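One way to keep the review step replaceable is to pass it in as a function. The sketch below expresses the branching in Python for illustration only; in the workflow itself this would typically be a choice between states. The names handle_prediction and make_sns_review are hypothetical, and sending unexpected outcomes to review is a design assumption rather than something the architecture prescribes.

```python
import json


def handle_prediction(transaction, outcome, process, review):
    """Send approved transactions to `process` and everything else to `review`."""
    if outcome == "approve":
        return process(transaction)
    # "block", "investigate", and any unexpected outcome go to review
    return review(transaction)


def make_sns_review(sns_client, topic_arn):
    """Build a review action that publishes the transaction to an SNS topic."""

    def review(transaction):
        return sns_client.publish(
            TopicArn=topic_arn,
            Subject="Potentially fraudulent transaction",
            Message=json.dumps(transaction),
        )

    return review
```

Swapping the notification for a manual review queue or a second model then only requires passing a different `review` function.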

Streaming data enrichment for fraud detection/prevention

Sometimes, you may need to flag potentially fraudulent data but still process it; for example, when you’re storing the transactions for further analytics and collecting more data for constantly tuning the fraud detection model. An example use case is claims processing. During claims processing, you collect all the claims documents and then run them through a fraud detection system. A decision to process or reject a claim is then made—not necessarily in real time. In such cases, streaming data enrichment may fit your use case better.

This architecture uses a Lambda function, invoked through Amazon Kinesis Data Firehose data transformation, to enrich streaming data in real time with Amazon Fraud Detector predictions.

This approach doesn’t implement fraud prevention steps. We deliver enriched data to an Amazon Simple Storage Service (Amazon S3) bucket. Downstream services that consume the data can use the fraud detection results in their business logic and act accordingly. The following diagram illustrates this architecture.

Streaming data enrichment for fraud detection/prevention architecture diagram

The flow of the process in this implementation is as follows:

  1. We ingest the financial transactions into Kinesis Data Firehose. The source of the data could be a system that generates these transactions, such as ecommerce or banking.
  2. A Lambda function receives the transactions in batches and enriches them. For each transaction in the batch, the function performs the following actions:
    1. Call the Amazon Fraud Detector API using the GetEventPrediction action. The API returns one of three results: approve, block, or investigate.
    2. Update transaction data by adding fraud detection results as metadata.
    3. Return the batch of the updated transactions to the Kinesis Data Firehose delivery stream.
  3. Kinesis Data Firehose delivers data to the destination (in our case, the S3 bucket).
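The enrichment step above can be sketched as a Firehose data transformation function. Firehose passes records with a recordId and base64-encoded data, and expects every recordId back with a result of Ok, Dropped, or ProcessingFailed. In this sketch, the `predict` callable stands in for the Amazon Fraud Detector call, and the metadata field name fraud_prediction is an assumption.

```python
import base64
import json


def enrich_records(event, predict):
    """Return a Firehose transformation response with fraud outcomes added."""
    output = []
    for record in event["records"]:
        try:
            transaction = json.loads(base64.b64decode(record["data"]))
            # "fraud_prediction" is a hypothetical metadata field name
            transaction["fraud_prediction"] = predict(transaction)
            # Newline-delimit records so they stay separable in the S3 object
            payload = json.dumps(transaction) + "\n"
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(payload.encode("utf-8")).decode("utf-8"),
            })
        except Exception:
            # Any failure (bad JSON, prediction error): return the original
            # payload and let Firehose treat the record as failed
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Marking a record as ProcessingFailed instead of raising keeps one bad record from failing the whole batch.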

As a result, we have data in the S3 bucket that includes not only original data but also the Amazon Fraud Detector response as metadata for each of the transactions. You can use this metadata in your data analytics solutions, machine learning model training tasks, or visualizations and dashboards that consume transaction data.

Event data inspection and fraud detection/prevention

Not all data comes into your system as a stream. However, in event-driven architectures, you can still follow a similar approach.

This architecture uses Step Functions to inspect Amazon EventBridge events in real time and to detect and prevent fraud using Amazon Fraud Detector. It doesn’t stop processing of the potentially fraudulent transaction; rather, it flags the transaction for an additional review. We publish enriched transactions to an event bus that differs from the one that raw event data is published to. This way, consumers of the data can be sure that all events include fraud detection results as metadata. The consumers can then inspect the metadata and apply their own rules based on it. For example, in an event-driven ecommerce application, a consumer can choose not to process the order if the transaction is predicted to be fraudulent. This architecture pattern can also be useful for detecting and preventing fraud in new account creation or during account profile changes (like changing your address, phone number, or credit card on file in your account profile). The following diagram illustrates the solution architecture.

Event data inspection and fraud detection/prevention architecture diagram

The flow of the process in this implementation is as follows:

  1. We publish the financial transactions to an EventBridge event bus. The source of the data could be a system that generates these transactions—for example, ecommerce or banking.
  2. The EventBridge rule starts the Step Functions workflow.
  3. The Step Functions workflow receives the transaction and processes it with the following steps:
    1. Call the Amazon Fraud Detector API using the GetEventPrediction action. The API returns one of three results: approve, block, or investigate.
    2. Update transaction data by adding fraud detection results.
    3. If the transaction fraud prediction result is block or investigate, send a notification using Amazon SNS for further investigation.
    4. Publish the updated transaction to the EventBridge bus for enriched data.
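The last two enrichment steps could look roughly like the following, which builds a PutEvents entry for the enriched bus from the original event. It is a sketch under assumptions: the metadata field name fraud_prediction and the helper name are hypothetical, and the workflow in the sample repository may publish the event differently.

```python
import json


def build_enriched_entry(event, outcome, enriched_bus_name):
    """Build a PutEvents entry that carries the original detail plus the outcome."""
    # Copy the detail so the incoming event is left unchanged
    detail = dict(event["detail"])
    detail["fraud_prediction"] = outcome  # hypothetical metadata field name
    return {
        "Source": event["source"],
        "DetailType": event["detail-type"],
        "Detail": json.dumps(detail),
        "EventBusName": enriched_bus_name,
    }


# Hypothetical wiring with boto3 (not executed here):
#   events = boto3.client("events")
#   events.put_events(Entries=[build_enriched_entry(event, outcome, "enriched-bus")])
```

Publishing to a bus other than the source bus also avoids the enriched event matching the rule that started the workflow.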

As in the Kinesis Data Firehose data enrichment method, this architecture doesn’t prevent fraudulent data from reaching the next step. It adds fraud detection metadata to the original event and sends notifications about potentially fraudulent transactions. It may be that consumers of the enriched data don’t include business logic that uses fraud detection metadata in their decisions. In that case, you can change the Step Functions workflow so that it doesn’t publish such transactions to the destination bus and instead routes them to a separate event bus, where a separate application for processing suspicious transactions consumes them.
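The alternative routing described here comes down to choosing the destination bus from the prediction outcome. The following sketch shows that decision in Python for illustration; the function name and bus names are hypothetical, and treating every outcome other than approve as suspicious is an assumption.

```python
def choose_destination_bus(outcome, enriched_bus_name, suspicious_bus_name):
    """Pick the event bus for a scored transaction."""
    if outcome == "approve":
        return enriched_bus_name
    # "block", "investigate", or anything unexpected is kept away from
    # consumers that don't check fraud metadata
    return suspicious_bus_name
```

With this routing, consumers of the enriched bus only ever see approved transactions.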

Implementation

For each of the architectures described in this post, you can find AWS Serverless Application Model (AWS SAM) templates, along with deployment and testing instructions, in the sample repository.

Conclusion

This post walked through different methods to implement a real-time fraud detection and prevention solution using AWS machine learning services and serverless architectures. These solutions allow you to detect fraud close to the time it occurs and act on it as quickly as possible. The flexibility of the implementation using Step Functions allows you to react in the way that is most appropriate for the situation and to adjust prevention steps with minimal code changes.

For more serverless learning resources, visit Serverless Land.


About the Authors

Veda Raman is a Senior Specialist Solutions Architect for machine learning based in Maryland. Veda works with customers to help them architect efficient, secure, and scalable machine learning applications. Veda is interested in helping customers leverage serverless technologies for machine learning.

Giedrius Praspaliauskas is a Senior Specialist Solutions Architect for serverless based in California. Giedrius works with customers to help them leverage serverless services to build scalable, fault-tolerant, high-performing, cost-effective applications.
