Scalable Semi-Supervised Graph Learning Techniques for Anti Money Laundering

Bibliographic Details
Main Authors: Md. Rezaul Karim, Felix Hermsen, Sisay Adugna Chala, Paola De Perthuis, Avikarsha Mandal
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10486886/
Description
Summary: Money laundering is the process by which criminals move large sums of illicit money to hidden locations and integrate them as legal funds through existing financial services. The United Nations (UN) estimates that 2 to 5% of global GDP, which is approximately 0.8 to 2.0 trillion dollars, is laundered globally every year. Therefore, accurately identifying such globally alarming activities is crucial for enforcing anti-money laundering (AML) measures. Numerous techniques have been proposed to detect money laundering from transaction graphs of money transfers between bank accounts by analysing the structural and behavioural dynamics of their corresponding dense subgraphs. However, these techniques often do not consider that money laundering usually involves high-volume flows of funds through chains of bank accounts. Moreover, most AML approaches either result in lower detection accuracy or incur higher computational costs, making them less reliable and less suitable for real financial systems. Consequently, only a fraction of money laundering activities can be detected and prevented. In this paper, we propose an efficient approach to AML by employing semi-supervised graph learning techniques on a large-scale financial transactional graph in both pipeline settings (i.e., graph embedding models are first trained to generate node embeddings that are combined with additional topological graph features to train binary classifiers) and end-to-end settings (i.e., node classification is performed by training SkipGCN, FastGCN, and EvolveGCN without requiring separate classifiers) to identify nodes involved in potential money laundering activities. We evaluate our approach on four datasets: AMLSim, Elliptic, IBM AML, and SynthAML, with a view to scalability and practicality for real financial systems.
Further, we provide local (e.g., how money is laundered between nodes) and global (e.g., what factors contribute to money laundering) explanations of the predictions by highlighting the predominant factors in money laundering cases and elucidating the mechanisms of illicit fund transfers between nodes to enhance the interpretability and transparency of the AML models. Experimental results suggest that our approach is scalable and effective at detecting money laundering from real and synthetic transaction graphs.
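Illustrative sketch (not from the article): the pipeline setting described in the summary can be pictured roughly as follows. This is a minimal, standard-library-only sketch under loud assumptions: the toy graph, the account names, and the labels are hypothetical; one round of neighbour averaging stands in for a trained graph embedding model; and a nearest-centroid rule stands in for the binary classifiers. None of these choices is confirmed by the record as the authors' implementation.

```python
import math
from collections import defaultdict

# Hypothetical toy transaction graph: (sender, receiver, amount).
# a0..a3 form a high-volume chain; b0..b3 exchange small, ordinary transfers.
EDGES = [
    ("a0", "a1", 9000.0), ("a1", "a2", 8900.0), ("a2", "a3", 8800.0),
    ("b0", "b1", 120.0), ("b1", "b2", 40.0), ("b2", "b0", 75.0),
    ("b3", "b0", 60.0), ("b1", "b3", 30.0), ("b3", "a0", 50.0),
]
# Semi-supervised setting: only some nodes are labelled (1 = suspicious, 0 = normal).
LABELS = {"a0": 1, "a2": 1, "b0": 0, "b2": 0}


def topological_features(edges):
    """Per-node [degree, log(1 + total transferred volume)]."""
    degree = defaultdict(int)
    volume = defaultdict(float)
    for src, dst, amount in edges:
        for node in (src, dst):
            degree[node] += 1
            volume[node] += amount
    return {n: [float(degree[n]), math.log1p(volume[n])] for n in degree}


def neighbour_mean(edges, feats):
    """Assumed stand-in for a learned embedding: mean of neighbours' features."""
    nbrs = defaultdict(set)
    for src, dst, _ in edges:
        nbrs[src].add(dst)
        nbrs[dst].add(src)
    dim = len(next(iter(feats.values())))
    return {
        n: [sum(feats[m][k] for m in nbrs[n]) / len(nbrs[n]) for k in range(dim)]
        for n in feats
    }


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def fit_predict(vectors, labels):
    """Nearest-centroid binary classifier (assumed stand-in).

    Centroids are computed from labelled nodes only; every unlabelled
    node is then assigned the class of the closer centroid.
    """
    cents = {
        y: centroid([vectors[n] for n, lab in labels.items() if lab == y])
        for y in (0, 1)
    }

    def dist2(a, b):
        return sum((x - z) ** 2 for x, z in zip(a, b))

    return {
        n: min(cents, key=lambda y: dist2(vectors[n], cents[y]))
        for n in vectors
        if n not in labels
    }


# Pipeline: embeddings + topological features -> binary classifier.
topo = topological_features(EDGES)
emb = neighbour_mean(EDGES, topo)
combined = {n: emb[n] + topo[n] for n in topo}
predictions = fit_predict(combined, LABELS)
print(sorted(predictions.items()))
```

On this toy graph the unlabelled chain accounts a1 and a3 are flagged, while b1 and b3 are not. An end-to-end setting would instead train a single graph neural network for node classification, with no separate feature and classifier stages.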
ISSN: 2169-3536