Scalable Semi-Supervised Graph Learning Techniques for Anti Money Laundering

Money laundering is the process by which criminals move large sums of illicit money to hidden locations and integrate them as legal funds through existing financial services. The United Nations (UN) estimates that 2 to 5% of global GDP, which is approximately 0.8 to 2.0 trillion dollars,...

Bibliographic Details
Main Authors: Md. Rezaul Karim, Felix Hermsen, Sisay Adugna Chala, Paola De Perthuis, Avikarsha Mandal
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Subjects: Anti-money laundering; machine learning on graphs; graph embedding
Online Access: https://ieeexplore.ieee.org/document/10486886/
_version_ 1797203816096989184
author Md. Rezaul Karim
Felix Hermsen
Sisay Adugna Chala
Paola De Perthuis
Avikarsha Mandal
author_sort Md. Rezaul Karim
collection DOAJ
description Money laundering is the process by which criminals move large sums of illicit money to hidden locations and integrate them as legal funds through existing financial services. The United Nations (UN) estimates that 2 to 5% of global GDP, approximately 0.8 to 2.0 trillion dollars, is laundered every year. Accurately identifying such activities is therefore crucial for enforcing anti-money laundering (AML) measures. Numerous techniques have been proposed to detect money laundering in transaction graphs of money transfers between bank accounts by analysing the structural and behavioural dynamics of their dense subgraphs. However, these techniques often do not consider that money laundering usually involves high-volume flows of funds through chains of bank accounts. Moreover, most AML approaches either achieve low detection accuracy or incur high computational costs, making them less reliable and less suitable for real financial systems. Consequently, only a fraction of money laundering activities can be detected and prevented. In this paper, we propose an efficient approach to AML that applies semi-supervised graph learning techniques to a large-scale financial transaction graph to identify nodes involved in potential money laundering activities. We consider two settings: a pipeline setting, in which graph embedding models are first trained to generate node embeddings that are combined with additional topological graph features to train binary classifiers, and an end-to-end setting, in which node classification is performed by training SkipGCN, FastGCN, and EvolveGCN without separate classifiers. We evaluate our approach on four datasets (AMLSim, Elliptic, IBM AML, and SynthAML) with a view to scalability and practicality for real financial systems. Further, we provide local explanations (e.g., how money is laundered between nodes) and global explanations (e.g., what factors contribute to money laundering) of the predictions, highlighting the predominant factors in money laundering cases and elucidating the mechanisms of illicit fund transfers between nodes, to enhance the interpretability and transparency of the AML models. Experimental results suggest that our approach is scalable and effective at detecting money laundering in real and synthetic transaction graphs.
first_indexed 2024-04-24T08:25:20Z
format Article
id doaj.art-f6eef6d2d897412fa9e92e17c02b7455
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-04-24T08:25:20Z
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-f6eef6d2d897412fa9e92e17c02b7455; 2024-04-16T23:00:11Z; eng; IEEE; IEEE Access; 2169-3536; 2024-01-01; vol. 12, pp. 50012-50029; 10.1109/ACCESS.2024.3383784; 10486886
Scalable Semi-Supervised Graph Learning Techniques for Anti Money Laundering
Md. Rezaul Karim (Information Systems and Databases, RWTH Aachen University, Aachen, Germany)
Felix Hermsen (Information Systems and Databases, RWTH Aachen University, Aachen, Germany)
Sisay Adugna Chala (Information Systems and Databases, RWTH Aachen University, Aachen, Germany)
Paola De Perthuis (École Normale Supérieure (ENS), Paris, France)
Avikarsha Mandal (Department of Data Science and Artificial Intelligence, Fraunhofer FIT, Sankt Augustin, Germany)
https://ieeexplore.ieee.org/document/10486886/
Anti-money laundering; machine learning on graphs; graph embedding
title Scalable Semi-Supervised Graph Learning Techniques for Anti Money Laundering
topic Anti-money laundering
machine learning on graphs
graph embedding
url https://ieeexplore.ieee.org/document/10486886/
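
Illustration of the pipeline setting described in the abstract (node embeddings combined with topological graph features, then a binary classifier trained on the labelled nodes only). This is a minimal sketch under stated assumptions, not the authors' implementation: the toy graph, the labels, and all function names are hypothetical, the neighbour-averaging embedding is a simple stand-in for a trained graph embedding model, and plain logistic regression stands in for the binary classifier.

```python
import math
import random

# Hypothetical toy transaction graph: directed edges (sender, receiver).
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4),            # a chain of transfers
         (5, 6), (6, 5), (7, 5), (7, 6), (8, 7)]    # ordinary traffic
NUM_NODES = 9
# Semi-supervised setting: labels are known for some nodes only
# (1 = suspicious, 0 = normal). These labels are made up.
LABELS = {0: 1, 2: 1, 4: 1, 5: 0, 7: 0}


def embed(edges, n, dim=4, rounds=2, seed=0):
    """Stand-in embedding: random vectors averaged over each node's neighbourhood."""
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    emb = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]
    for _ in range(rounds):
        emb = [
            [sum(col) / (len(adj[i]) + 1)
             for col in zip(emb[i], *(emb[j] for j in sorted(adj[i])))]
            for i in range(n)
        ]
    return emb


def topo_features(edges, n):
    """Topological features per node: [in-degree, out-degree]."""
    feats = [[0.0, 0.0] for _ in range(n)]
    for u, v in edges:
        feats[u][1] += 1.0
        feats[v][0] += 1.0
    return feats


def predict(w, b, x):
    """Logistic-regression probability that a node is suspicious."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(w, b, X, y):
    """Mean binary cross-entropy, with probabilities clamped away from 0 and 1."""
    total = 0.0
    for x, t in zip(X, y):
        p = min(max(predict(w, b, x), 1e-12), 1.0 - 1e-12)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(X)


def train(X, y, lr=0.1, epochs=200):
    """Full-batch gradient descent on the logistic loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for x, t in zip(X, y):
            err = predict(w, b, x) - t
            gw = [g + err * xi for g, xi in zip(gw, x)]
            gb += err
        w = [wi - lr * g / len(X) for wi, g in zip(w, gw)]
        b -= lr * gb / len(X)
    return w, b


# Step 1: embeddings; step 2: concatenate with topological features.
features = [e + t for e, t in zip(embed(EDGES, NUM_NODES),
                                  topo_features(EDGES, NUM_NODES))]

# Step 3: train the classifier on labelled nodes only.
train_ids = sorted(LABELS)
X = [features[i] for i in train_ids]
y = [LABELS[i] for i in train_ids]
initial_loss = log_loss([0.0] * len(X[0]), 0.0, X, y)
w, b = train(X, y)
final_loss = log_loss(w, b, X, y)

# Step 4: score the unlabelled nodes.
scores = {i: predict(w, b, features[i])
          for i in range(NUM_NODES) if i not in LABELS}
for node, p in sorted(scores.items()):
    print(f"node {node}: suspicious probability {p:.2f}")
```

In the end-to-end setting named in the abstract, the embedding and classification steps would instead be learned jointly by a single graph neural network, so steps 1 to 3 above would collapse into one training loop.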