By Joana Lorenz, Maria Inês Silva, David Aparício, João Tiago Ascensão, Pedro Bizarro
Feedzai
Abstract
Every year, criminals launder billions of dollars acquired from serious felonies (e.g., terrorism, drug smuggling, or human trafficking),
harming countless people and economies. Cryptocurrencies, in
particular, have developed as a haven for money laundering activity. Machine Learning can be used to detect these illicit patterns.
However, labels are so scarce that traditional supervised algorithms
are inapplicable. Here, we address money laundering detection
assuming minimal access to labels. First, we show that existing
state-of-the-art solutions using unsupervised anomaly detection
methods are inadequate to detect the illicit patterns in a real Bitcoin transaction dataset.
In the financial sector, Anti-Money Laundering (AML) efforts often rely on rule-based systems. However, vulnerabilities derive
from the relative simplicity of publicly available rule-sets, leading
to high false-positive rates (FPR) and low detection rates. Machine learning (ML) techniques overcome the rigidity of rule-based
systems by inferring complex patterns from historical data, and can
potentially increase detection rates and decrease FPRs.
How to detect money laundering in a dataset with few labels
- Detecting money laundering cases in the Bitcoin network without any labels is impossible since illicit transactions hide within clusters of licit behavior.
- With just a few labels (approximately 5% of the total), one can match the results of a supervised baseline by using Active Learning (AL). This setting mimics a real-world scenario with limited availability of human analysts for manual labeling.
The experiments use the Bitcoin dataset released by Elliptic, a company dedicated to detecting financial crime in cryptocurrencies. It includes 49 graphs sampled from the Bitcoin blockchain at different
sequential moments in time (time-steps), as presented in Figure 1.
Each graph is a directed acyclic graph, starting from one transaction,
and including subsequent related transactions on the blockchain,
containing approximately two weeks of data.
Bitcoin transactions are transfers from one Bitcoin address (e.g.,
a person or company) to another. Each transaction is represented as a node in the graph.
Each transaction consumes the output of past transactions and generates outputs that can be spent by future transactions. The edges
in the graph represent the flow of Bitcoins between transactions.
The dataset consists of 203,769 transactions, of which 21% are
labeled as licit, and 2% as illicit, based on the category of the bitcoin
address that created the transaction. The remaining transactions
are unlabeled. Illicit categories include scams, malware, terrorist organizations, and Ponzi schemes. Licit categories include exchanges,
wallet providers, miners, and licit services. Each transaction has 166
features, 94 of which represent information about the transaction
itself. The remaining features were constructed by Weber et al. using information one-hop backward/forward from the transaction,
such as the minimum, maximum, and standard deviation of each
transaction feature. All features, except for the time-step, are fully
anonymized and standardized with zero mean and unit variance.
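To make the one-hop aggregated features concrete, here is a minimal sketch on a toy graph. The transaction names, the single local feature, the use of the population standard deviation, and the zero default for isolated nodes are all assumptions for illustration; the source does not specify how the real features were computed beyond the description above.

```python
from statistics import pstdev

# Hypothetical toy graph: an edge (a, b) means transaction b spends an output of a.
edges = [("t1", "t3"), ("t2", "t3"), ("t3", "t4"), ("t3", "t5")]

# One hypothetical local feature per transaction (e.g. a standardized amount).
local_feature = {"t1": 0.5, "t2": -1.0, "t3": 0.2, "t4": 1.5, "t5": 0.1}


def one_hop_neighbours(node, edges):
    """Return the transactions one hop backward (inputs) and forward (outputs)."""
    backward = [src for src, dst in edges if dst == node]
    forward = [dst for src, dst in edges if src == node]
    return backward + forward


def aggregate(node, edges, feature):
    """Summarize a local feature over the one-hop neighbourhood of a node."""
    values = [feature[n] for n in one_hop_neighbours(node, edges)]
    if not values:
        # Assumption: isolated transactions get zeros for every aggregate.
        return {"min": 0.0, "max": 0.0, "std": 0.0}
    return {"min": min(values), "max": max(values), "std": pstdev(values)}


agg_t3 = aggregate("t3", edges, local_feature)
print(agg_t3)
```

Repeating this for every local feature yields one block of aggregated columns per statistic, which is the general shape of the neighbourhood features described above.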
Unsupervised Learning
Anomaly detection methods are unsupervised learning techniques for detecting outliers in a dataset. The literature suggests they are effective in the AML context.
We tested seven common anomaly detection algorithms with readily available Python implementations:
- Local Outlier Factor (LOF)
- K-Nearest Neighbours (KNN)
- Principal Component Analysis (PCA)
- One-Class Support Vector Machine (OCSVM)
- Cluster-based Outlier Factor (CBLOF)
- Angle-based Outlier Detection (ABOD; see my earlier post)
- Isolation Forest (IF)
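As a rough illustration of how a distance-based detector from this list behaves, the sketch below scores each point by the distance to its k-th nearest neighbour, a common form of the KNN outlier score. It is written in plain Python rather than with the libraries used in the experiments, and the data is synthetic: the cluster shapes, the value k=5, and the names `licit`, `illicit`, and `isolated` are assumptions made for this example only.

```python
import math
import random

random.seed(0)

# Synthetic 2-D data (an assumption, not the Elliptic features):
# a broad cluster of "licit" points, a few "illicit" points hidden inside it,
# and one isolated licit point far from everything else.
licit = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
illicit = [(random.gauss(0, 0.3), random.gauss(0, 0.3)) for _ in range(5)]
isolated = [(8.0, 8.0)]
points = licit + illicit + isolated


def knn_score(i, points, k=5):
    """Anomaly score: distance from point i to its k-th nearest neighbour."""
    dists = sorted(math.dist(points[i], p) for j, p in enumerate(points) if j != i)
    return dists[k - 1]


scores = [knn_score(i, points) for i in range(len(points))]

# Rank points from most to least anomalous.
ranking = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)

top_index = ranking[0]
illicit_indices = set(range(len(licit), len(licit) + len(illicit)))
top10_illicit = illicit_indices & set(ranking[:10])

print("most anomalous point:", points[top_index])
print("illicit points among the 10 most anomalous:", len(top10_illicit))
```

In this toy setup the detector flags the isolated licit point and ranks the hidden illicit points as unremarkable, which is the failure mode described earlier: illicit transactions that sit inside clusters of licit behavior are not outliers in feature space.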
Active Learning
AL is an incremental learning approach
that interactively queries instances for labeling (e.g., by human
analysts) and uses the increasing number of labeled instances to
(re-)train a supervised model. It fits the AML context by addressing
label scarcity and has previously been successfully applied to detect
money laundering accounts based on financial transaction history.
For an extensive survey on AL, we refer the reader to Settles.
The goal of AL is to minimize the number of labels necessary to
achieve adequate classifier performance. The process starts with
a pool of unlabeled instances (the unlabeled pool), although sometimes there is a residual number of labels. At each iteration, a query
strategy queries a batch of instances for manual labeling. After
labeling, the instances go into the labeled pool. Finally, a supervised algorithm (the classifier) is trained on the labeled pool and
evaluated on a test set. If the performance is not satisfactory, the
querying process continues to enrich the labeled pool incrementally. To mimic the manual labeling process in our experiments, we
append the labels to the queried instances.
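The loop described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the setup used in the experiments: the data is synthetic, the classifier is a simple nearest-centroid model, the query strategy is uncertainty sampling (smallest margin), and `BATCH_SIZE` and `N_ITERATIONS` are arbitrary values chosen for the example.

```python
import math
import random

random.seed(0)
BATCH_SIZE = 10    # assumption: instances queried per iteration
N_ITERATIONS = 5   # assumption: fixed budget instead of a performance threshold


def make_data(n):
    """Synthetic, imbalanced two-class data: 0 ~ "licit", 1 ~ "illicit"."""
    data = []
    for _ in range(n):
        label = 1 if random.random() < 0.2 else 0
        centre = 2.0 if label else -2.0
        data.append(((random.gauss(centre, 1.0), random.gauss(centre, 1.0)), label))
    return data


def fit_centroids(labeled):
    """Nearest-centroid classifier: one mean vector per class."""
    centroids = {}
    for cls in (0, 1):
        xs = [x for x, y in labeled if y == cls]
        centroids[cls] = tuple(sum(c) / len(xs) for c in zip(*xs))
    return centroids


def predict(x, centroids):
    return min(centroids, key=lambda cls: math.dist(x, centroids[cls]))


def margin(x, centroids):
    """Small margin: both centroids are almost equally close, so the model is unsure."""
    return abs(math.dist(x, centroids[0]) - math.dist(x, centroids[1]))


def accuracy(centroids, data):
    return sum(predict(x, centroids) == y for x, y in data) / len(data)


pool = make_data(300)      # labels stay hidden until an instance is queried
test_set = make_data(200)

# Residual labels: one instance of each class seeds the labeled pool.
labeled_idx = [next(i for i, (_, y) in enumerate(pool) if y == cls) for cls in (0, 1)]
unlabeled_idx = [i for i in range(len(pool)) if i not in labeled_idx]

centroids = fit_centroids([pool[i] for i in labeled_idx])
history = []
for _ in range(N_ITERATIONS):
    # Query strategy: pick the batch the current classifier is least sure about.
    unlabeled_idx.sort(key=lambda i: margin(pool[i][0], centroids))
    batch, unlabeled_idx = unlabeled_idx[:BATCH_SIZE], unlabeled_idx[BATCH_SIZE:]
    # Manual labeling is simulated by revealing the stored label.
    labeled_idx.extend(batch)
    # Retrain on the enlarged labeled pool and evaluate on the test set.
    centroids = fit_centroids([pool[i] for i in labeled_idx])
    history.append(accuracy(centroids, test_set))

print("labels used:", len(labeled_idx))
print("test accuracy per iteration:", history)
```

In practice the loop would stop once test performance is satisfactory rather than after a fixed number of iterations, and on imbalanced data a metric such as F1 on the illicit class is usually more informative than accuracy.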
Conclusions
Results indicate that unsupervised anomaly detection methods have poor performance, and we present evidence that anomalies
in the feature-space are not indicative of illicit behaviour. This finding highlights that experiments conducted on (partially) synthetic
data can be misleading and emphasizes the importance of conducting experiments on real-life datasets to draw reliable conclusions.