Vol. 1 No. 2 (2021): African Journal of Artificial Intelligence and Sustainable Development

Leveraging Supervised Machine Learning Algorithms for Enhanced Anomaly Detection in Anti-Money Laundering (AML) Transaction Monitoring Systems: A Comparative Analysis of Performance and Explainability

Rajiv Avacharmal
AI & Model Risk Manager, Independent Researcher, USA

Published 15-10-2021

Keywords

  • Anti-Money Laundering (AML),
  • Transaction Monitoring System (TMS),
  • Supervised Machine Learning,
  • Anomaly Detection,
  • Support Vector Machines (SVMs),
  • Random Forests (RFs),
  • Gradient Boosting Machines (GBMs),
  • Feature Engineering,
  • Model Explainability,
  • LIME,
  • SHAP

How to Cite

[1]
R. Avacharmal, “Leveraging Supervised Machine Learning Algorithms for Enhanced Anomaly Detection in Anti-Money Laundering (AML) Transaction Monitoring Systems: A Comparative Analysis of Performance and Explainability”, African J. of Artificial Int. and Sust. Dev., vol. 1, no. 2, pp. 68–85, Oct. 2021, Accessed: Sep. 18, 2024. [Online]. Available: https://africansciencegroup.com/index.php/AJAISD/article/view/103

Abstract

The evolving landscape of financial crime requires continual refinement of Anti-Money Laundering (AML) compliance frameworks. Transaction monitoring systems (TMS) play a pivotal role in identifying suspicious activity indicative of money laundering schemes. Traditional rule-based TMS, while effective at identifying well-defined patterns, struggle to adapt to novel laundering techniques. Machine Learning (ML) offers a compelling alternative: it can learn intricate relationships within large datasets and flag transactions that deviate from established patterns. This research examines the use of supervised ML algorithms to enhance anomaly detection in AML transaction monitoring.

The paper opens with a review of the contemporary AML regulatory landscape, highlighting the rising pressure on financial institutions (FIs) to implement robust AML compliance programs. This section emphasizes the limitations of rule-based TMS, including their static nature, susceptibility to false positives, and inability to detect evolving laundering typologies.

Next, the paper explores the theoretical underpinnings of supervised ML and its potential application in the AML domain, explaining key concepts such as classification algorithms, feature engineering, and model training and validation. It then presents a comparative analysis of prominent supervised ML algorithms suitable for AML transaction monitoring, examining the strengths and weaknesses of Support Vector Machines (SVMs), Random Forests (RFs), and Gradient Boosting Machines (GBMs) in the context of anomaly detection. The algorithms are evaluated on accuracy, generalizability, interpretability, and computational efficiency.
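A comparison of this kind can be sketched as follows. This is a minimal illustration, not the paper's experimental setup: it assumes scikit-learn is available, uses a synthetic imbalanced dataset from `make_classification` as a stand-in for labeled transactions, and the hyperparameters and the helper name `compare` are arbitrary choices made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for labeled transactions (assumption: roughly 5% suspicious).
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[0.95, 0.05], random_state=0,
)
# Stratify so both splits keep the same class imbalance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

models = {
    # SVMs are sensitive to feature scale, so standardize first.
    "SVM": make_pipeline(StandardScaler(), SVC(class_weight="balanced", random_state=0)),
    "RF": RandomForestClassifier(n_estimators=100, class_weight="balanced", random_state=0),
    "GBM": GradientBoostingClassifier(random_state=0),
}


def compare(models, X_train, y_train, X_test, y_test):
    """Fit each model and return its test-set AUC keyed by model name."""
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # Use class probabilities when available, otherwise the decision margin.
        if hasattr(model, "predict_proba"):
            scores = model.predict_proba(X_test)[:, 1]
        else:
            scores = model.decision_function(X_test)
        results[name] = roc_auc_score(y_test, scores)
    return results


auc_by_model = compare(models, X_train, y_train, X_test, y_test)
for name, auc in auc_by_model.items():
    print(f"{name}: AUC = {auc:.3f}")
```

On real transaction data the ranking of the three models may differ, and a single train/test split is only a starting point; cross-validation would give more stable estimates.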

Data quality and feature engineering are crucial to implementing ML-based AML solutions. The paper discusses how careful selection and preparation of transaction data improve model performance. It explores feature engineering techniques for constructing informative features from raw transaction data, covering customer profiling, transaction characteristics (amount, frequency, destination), and network analysis.
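As a rough illustration of the feature categories named above, the sketch below derives a few per-customer features from raw transactions using only the standard library. The transaction records, field names, and the particular features are hypothetical choices for the example, not the feature set used in the paper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw transactions; the field names are assumptions for illustration.
transactions = [
    {"customer": "C1", "amount": 120.0, "destination": "A"},
    {"customer": "C1", "amount": 80.0, "destination": "A"},
    {"customer": "C1", "amount": 9500.0, "destination": "B"},
    {"customer": "C2", "amount": 40.0, "destination": "A"},
    {"customer": "C2", "amount": 60.0, "destination": "C"},
]


def build_customer_features(transactions):
    """Aggregate raw transactions into one feature dict per customer."""
    by_customer = defaultdict(list)
    for tx in transactions:
        by_customer[tx["customer"]].append(tx)

    features = {}
    for customer, txs in by_customer.items():
        amounts = [tx["amount"] for tx in txs]
        avg = mean(amounts)
        features[customer] = {
            # Frequency: how many transactions the customer made.
            "tx_count": len(txs),
            # Customer profile: typical transaction size.
            "mean_amount": avg,
            # Transaction characteristic: largest amount relative to the norm.
            "max_to_mean_ratio": max(amounts) / avg,
            # Simple network feature: number of distinct counterparties.
            "distinct_destinations": len({tx["destination"] for tx in txs}),
        }
    return features


features = build_customer_features(transactions)
for customer, row in features.items():
    print(customer, row)
```

A production pipeline would typically compute such aggregates over rolling time windows and add richer graph features, but the shape of the transformation is the same: raw events in, one feature row per entity out.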

Furthermore, the paper addresses the critical issue of model explainability in AML settings. While ML models excel at pattern recognition, their "black-box" nature can hinder regulatory scrutiny and human oversight. The paper discusses post-hoc explanation techniques such as LIME (Local Interpretable Model-Agnostic Explanations) and SHAP (SHapley Additive exPlanations), which attribute an individual prediction to the input features that drove it, facilitating human review and fostering trust in the system.
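To make the idea behind SHAP concrete, the sketch below computes exact Shapley values for a toy risk score by averaging each feature's marginal contribution over all feature orderings. It does not use the SHAP library, which relies on faster approximations; the scoring function, feature names, and baseline values are hypothetical and chosen only to illustrate the attribution.

```python
from itertools import permutations


def shapley_values(model, instance, baseline):
    """Exact Shapley values: average marginal contribution over all orderings.

    Features not yet "switched on" take their baseline value. The cost grows
    factorially with the number of features, so this is only for tiny examples.
    """
    names = list(instance)
    contrib = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        prev = model(current)
        for name in order:
            current[name] = instance[name]
            new = model(current)
            contrib[name] += new - prev
            prev = new
    return {name: total / len(orderings) for name, total in contrib.items()}


def toy_risk_score(x):
    # Hypothetical scoring function with an amount/cross-border interaction.
    score = 0.002 * x["amount"] + 0.5 * x["new_beneficiary"]
    score += 0.001 * x["amount"] * x["cross_border"]
    return score


baseline = {"amount": 100.0, "new_beneficiary": 0, "cross_border": 0}
flagged = {"amount": 1000.0, "new_beneficiary": 1, "cross_border": 1}

phi = shapley_values(toy_risk_score, flagged, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the flagged transaction's score and the baseline score, which is the property that lets a reviewer read them as a breakdown of why this transaction scored higher than a typical one.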

The research methodology section details the design and execution of a comparative analysis assessing the performance of these supervised ML algorithms on a real-world AML transaction dataset. It describes the dataset selection process, the pre-processing techniques, and the evaluation metrics employed, and outlines the model training and validation protocols used to ensure robust and generalizable results.

The subsequent section presents the empirical findings of the comparative analysis. The performance of each ML algorithm is evaluated on key metrics such as accuracy, precision, recall, F1-score, and Area Under the ROC Curve (AUC). The trade-off between accuracy and interpretability is analyzed, highlighting the importance of selecting the algorithm best suited to the specific requirements of the FI.
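The metrics listed above can be computed from first principles as in the following sketch. The labels, scores, and the 0.5 decision threshold are made-up values for illustration and are not results from the paper; AUC is computed as the fraction of (suspicious, legitimate) pairs that the scores rank correctly, counting ties as half.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = suspicious)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and model scores: 3 suspicious and 7 legitimate transactions.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
metrics["auc"] = roc_auc(y_true, scores)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because suspicious transactions are rare, accuracy alone can look high even for a model that misses most of them, which is why precision, recall, F1, and AUC are reported alongside it.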

The paper concludes with a discussion of the research findings, limitations, and future research directions. Key insights from the comparative analysis are presented, emphasizing the efficacy of supervised ML algorithms for enhancing anomaly detection in AML transaction monitoring. The limitations of the research, such as the inherent difficulty of obtaining high-quality real-world AML data, are acknowledged. Finally, the paper outlines promising avenues for future research, including the integration of unsupervised learning techniques and deep learning architectures into AML transaction monitoring.

