Application of Machine Learning Algorithms in Detecting Credit Card Fraud: A Comparative Analysis

Authors

  • Yiting Ren

DOI:

https://doi.org/10.54097/hbem.v21i.14753

Keywords:

Credit card fraud; machine learning; class imbalance; XG Boost.

Abstract

Credit card transactions have grown increasingly prevalent in the digital era, and so have incidents of associated fraud. Hence, identifying and preventing such fraud is critical. Machine learning algorithms are widely employed in credit card fraud detection. According to the current literature, class imbalance, that is, a large disparity between the numbers of normal and fraudulent transactions, can severely affect detection results. In this paper, a combination of imbalanced classification methods, specifically the Synthetic Minority Oversampling Technique (SMOTE) and under-sampling, is used to balance the dataset. Four popular machine learning algorithms are applied to detect fraud and are compared and analyzed: Logistic Regression, Decision Tree, Random Forest and XG Boost. The accuracy, precision, recall, F1 score and Area Under the Curve (AUC) of each algorithm are used as performance metrics. The findings indicate that, among the four models tested, XG Boost coupled with balanced data yielded the best overall results for classifying fraudulent activities.
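The balancing step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`smote_like_oversample`, `random_undersample`, `rebalance`), the toy two-feature data, and the parameters `k`, `target` and `seed` are all assumptions made for illustration. A real pipeline would more likely rely on an existing library such as imbalanced-learn. The sketch shows only the core idea: synthetic minority points are interpolated between a minority sample and one of its nearest minority neighbours, while the majority class is randomly thinned.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points in the spirit of SMOTE (illustrative only).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def random_undersample(majority, n_keep, rng=None):
    """Keep a random subset of n_keep majority samples, without replacement."""
    rng = rng or random.Random(0)
    return rng.sample(majority, n_keep)


def rebalance(majority, minority, target, k=3, seed=0):
    """Bring both classes to `target` samples (hypothetical helper)."""
    rng = random.Random(seed)
    kept = random_undersample(majority, min(target, len(majority)), rng)
    extra = smote_like_oversample(minority, max(0, target - len(minority)), k, rng)
    return kept, list(minority) + extra


# Toy data (assumed): 50 "normal" points on a grid, 4 "fraud" points far away.
majority = [(float(i % 10), float(i // 10)) for i in range(50)]
minority = [(20.0, 20.0), (21.0, 20.0), (20.0, 21.0), (21.0, 21.0)]

kept_normal, balanced_fraud = rebalance(majority, minority, target=10)
print(len(kept_normal), len(balanced_fraud))
```

In practice the resampling should be applied to the training split only, so that the test set keeps the original class ratio and metrics such as precision and recall remain meaningful.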

Published

12-12-2023

How to Cite

Ren, Y. (2023). Application of Machine Learning Algorithms in Detecting Credit Card Fraud: A Comparative Analysis. Highlights in Business, Economics and Management, 21, 733-739. https://doi.org/10.54097/hbem.v21i.14753