LightGBM Model for Detecting Fraud in Online Financial Transactions
DOI:
https://doi.org/10.54097/xw0bng93
Keywords:
Fraud detection, one-hot encoding, violin plot, LightGBM, softmax loss function.
Abstract
With the rapid development of online finance and the significant threats that accompany it, fraud in online payment transactions has drawn increasing attention. Because online payment data is large and diverse and fraud detection is a binary classification task, this study applies the machine learning algorithm LightGBM to detect fraudulent activity. The data is first refined through exploration and preprocessing. Violin plots are used to compare the distribution of each variable across fraudulent and non-fraudulent transactions, and one insignificant variable is removed. The processed data is then fed into the LightGBM model, which achieves a fraud prediction accuracy of 99.5%, indicating highly precise predictions. The study also verifies the algorithm's robustness and generalization ability. Overall, the findings have significant implications for addressing fraud in the rapidly evolving landscape of online financial transactions.
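Two of the techniques named in the keywords, one-hot encoding and the softmax loss, can be illustrated with a minimal standard-library sketch. The transaction types and raw scores below are hypothetical, since the abstract does not list the dataset's columns or the model's outputs, and this is not the authors' implementation. For two classes, the softmax loss is equivalent to the logistic loss applied to the difference between the two scores.

```python
import math

def one_hot(value, categories):
    """Return a 0/1 indicator list with a single 1 at the position of `value`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]

def softmax(scores):
    """Numerically stable softmax: subtract the max score before exponentiating."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_loss(scores, true_class):
    """Cross-entropy of the softmax probabilities against the true class index."""
    return -math.log(softmax(scores)[true_class])

# Hypothetical transaction types (assumption: not taken from the paper).
TYPES = ["CASH_OUT", "PAYMENT", "TRANSFER"]
encoded = one_hot("TRANSFER", TYPES)

# Hypothetical raw scores for the classes [non-fraud, fraud].
probs = softmax([0.0, 0.0])
loss = softmax_loss([0.0, 0.0], true_class=1)
```

With equal scores the model is maximally uncertain, so each class receives probability 0.5 and the loss equals ln 2. The loss shrinks as the score of the true class grows relative to the other.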
Copyright (c) 2024 Highlights in Science, Engineering and Technology

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







