Fraud Detection in Credit Risk Assessment Using Supervised Learning Algorithms
DOI:
https://doi.org/10.54097/qw9j1892
Keywords:
Credit Risk Assessment, Fraud Detection, Supervised Learning Algorithms, Data Preprocessing, Feature Engineering.
Abstract
This study systematically evaluates six supervised learning algorithms for credit risk assessment and fraud detection: Logistic Regression, Decision Tree, Support Vector Machine, Random Forest, Gradient Boosting Tree, and Neural Network. In credit risk assessment, the Gradient Boosting Tree performed best, with an accuracy of 90.5% and a ROC-AUC of 0.84, followed by Random Forest and Neural Network, with accuracies of 89.2% and 88.8% and ROC-AUCs of 0.82 and 0.81, respectively. In fraud detection, the Neural Network performed best, with an accuracy of 97.5% and a ROC-AUC of 0.88, while Gradient Boosting Tree and Random Forest achieved accuracies of 97.1% and 96.3% and ROC-AUCs of 0.87 and 0.85, respectively. Feature importance analysis indicates that repayment history, credit limit, bill amount, and repayment amount are the key features for credit risk assessment, while transaction amount, transaction time, and location are the most important for fraud detection. Data preprocessing and feature engineering played critical roles in model performance. Further hyperparameter optimization and better handling of class imbalance are expected to improve the models. Overall, ensemble learning methods and Neural Networks show clear advantages in both tasks. By combining careful data preprocessing and feature engineering with advanced machine learning algorithms, financial institutions can strengthen their risk management.
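The abstract reports both accuracy and ROC-AUC and names class imbalance as an open issue. The following is a minimal sketch of how the two metrics are computed and why they can diverge when one class is rare. It is not the study's evaluation code: the labels, scores, and the helper names `accuracy` and `roc_auc` are hypothetical, chosen only for illustration, and ROC-AUC is computed here with the pairwise (Mann-Whitney) formulation.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of labels predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 2 fraudulent transactions (label 1) among 20.
y_true = [0] * 18 + [1] * 2

# Degenerate baseline that never flags fraud: every score and prediction is 0.
baseline_pred = [0] * 20
baseline_scores = [0.0] * 20
baseline_accuracy = accuracy(y_true, baseline_pred)
baseline_auc = roc_auc(y_true, baseline_scores)

# Hypothetical model scores: one legitimate transaction gets a high score (0.8),
# and one fraudulent transaction gets a moderate score (0.6).
model_scores = [0.1] * 17 + [0.8] + [0.9, 0.6]
model_pred = [1 if s >= 0.5 else 0 for s in model_scores]  # 0.5 threshold is an assumption
model_accuracy = accuracy(y_true, model_pred)
model_auc = roc_auc(y_true, model_scores)

print(f"baseline: accuracy={baseline_accuracy:.2f}, ROC-AUC={baseline_auc:.2f}")
print(f"model:    accuracy={model_accuracy:.2f}, ROC-AUC={model_auc:.2f}")
```

In this toy setting the never-flag baseline already reaches 90% accuracy while its ROC-AUC stays at 0.5, which is one reason to read the high fraud-detection accuracies alongside their ROC-AUC values rather than in isolation.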
License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.