Prediction and Feature Importance Investigation for Bank Churn Based on Machine Learning
DOI:
https://doi.org/10.54097/4jkbx429
Keywords:
Bank churn, random forest, decision tree, logistic regression.
Abstract
Because customers are one of a bank's main sources of income, preventing customer loss has always been a primary and difficult problem for banks. This paper uses three machine learning methods, random forest, decision tree and logistic regression, to predict which customers will leave. To determine more accurately which factors affect customer departure, the paper splits the data set into two groups: customers over 40 and customers under 40. The results show that random forest performs best in both groups and logistic regression performs worst. The random forest model is more accurate for the younger group than for the older group, with accuracy of about 90% and 76% respectively. The random forest method is then used to identify the most important features for each group. For customers over 40, the decision to stay with the bank and keep buying its products is strongly affected by Balance and Age: the higher the balance and the younger the customer, the more likely they are to keep purchasing. For customers under 40, the same decision depends more on Estimated Salary and Credit Score. Bank managers should therefore focus on these factors when managing customers in order to reduce churn.
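The workflow described in the abstract (split customers at age 40, fit the three classifiers on each group, compare accuracy, then read feature importances from the random forest) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' code. The data are synthetic placeholders rather than the Kaggle data set. The column names, labelling rule, hyperparameters, train/test split and the treatment of customers aged exactly 40 are all assumptions, so the printed numbers bear no relation to the 90% and 76% reported above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Feature names follow the abstract; the values generated below are synthetic.
FEATURES = ["Age", "Balance", "EstimatedSalary", "CreditScore"]


def make_synthetic_customers(n=2000, seed=0):
    """Generate random placeholder data (NOT the Kaggle data set)."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([
        rng.integers(18, 80, n),          # Age
        rng.uniform(0, 200_000, n),       # Balance
        rng.uniform(10_000, 150_000, n),  # EstimatedSalary
        rng.integers(350, 850, n),        # CreditScore
    ]).astype(float)
    # Arbitrary labelling rule plus noise, only so the models have a signal.
    score = 0.04 * (X[:, 0] - 40) - X[:, 1] / 100_000 + rng.normal(0, 1, n)
    y = (score > 0).astype(int)  # 1 = customer left
    return X, y


def evaluate_group(X, y, seed=0):
    """Fit the three classifiers on one age group.

    Returns (accuracy per model, random forest importance per feature).
    """
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    models = {
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        # Scaling is added here so logistic regression converges; an assumption.
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
    }
    accuracies = {}
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        accuracies[name] = accuracy_score(y_te, model.predict(X_te))
    importances = dict(zip(FEATURES, models["random_forest"].feature_importances_))
    return accuracies, importances


X, y = make_synthetic_customers()
age = X[:, 0]
# The threshold of 40 is from the abstract; placing age == 40 in the older
# group is an assumption.
groups = {"under_40": age < 40, "40_and_over": age >= 40}
results = {name: evaluate_group(X[mask], y[mask]) for name, mask in groups.items()}

for name, (accuracies, importances) in results.items():
    print(name, accuracies)
    print("  importance ranking:", sorted(importances, key=importances.get, reverse=True))
```

Fitting each group separately, rather than adding an age indicator to a single model, lets the random forest produce a separate importance ranking per group, which is the comparison the abstract draws between the two age groups.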
License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.