Machine Learning Based Customer Churn Prediction in Banking Sector
DOI:
https://doi.org/10.54097/7z0s8b66
Keywords:
Bank Churn, Customer Churn Prediction, Machine Learning.
Abstract
Customers in the 21st century have access to a wide range of ways to deposit money, both online and offline, which leads to constant customer churn across the banking industry. To retain existing customers, the banking sector has prioritized building models that predict which clients may exit in the future. In this paper, several machine learning models, namely XGBoost, CatBoost and LightGBM, are fitted to the churn modelling dataset from Kaggle to predict potential bank customer churn. In addition, feature selection and hyperparameter tuning methods are used to improve the final prediction results. The results generated by the different models are compared in terms of accuracy, precision, recall, etc. The results suggest that age and the number of products purchased from the bank are two factors that strongly influence the predictions. The LightGBM model shows the best overall performance and is therefore recommended for future prediction.
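To make the comparison criteria concrete, the following is a minimal sketch of how accuracy, precision and recall can be computed for a binary churn classifier. It is not the authors' code: the function name `churn_metrics`, the label encoding (1 = churned, 0 = stayed) and the eight example labels are assumptions made purely for illustration.

```python
from typing import Dict, Sequence


def churn_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision and recall for binary labels.

    Assumed encoding (hypothetical): 1 = customer churned, 0 = customer stayed.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one label is required")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Precision: share of predicted churners who actually churned.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    # Recall: share of actual churners the model caught.
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical labels for eight customers, not taken from the paper's dataset.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
metrics = churn_metrics(y_true, y_pred)
```

In churn prediction the positive class is usually the minority, so recall (how many departing customers are caught) is often read alongside accuracy rather than replaced by it.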
License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.