Optimizing Bank Customer Churn Prediction with LightGBM and Data-Driven Strategies

Authors

  • Weixuan Wu

DOI:

https://doi.org/10.54097/wm3yk853

Keywords:

Bank Customer Churn, Prediction, LightGBM.

Abstract

This study uses the LightGBM model to improve the prediction of bank customer churn. Working from datasets available on Kaggle, it applies feature engineering to handle missing values and remove uninformative data, so that the data become consistent, consolidated, and easier to analyze. Data visualization supports a more in-depth analysis: correlations and characteristics of the data are first identified manually to narrow down the relevant content and features, which are then checked more precisely. LightGBM is well suited to large datasets and outperforms traditional algorithms such as Random Forest and XGBoost in efficiency and speed. Adding new features such as age category and account balance further improves the model's prediction accuracy. In conclusion, this study takes an important step in applying machine learning to bank customer churn prediction by proposing a model that balances complexity and practicality.
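
The preprocessing described in the abstract (filling missing values, then deriving features such as an age category and an account-balance indicator) might look roughly like the sketch below. It is illustrative only: the field names (`age`, `balance`, `exited`), the sample records, the use of the median for imputation, and the age cut points are all assumptions, since the abstract does not specify them.

```python
from statistics import median

# Hypothetical customer records; field names and values are illustrative only.
customers = [
    {"age": 25, "balance": 0.0, "exited": 0},
    {"age": 47, "balance": 120000.0, "exited": 1},
    {"age": None, "balance": 56000.0, "exited": 0},
    {"age": 63, "balance": None, "exited": 1},
]


def impute_median(rows, field):
    """Replace missing (None) values of `field` with the median of the observed values."""
    observed = [r[field] for r in rows if r[field] is not None]
    fill = median(observed)
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


def age_category(age):
    """Map an age to a coarse band. The cut points are assumptions, not the paper's."""
    if age < 30:
        return "young"
    if age < 50:
        return "middle"
    return "senior"


def add_features(rows):
    """Add an age band and a non-zero-balance flag to each record."""
    return [
        {
            **r,
            "age_category": age_category(r["age"]),
            "has_balance": int(r["balance"] > 0),
        }
        for r in rows
    ]


# Impute each numeric field, then derive the new features.
rows = impute_median(customers, "age")
rows = impute_median(rows, "balance")
features = add_features(rows)

for row in features:
    print(row)
```

Records prepared this way could then be encoded numerically and passed to a gradient-boosting classifier such as `lightgbm.LGBMClassifier`. The model configuration used in the study is not given in the abstract.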

Published

17-07-2024

How to Cite

Wu, W. (2024). Optimizing Bank Customer Churn Prediction with LightGBM and Data-Driven Strategies. Highlights in Business, Economics and Management, 36, 477-483. https://doi.org/10.54097/wm3yk853