Application of Xgboost Model to Predict Bank Loan

Authors

  • Yihan Fang

DOI:

https://doi.org/10.54097/ja9pva49

Keywords:

XGBoost, Learning Vector Quantization, Recursive Feature Elimination, Bank Loan.

Abstract

Bank lending is now a high-frequency activity in the field of finance. Predicting the deposits paid by lenders can effectively reduce the risk of credit fraud and help banks avoid losses. Traditional methods such as regression-based classifiers and support vector machines have difficulty handling large data sets and achieve relatively low accuracy. In this paper, XGBoost, a novel and robust method, is applied to bank loan prediction and achieves high accuracy, which also demonstrates the applicability of the XGBoost model in the field of business analysis. The data set used in this study comes from a real bank customer database and contains 16 variables. The categorical variables are one-hot encoded so that the model can process them. Feature selection then proceeds in two steps: Learning Vector Quantization (LVQ) is first used to filter out variables with low importance, and Recursive Feature Elimination (RFE) is then used to further select 11 variables. These variables are used to build an XGBoost model that predicts the bank loan deposit. After hyperparameter optimization, the model achieves a final accuracy of 80.62%, indicating strong performance. In the future, the model could be used in bank loan decision systems to determine the deposits paid by lenders.
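
The one-hot encoding step mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's code: the `one_hot_encode` helper, the column names, and the customer records below are hypothetical and chosen only for illustration, and the sketch uses plain Python rather than whatever tooling the study relied on.

```python
def one_hot_encode(rows, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    # Collect the distinct categories; sorting gives a stable column order.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the categorical one being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        # Add one indicator per category: 1 for the row's category, else 0.
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Hypothetical customer records (assumed for illustration, not from the paper).
customers = [
    {"age": 34, "marital": "married"},
    {"age": 51, "marital": "single"},
    {"age": 27, "marital": "divorced"},
]

encoded_customers = one_hot_encode(customers, "marital")
for row in encoded_customers:
    print(row)
```

Each record ends up with exactly one indicator set to 1, so a tree-based model such as XGBoost can split on category membership without treating the categories as ordered numbers.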

Published

28-10-2024

How to Cite

Fang, Y. (2024). Application of Xgboost Model to Predict Bank Loan. Highlights in Science, Engineering and Technology, 115, 397-403. https://doi.org/10.54097/ja9pva49