Predicting Vehicle Insurance Purchasing by Machine Learning Algorithms
DOI:
https://doi.org/10.54097/hbem.v21i.14542
Keywords:
Vehicle Insurance, Machine Learning, SMOTE.
Abstract
This report addresses the problem of predicting whether customers are willing to purchase vehicle insurance. To handle the class imbalance in the data, it applies SMOTE over-sampling and NearMiss under-sampling, builds eight basic or ensemble models, including Logistic Regression and Decision Tree, and compares their strengths and weaknesses using f1_score as the measure. The results show that over-sampling performs better than under-sampling and that the ensemble models perform better overall than the basic models. The best method is over-sampling combined with Adaptive Boosting or Extreme Gradient Boosting. The highest f1_score among all results is only 0.4, which means that every method considered in this report is limited in its ability to solve this problem. Even so, the imbalance-handling methods, prediction models, and ensemble algorithms discussed in the report are of high application value. The report anticipates that models and methods exist that can significantly improve prediction on this dataset.
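To make the two central ideas of the abstract concrete, the following is a minimal sketch, not the report's actual pipeline. It shows how f1_score is computed from true positives, false positives, and false negatives, and how SMOTE-style over-sampling creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The function names, the toy data, the neighbour count `k`, and the confusion counts chosen to give a score of 0.4 are all illustrative assumptions; the report's dataset and model settings are not reproduced here.

```python
import random


def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


def smote_like(minority, n_new, k=3, seed=0):
    """Simplified SMOTE-style over-sampling (illustrative, hypothetical helper).

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy imbalanced labels: 10 positives (buyers) and 90 negatives.
y_true = [1] * 10 + [0] * 90

# Predicting "no purchase" for everyone is 90% accurate but has F1 = 0.
all_negative = [0] * 100
f1_all_negative = f1_score(y_true, all_negative)

# Assumed predictions: 4 of 10 buyers found, 6 false alarms among negatives.
y_pred = [1] * 4 + [0] * 6 + [1] * 6 + [0] * 84
f1_example = f1_score(y_true, y_pred)  # precision 0.4, recall 0.4

# Over-sample a tiny two-feature minority class with six synthetic points.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_like(minority, n_new=6)
```

Under these assumed counts, an f1_score of 0.4 corresponds to a classifier that finds 40% of actual buyers while only 40% of its positive predictions are correct, which illustrates why the abstract treats this score as a limited result. In practice one would more likely rely on an established implementation, such as the `SMOTE` and `NearMiss` classes in the imbalanced-learn package, than on a hand-written sampler like this one.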
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.






