A Study of a Prediction Framework Incorporating Random Forests and Gradient Boosting Trees
DOI:
https://doi.org/10.54097/cj7d6a34
Keywords:
Random Forest Model, Regression Model, Gradient Boosting Trees Model, SHAP Model.
Abstract
This study constructs a hybrid prediction framework that integrates Random Forest regression and gradient boosting trees. First, Random Forest models, which rely on bootstrap sampling and random feature selection, are used to mine the nonlinear associations between multidimensional features in both classification and regression tasks, and their built-in feature importance analysis is used to assess feature contributions. Second, logistic regression and linear regression provide statistical explanations for binary classification problems and multivariate interactions, respectively, which enhances the interpretability of the framework. Third, a gradient boosting tree model iteratively fits residuals to improve prediction accuracy, is interpreted with SHAP value analysis, and is tuned through hyperparameter optimisation to further improve performance. Through multi-model collaboration, the framework strengthens the analysis and prediction of multivariate system data, demonstrates good stability and generalisation ability, and provides a reusable technical paradigm for similar studies.
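The iterative residual fitting behind gradient boosting can be illustrated with a minimal sketch. This is not the study's model: it assumes squared-error loss, a single input feature, and one-split regression stumps in place of full trees. The toy sine data, the function names (`fit_stump`, `fit_boosting`, `predict`), and the values `n_rounds=50` and `learning_rate=0.1` are all hypothetical choices made for illustration.

```python
import math


def fit_stump(xs, targets):
    """Least-squares regression stump: one threshold, one mean per side."""
    best = None
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct x values,
    # so both sides of every split are non-empty.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)


def fit_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Boosting for squared error: each round fits a stump to the residuals."""
    base = sum(ys) / len(ys)  # initial prediction is the mean target
    preds = [base] * len(ys)
    stumps, history = [], []
    for _ in range(n_rounds):
        # For squared error, the residuals are the negative gradient.
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        # Shrink each correction by the learning rate before adding it.
        preds = [p + learning_rate * (left_mean if x <= threshold else right_mean)
                 for p, x in zip(preds, xs)]
        history.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return {"base": base, "stumps": stumps,
            "learning_rate": learning_rate, "history": history}


def predict(model, x):
    """Sum the base value and the shrunken contribution of every stump."""
    y = model["base"]
    for threshold, left_mean, right_mean in model["stumps"]:
        y += model["learning_rate"] * (left_mean if x <= threshold else right_mean)
    return y


# Toy data (hypothetical): a noiseless sine curve on a 1-D grid.
xs = [i / 10.0 for i in range(40)]
ys = [math.sin(x) for x in xs]
model = fit_boosting(xs, ys, n_rounds=50, learning_rate=0.1)
print("training MSE after first and last round:",
      model["history"][0], model["history"][-1])
```

Each round fits the new stump to what the current ensemble still gets wrong, so the training error recorded in `history` falls as rounds accumulate. In practice a library implementation with full trees, multiple features, and tuned hyperparameters would replace this sketch.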
Copyright (c) 2025 Academic Journal of Science and Technology

This work is licensed under a Creative Commons Attribution 4.0 International License.








