Stock Return Prediction Based on Random Forest
DOI: https://doi.org/10.54097/jxyat874
Keywords: Random Forest, Quantitative Investment, Return Prediction, Machine Learning, Feature Engineering
Abstract
As machine learning is applied more widely in the financial sector, using algorithms to capture non-linear market patterns has become an important direction in quantitative investment. This study uses Random Forest, an ensemble learning algorithm, to predict the short-term returns of representative stocks across multiple markets. Daily trading data for the past five years were obtained from Yahoo Finance (yfinance) and AkShare for stocks such as Kweichow Moutai, Contemporary Amperex Technology (CATL), and Apple Inc. (AAPL). Feature variables comprising technical indicators and volatility factors were constructed, and the return over the subsequent five trading days served as the prediction target. Data cleaning and feature engineering were performed with Python's Pandas library; the Random Forest model was built with the Scikit-learn library and evaluated using metrics such as mean squared error (MSE), mean absolute error (MAE), and Directional Accuracy. Backtesting and visual analysis indicate that the Random Forest model shows some feasibility and application potential in short-term return prediction, offering investors a data-driven, objective decision-support tool. The study also discusses the model's limitations and proposes directions for future improvement.
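The pipeline described in the abstract (features from price history, a five-trading-day forward return as the target, a Random Forest regressor, and evaluation with MSE, MAE, and Directional Accuracy) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not list the specific indicators or hyperparameters, so the three features, the 80/20 chronological split, and the model settings below are assumptions, and synthetic random-walk prices stand in for the yfinance and AkShare data so that the example runs offline.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error

HORIZON = 5  # prediction target: return over the next five trading days


def build_dataset(close: pd.Series, horizon: int = HORIZON) -> pd.DataFrame:
    """Build illustrative features and the forward-return target from closing prices."""
    daily_ret = close.pct_change()
    data = pd.DataFrame(index=close.index)
    # Hypothetical feature set (assumed): momentum, moving-average gap, rolling volatility.
    data["ret_5d"] = close.pct_change(5)
    data["ma_gap_20d"] = close / close.rolling(20).mean() - 1.0
    data["vol_20d"] = daily_ret.rolling(20).std()
    # Target: return from day t to day t + horizon.
    data["target"] = close.shift(-horizon) / close - 1.0
    return data.dropna()


def directional_accuracy(y_true, y_pred) -> float:
    """Share of samples where predicted and realized returns have the same sign."""
    return float(np.mean(np.sign(y_true) == np.sign(y_pred)))


# Synthetic random-walk prices (assumption) replace downloaded market data.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2020-01-01", periods=600)
close = pd.Series(100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, len(dates)))), index=dates)

data = build_dataset(close)
features = ["ret_5d", "ma_gap_20d", "vol_20d"]

# Chronological split. The last HORIZON training rows are dropped because
# their targets would be computed from prices inside the test period.
split = int(len(data) * 0.8)
train, test = data.iloc[: split - HORIZON], data.iloc[split:]

# Hyperparameters are illustrative assumptions, not values from the study.
model = RandomForestRegressor(n_estimators=100, min_samples_leaf=5, random_state=0)
model.fit(train[features], train["target"])
pred = model.predict(test[features])

mse = mean_squared_error(test["target"], pred)
mae = mean_absolute_error(test["target"], pred)
da = directional_accuracy(test["target"].to_numpy(), pred)
print(f"MSE={mse:.6f}  MAE={mae:.6f}  DirectionalAccuracy={da:.3f}")
```

On random-walk input the directional accuracy should hover around chance, which is a useful sanity check: a value far above 0.5 on such data would suggest look-ahead leakage in the features or the split. To try the sketch on real prices, the synthetic `close` series could be replaced with a closing-price column loaded from either data source named in the abstract.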
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

