The Prediction of Influenza Using the Hybrid ARIMA-LSTM Model

Authors

  • Yi Zou

DOI:

https://doi.org/10.54097/vgfxm178

Keywords:

Time Series, ARIMA, Artificial Neural Network, LSTM, Influenza, Prediction

Abstract

Time series analysis plays an important role in many fields. Autoregressive Integrated Moving Average (ARIMA) and Long Short-Term Memory (LSTM) methods are among the most common tools for forecasting sequential and time series data. However, each has clear weaknesses. The ARIMA model cannot capture non-linear structure in a sequence. The LSTM network is good at learning dynamic variations, but it is prone to overfitting and requires a large amount of long-term data. In this study, a hybrid method combining the ARIMA and LSTM algorithms was used to forecast the number of influenza cases in China, leveraging the strengths of both. First, the ARIMA model was used to capture the linear relationships within the time series. Then, the residuals of the ARIMA model were used as input to train the LSTM network. The resulting hybrid ARIMA-LSTM model combined a SARIMA(0,1,0)(1,0,0)52 model with an LSTM model trained for 50 epochs with a batch size of 32. This model addressed the issues described above and improved prediction accuracy, reducing RMSE by 4.6%, MSE by 8.9%, and MAE by 13.9%. In addition, the hybrid model was found to be less demanding of data than the individual LSTM model. Because the dataset contained relatively few observations, the individual LSTM model performed poorly, whereas the hybrid model mitigated this problem and produced more precise predictions. Although the hybrid model predicted better, it still carries a risk of overfitting the data. Future work will aim to reduce this risk by adding more variables and modifying the structure of the LSTM model. Applying the method to other fields would further demonstrate its feasibility and could provide more effective prediction models.
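The two-stage idea in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the study's implementation, and several parts are assumptions made for illustration. The seasonal coefficient of the SARIMA(0,1,0)(1,0,0)52 structure is estimated by plain least squares rather than by a statistics package. The LSTM stage is replaced by a trivial stand-in (the mean of recent residuals). The final forecast is assumed to be the sum of the linear forecast and the residual forecast, which the abstract does not spell out. All function names and the synthetic series are hypothetical.

```python
import math

SEASON = 52  # weekly data with a yearly cycle, as in SARIMA(0,1,0)(1,0,0)52


def fit_seasonal_ar(y, season=SEASON):
    """Least-squares estimate of Phi in (1 - Phi*B^season)(1 - B) y_t = e_t."""
    d = [y[t] - y[t - 1] for t in range(1, len(y))]  # first difference (d = 1)
    num = sum(d[i] * d[i - season] for i in range(season, len(d)))
    den = sum(d[i - season] ** 2 for i in range(season, len(d)))
    return num / den


def linear_forecast(y, t, phi, season=SEASON):
    """One-step-ahead linear forecast of y[t] using only values before t."""
    return y[t - 1] + phi * (y[t - season] - y[t - season - 1])


def residual_forecast(past_residuals, window=4):
    """ASSUMPTION: stand-in for the LSTM stage (mean of the latest residuals)."""
    recent = past_residuals[-window:]
    return sum(recent) / len(recent) if recent else 0.0


def hybrid_forecasts(y, phi, start, season=SEASON):
    """Walk forward; from `start` on, record linear and hybrid forecasts."""
    residuals, linear, hybrid = [], [], []
    for t in range(season + 1, len(y)):
        lin = linear_forecast(y, t, phi, season)
        if t >= start:
            linear.append(lin)
            hybrid.append(lin + residual_forecast(residuals))
        residuals.append(y[t] - lin)  # known only after y[t] is observed
    return linear, hybrid


def mse(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    return math.sqrt(mse(actual, predicted))


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def make_series(phi, offset, n_years=4, season=SEASON):
    """Synthetic series (not the influenza data): seasonal AR differences
    plus a constant `offset` that the linear stage cannot explain."""
    d = [10 * math.sin(2 * math.pi * i / season) for i in range(season)]
    for i in range(season, n_years * season):
        d.append(phi * d[i - season] + offset)
    y = [100.0]
    for step in d:
        y.append(y[-1] + step)
    return y


y = make_series(phi=0.6, offset=3.0)
start = len(y) - SEASON  # hold out the final year for evaluation
linear, hybrid = hybrid_forecasts(y, phi=0.6, start=start)
actual = y[start:]
print("linear  RMSE/MSE/MAE:", rmse(actual, linear), mse(actual, linear), mae(actual, linear))
print("hybrid  RMSE/MSE/MAE:", rmse(actual, hybrid), mse(actual, hybrid), mae(actual, hybrid))
```

In this toy setting the structure left in the residuals is a constant offset, so even the trivial stand-in recovers it. With real data, the residual stage would be a trained model such as the LSTM described in the abstract, and the size of any improvement would depend on the data.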

Published

26-03-2025

Section

Articles

How to Cite

Zou, Y. (2025). The Prediction of Influenza Using the Hybrid ARIMA-LSTM Model. Mathematical Modeling and Algorithm Application, 4(2), 12-20. https://doi.org/10.54097/vgfxm178