Machine Learning and Statistical Methods for Predicting Survival of Patients with Heart Failure

Authors

  • Lingjie Jia

DOI:

https://doi.org/10.54097/ay420g44

Keywords:

Machine learning, Biostatistics, Survival.

Abstract

Heart failure is a form of heart disease that causes millions of deaths worldwide. This study investigates survival prediction for patients with heart failure, framed as a binary classification problem. The machine learning models considered are logistic regression, decision trees, random forests, support vector machines, and artificial neural networks. Model performance is evaluated with k-fold and stratified k-fold cross-validation. The results of both cross-validation schemes indicate that logistic regression performs best. In addition, according to the feature ranking method in the literature, survival prediction for heart failure patients relies mainly on serum creatinine, ejection fraction, and follow-up time. The conclusion is that logistic regression using only these three features (serum creatinine, ejection fraction, and follow-up time) is well suited to predicting the survival of patients with heart failure.
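
The difference between the two evaluation schemes named in the abstract can be illustrated with a minimal sketch. It is not the paper's code: the function names, the toy outcome vector, and the round-robin assignment are assumptions made for illustration, and the plain k-fold split is left unshuffled to show the worst case on sorted labels.

```python
from collections import defaultdict


def k_fold_indices(n_samples, k):
    """Plain k-fold: split indices 0..n_samples-1 into k contiguous folds."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def stratified_k_fold_indices(labels, k):
    """Stratified k-fold: deal each class's indices round-robin into the folds,
    so every fold keeps roughly the overall class proportions."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    offset = 0  # carry on where the previous class stopped, to balance fold sizes
    for y in sorted(by_class):
        for j, idx in enumerate(by_class[y]):
            folds[(offset + j) % k].append(idx)
        offset += len(by_class[y])
    return [sorted(f) for f in folds]


def positive_rate(fold, labels):
    """Fraction of samples in the fold whose label is 1."""
    return sum(labels[i] for i in fold) / len(fold)


# Hypothetical toy outcome vector (1 = died, 0 = survived); NOT the real data.
labels = [0] * 14 + [1] * 6
plain = k_fold_indices(len(labels), 5)
strat = stratified_k_fold_indices(labels, 5)

print("plain k-fold      :", [positive_rate(f, labels) for f in plain])
print("stratified k-fold :", [positive_rate(f, labels) for f in strat])
```

With an imbalanced binary outcome, some plain folds may contain no positive cases at all, whereas the stratified folds each hold a share of both classes. This is one reason stratified cross-validation is often reported alongside plain k-fold for survival classification.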

References

D. Chicco and G. Jurman, “Machine learning can predict survival of patients with heart failure from serum creatinine and ejection fraction alone,” BMC Medical Informatics and Decision Making, 2020, 20 (1), 16.

“Heart failure clinical records,” UCI Machine Learning Repository, 2020.

V. Grgić, D. Mušić, and E. Babović, “Model for predicting heart failure using Random Forest and Logistic Regression algorithms,” in IOP Conference Series: Materials Science and Engineering, 2021, 1208, 012039.

F. S. Alotaibi, “Implementation of Machine Learning Model to Predict Heart Failure Disease,” International Journal of Advanced Computer Science and Applications, 2019, 10 (6).

D. Mpanya, T. Celik, E. Klug, and H. Ntsinjana, “Machine learning and statistical methods for predicting mortality in heart failure,” Heart Failure Reviews, 2021, 26 (3), 545 – 552.

Z. Zhang, “Model building strategy for logistic regression: purposeful selection,” Annals of translational medicine, 2016, 4 (6).

B. Charbuty and A. Abdulazeez, “Classification based on decision tree algorithm for machine learning,” Journal of Applied Science and Technology Trends, 2021, 2 (01), 20 – 28.

P. Probst and A.-L. Boulesteix, “To tune or not to tune the number of trees in random forest,” The Journal of Machine Learning Research, 2017, 18 (1), 6673 – 6690.

A.-L. Boulesteix, S. Janitza, J. Kruppa, and I. R. König, “Overview of random forest methodology and practical guidance with emphasis on computational biology and bioinformatics,” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2012, 2 (6), 493 – 507.

T. Hastie, R. Tibshirani, and J. H. Friedman, The elements of statistical learning: data mining, inference, and prediction. Springer, 2009, 2.

D. Berrar et al., “Cross-validation.” 2019.

R. C. Prati, “Combining feature ranking algorithms through rank aggregation,” in the 2012 International Joint Conference on Neural Networks (IJCNN). Brisbane, Australia: IEEE, 2012, 1 – 8.

Published

10-04-2024

How to Cite

Jia, L. (2024). Machine Learning and Statistical Methods for Predicting Survival of Patients with Heart Failure. Highlights in Science, Engineering and Technology, 92, 369-375. https://doi.org/10.54097/ay420g44