Performance Analysis and Comparison of Heart Disease Prediction Models

Authors

  • Jiayi Jin
  • Chengyun Zhao

DOI:

https://doi.org/10.54097/yb7t2031

Keywords:

Heart Disease Prediction, Cardiovascular Diseases, Lasso Regression.

Abstract

Because cardiovascular diseases (CVDs) are the leading cause of death, early detection and timely precautions are important. This paper therefore studies heart disease prediction with statistical models. The study analyzes a dataset drawn from four regions: Cleveland, Hungarian, Switzerland, and Long Beach. Each region in turn serves as the testing dataset while the remaining three regions are combined into the training dataset, giving four training/testing pairs in total. The goal is to predict heart disease from a range of variables using three statistical techniques: lasso regression, random forest, and logistic regression (with feature selection). The models are compared on accuracy, confusion matrix, and ROC-AUC. The results show that the models tested on the Hungarian dataset performed best overall, while the random forest algorithm achieved the highest prediction accuracy.
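The evaluation design described in the abstract, in which each region is held out once as the test set while the other three form the training set, can be illustrated with a minimal sketch. This is not the authors' code: the function names (leave_one_region_out, confusion_matrix, accuracy) and the toy rows are hypothetical placeholders chosen for illustration, and no real patient data or model fitting is included.

```python
from typing import Dict, Iterator, List, Sequence, Tuple

# A row is (record_id, label); label 1 = heart disease, 0 = no heart disease.
# Both the row layout and the toy values below are assumptions for illustration.
Row = Tuple[str, int]


def leave_one_region_out(
    data_by_region: Dict[str, List[Row]],
) -> Iterator[Tuple[str, List[Row], List[Row]]]:
    """Yield (held_out_region, train_rows, test_rows) once per region."""
    for held_out in data_by_region:
        test_rows = list(data_by_region[held_out])
        train_rows = [
            row
            for region, rows in data_by_region.items()
            if region != held_out
            for row in rows
        ]
        yield held_out, train_rows, test_rows


def confusion_matrix(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    return tn, fp, fn, tp


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tn + tp) / (tn + fp + fn + tp)


# Toy placeholder data, NOT taken from the paper.
toy = {
    "Cleveland": [("c1", 1), ("c2", 0)],
    "Hungarian": [("h1", 0)],
    "Switzerland": [("s1", 1)],
    "Long Beach": [("l1", 1), ("l2", 0)],
}

splits = list(leave_one_region_out(toy))
for region, train_rows, test_rows in splits:
    print(f"test={region}: {len(train_rows)} train rows, {len(test_rows)} test rows")
```

In a full pipeline, each of the three models would be fitted on train_rows and scored on test_rows for every split, so that accuracy, the confusion matrix, and ROC-AUC can be compared per held-out region.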

Published

24-12-2024

How to Cite

Jin, J., & Zhao, C. (2024). Performance Analysis and Comparison of Heart Disease Prediction Models. Highlights in Science, Engineering and Technology, 123, 618-624. https://doi.org/10.54097/yb7t2031