Analysis And Research on Overfitting and Underfitting Issues in Heart Disease Prediction Models

Authors

  • Yuying Hu

DOI:

https://doi.org/10.54097/gzcz2793

Keywords:

Predictive Modeling; Heart Disease; Overfitting; Dimensionality Reduction.

Abstract

Research Background: Cardiovascular diseases remain the leading cause of mortality globally, which calls for predictive models that can accurately assess heart disease risk. Factors such as cholesterol levels, smoking habits, and demographic variables add complexity and variability to the modeling task, and models often overfit or underfit as a result. Both problems reduce a model's applicability to new, unseen data and limit its utility in clinical settings. The challenge lies not only in integrating diverse health indicators into a cohesive analytical framework but also in managing the trade-off between model complexity and generalizability.

Study Focus and Methodology: This study addresses overfitting and underfitting in predictive models for heart disease using a comprehensive dataset sourced from Kaggle. It compares several classifiers (logistic regression, random forest, K-nearest neighbors, and decision tree) to evaluate how each learns from and predicts on the multifaceted heart disease data. Principal component analysis (PCA) is applied to reduce dimensionality, simplifying the models without sacrificing the integrity of the information. Cross-validation is used to check that the models maintain their accuracy and generalizability when applied to new data.
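The workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the study's actual code: it assumes scikit-learn, uses synthetic data from `make_classification` as a stand-in for the Kaggle dataset (whose columns are not given here), and the number of PCA components, the fold count, and all hyperparameters are hypothetical choices. The helper name `evaluate` is likewise introduced only for this example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the heart disease data (assumption: the real
# dataset and its feature columns are not specified on this page).
X, y = make_classification(
    n_samples=600, n_features=13, n_informative=6, random_state=0
)

# The four classifier families named in the abstract; the
# hyperparameters are illustrative defaults, not the study's settings.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=5),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}


def evaluate(models, X, y, n_components=8, folds=5):
    """Return mean train/validation accuracy per model under k-fold CV."""
    results = {}
    for name, model in models.items():
        # Scaling and PCA sit inside the pipeline so they are refit on
        # each training fold and never see the held-out fold.
        pipe = make_pipeline(
            StandardScaler(), PCA(n_components=n_components), model
        )
        cv = cross_validate(
            pipe, X, y, cv=folds, scoring="accuracy", return_train_score=True
        )
        train = cv["train_score"].mean()
        val = cv["test_score"].mean()
        results[name] = {"train": train, "validation": val, "gap": train - val}
    return results


results = evaluate(models, X, y)
for name, r in results.items():
    print(f"{name}: train={r['train']:.3f} "
          f"validation={r['validation']:.3f} gap={r['gap']:.3f}")
```

A large gap between training and validation accuracy suggests overfitting, while low accuracy on both suggests underfitting. An unpruned decision tree, for instance, typically fits the training folds almost perfectly and shows the widest gap.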

Published

24-12-2024

How to Cite

Hu, Y. (2024). Analysis And Research on Overfitting and Underfitting Issues in Heart Disease Prediction Models. Highlights in Science, Engineering and Technology, 123, 116-123. https://doi.org/10.54097/gzcz2793