Coronary Heart Disease Prediction from Common Risk Factors

Authors

  • Jianing Song

DOI:

https://doi.org/10.54097/bbazcv71

Keywords:

Logistic regression; K-nearest neighbor; prediction model; coronary heart disease.

Abstract

Coronary heart disease (CHD) is the most prevalent type of heart disease and one of the most common chronic diseases leading to death. This study analyzed data collected from coronary heart disease patients and developed a predictive model for coronary heart disease diagnosis. The dataset, obtained from the Kaggle website, contains 462 observations, one target variable and 9 common risk factors: systolic blood pressure (sbp), yearly tobacco use, low density lipoprotein (ldl), body adiposity index (BAI), family history, type A personality score, body mass index (BMI), daily alcohol use and age. First, data preprocessing indicates that there are no missing values or extreme distributions, and that the correlation and collinearity between the variables are weak. The study then develops two prediction models, using k-nearest neighbor (KNN) and logistic regression. To choose the more suitable one, the kappa value and accuracy serve as the main indicators of model performance. The results show that the logistic regression model has the higher accuracy, 0.7174. After dropping variables that have little impact on the final prediction, the outputs indicate that yearly tobacco use, low density lipoprotein, family history and BMI are the main factors that may influence the diagnosis of CHD. Finally, more clinical patient data still need to be collected to improve the model’s accuracy.
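The two indicators named in the abstract, accuracy and the kappa value, can be illustrated with a short sketch. This assumes that "kappa" means Cohen's kappa, computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name `accuracy_and_kappa` and the labels below are hypothetical and made up for illustration; they are not the study's code or data.

```python
from collections import Counter


def accuracy_and_kappa(y_true, y_pred):
    """Return (accuracy, Cohen's kappa) for two equal-length label sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(y_true)

    # Observed agreement p_o: share of cases where prediction equals truth.
    # For a classifier this is the same quantity as accuracy.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Chance agreement p_e: for each class, the product of its marginal
    # frequency among the true labels and among the predicted labels.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    if p_e == 1.0:
        # Every label falls in a single class, so kappa is undefined.
        return p_o, float("nan")

    kappa = (p_o - p_e) / (1.0 - p_e)
    return p_o, kappa


# Made-up labels (1 = CHD, 0 = no CHD), NOT taken from the study's dataset.
y_true = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0, 0, 0]

acc, kappa = accuracy_and_kappa(y_true, y_pred)
print(f"accuracy={acc:.3f}, kappa={kappa:.3f}")
```

Kappa discounts the agreement a classifier would reach by chance from the class frequencies alone, which is why it is a useful companion to accuracy when the two outcome classes are not equally common.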

Published

18-06-2024

How to Cite

Song, J. (2024). Coronary Heart Disease Prediction from Common Risk Factors. Highlights in Science, Engineering and Technology, 99, 21-27. https://doi.org/10.54097/bbazcv71