The Investigation of Prediction for Stroke Using Multiple Machine Learning Models

Jingrui Wen

doi:10.54097/hpwbp429

Keywords:

Stroke, medical prediction, machine learning.

Abstract

This study aims to predict stroke occurrence for individual patients. Exploratory data analysis revealed marked differences between the distributions of stroke and non-stroke cases and highlighted how various health and lifestyle factors influence stroke risk. The work illustrates the potential of machine learning in medical prediction, both to help patients assess their risk and to help medical practitioners plan treatment. Two models were used, RandomForest and DecisionTree, and their performance was evaluated with the confusion matrix, the Receiver Operating Characteristic (ROC) curve, and the Precision-Recall curve. Certain features contain missing values, which underscores the difficulty that data gaps pose for medical prediction and the need for effective methods of handling them. In the experiments, RandomForest achieved an Area Under the Curve (AUC) of 95% and DecisionTree an AUC of 92%, indicating strong predictive capability. Future work may focus on refining the models, achieving a better balance between stroke and non-stroke cases, and expanding the dataset to improve predictive performance.
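The evaluation metrics named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the author's implementation. The function names, the median-imputation strategy for missing values, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration; none of the numbers come from the study.

```python
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed values.

    Median imputation is an assumed strategy; the abstract does not say
    how missing data were handled.
    """
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def confusion_matrix(y_true, y_score, threshold=0.5):
    """Return (tn, fp, fn, tp) for binary labels at a score threshold."""
    tn = fp = fn = tp = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def precision_recall(y_true, y_score, threshold=0.5):
    """Return one (precision, recall) point of the Precision-Recall curve."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_score, threshold)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def roc_auc(y_true, y_score):
    """Area under the ROC curve, computed as the probability that a random
    positive case scores higher than a random negative case (ties count half).
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy, imbalanced example (made-up values): 6 non-stroke cases, 2 stroke cases.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_score = [0.10, 0.20, 0.30, 0.35, 0.60, 0.40, 0.55, 0.90]

# A hypothetical feature column with one missing entry.
feature = impute_median([28.0, None, 31.5, 24.0])

print("imputed feature:", feature)
print("confusion matrix (tn, fp, fn, tp):", confusion_matrix(y_true, y_score))
print("precision, recall:", precision_recall(y_true, y_score))
print("ROC AUC:", round(roc_auc(y_true, y_score), 4))
```

Because stroke cases are rare compared with non-stroke cases, the Precision-Recall view is a useful complement to ROC AUC: a model can reach a high AUC while still producing many false positives relative to the small number of true stroke cases.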
