Comparative Analysis of Predicting Diabetes in Senior Age Group Based on Machine Learning Models
DOI:
https://doi.org/10.54097/22tepd66
Keywords:
Logistic regression, Random Forest, KNN.
Abstract
The study investigated risk factors for diabetes among older adults and their impact on the onset of the disease. Analyzing a Kaggle dataset of 18,100 older adults, the study found significant associations between diabetes and age, body mass index (BMI), HbA1c level, blood sugar level, high blood pressure, heart disease, and smoking history. A comparison of logistic regression, random forest, and K-nearest neighbor (KNN) prediction models showed that the random forest model performed best, with a ROC-AUC of 0.88, an out-of-bag error rate of 7.58%, a sensitivity of 99%, and a specificity of 92.3%. It outperformed both the logistic regression model (ROC-AUC 0.85) and the KNN model (ROC-AUC 0.80). The study indicated that the random forest model handles nonlinear data and multi-variable interactions well, making it suitable for predicting diabetes risk in the elderly. Future research could incorporate lifestyle factors to improve prediction accuracy.
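The evaluation metrics named in the abstract (sensitivity, specificity, ROC-AUC) can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration only and do not come from the study's dataset.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, where 1 = diabetic."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # diabetic, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # diabetic, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC-AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): true labels and model scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, y_score)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} ROC-AUC={auc:.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas ROC-AUC summarizes ranking quality across all thresholds, which is presumably why the abstract uses ROC-AUC to compare the three models.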
License
Copyright (c) 2024 Highlights in Science, Engineering and Technology

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







