Machine Learning Approaches to Predicting Depression in University Students: A Comparative Analysis of Logistic Regression, LASSO, and Random Forest

Authors

  • Xinyu Wang

DOI:

https://doi.org/10.54097/hadyjm88

Keywords:

Depression, Machine Learning, Risk Factors, Predictive Models, Data Visualization.

Abstract

Depression is a growing mental health concern among university students and can harm academic performance and well-being. This study analyzes data from 27,901 university students to identify key risk factors and build models that predict depression. We focus on academic pressure, financial stress, sleep duration, study satisfaction, family history of mental illness, age, gender, and grade point average. Three machine-learning methods (logistic regression, LASSO, and random forest) were applied and compared. The results show that academic pressure, financial stress, and short sleep are the strongest predictors of depression, with family history and low study satisfaction also playing important roles, while age, gender, and grades had smaller effects. All three models performed strongly, with area-under-the-curve (AUC) values around 0.86 and good calibration. Logistic regression and LASSO performed nearly as well as random forest; because they are also easier to use and explain, they are well suited to real settings. These findings point to practical steps for schools and health services: programs that lower academic pressure, provide financial help, and improve sleep habits may reduce depression risk. Early identification and timely support may help prevent serious problems. This research shows that simple machine-learning models can guide effective and affordable mental-health strategies for students.
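The abstract compares models by AUC and calibration. As a rough illustration of what those two quantities measure, the sketch below computes AUC as the share of depressed/non-depressed pairs that a model ranks correctly, and the Brier score as one simple summary of how close predicted probabilities are to observed outcomes. The function names and the toy labels and probabilities are hypothetical and are not taken from the study, which does not state how its calibration was assessed.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outranks a random negative one.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared gap between predicted probability and the 0/1 outcome (lower is better)."""
    if len(y_true) == 0:
        raise ValueError("Brier score needs at least one case")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical toy data: 1 = depressed, 0 = not depressed (not from the study).
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]

if __name__ == "__main__":
    print("AUC:", roc_auc(y_true, y_prob))
    print("Brier score:", brier_score(y_true, y_prob))
```

On this reading, an AUC of about 0.86 would mean that a randomly chosen depressed student receives a higher predicted risk than a randomly chosen non-depressed student in roughly 86% of pairs. The pairwise loop is quadratic in sample size and is meant only for illustration; a dataset of 27,901 students would call for a rank-based implementation.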

Published

10-02-2026

Section

Articles

How to Cite

Wang, X. (2026). Machine Learning Approaches to Predicting Depression in University Students: A Comparative Analysis of Logistic Regression, LASSO, and Random Forest. International Journal of Biology and Life Sciences, 13(2), 199-205. https://doi.org/10.54097/hadyjm88