Depression Analysis Based on Machine Learning and Model Visualization

Authors

  • Xurui Zhang

DOI:

https://doi.org/10.54097/azex8k71

Keywords:

Depression analysis, machine learning, mental health diagnosis.

Abstract

Depression is among the most prevalent mental health conditions worldwide. Traditional diagnostic methods rely on clinical interviews and rating scales, which are subjective and prone to bias. This study uses a large dataset of 27,901 student samples to compare three machine learning algorithms: Random Forest (RF), Support Vector Machine (SVM), and K-Nearest Neighbor (K-NN). Before the models were trained, the data were preprocessed by excluding outliers with the interquartile range (IQR) method, encoding features, and applying Z-score normalization. The experimental results show that all three models performed well, with SVM achieving the highest accuracy, followed by RF and then K-NN. A detailed analysis of the confusion matrices indicates that SVM performs well in recognizing the positive class, while RF excels in recognizing the negative class. The results indicate that machine learning models can capture the nonlinear relationships in depression data and handle complex interactions among factors, which traditional linear regression methods do poorly. This study offers new insight for automated depression screening and contributes to the development of objective diagnostic tools in mental health care.
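
The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the author's code: the function names, the conventional 1.5 multiplier for the IQR fences, the use of the population standard deviation, and the sample values are all assumptions made for illustration. The sketch shows IQR-based outlier exclusion, Z-score normalization, and the two confusion-matrix rates that correspond to positive-class and negative-class recognition.

```python
import statistics


def iqr_filter(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR]; k=1.5 is an assumed default."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def z_scores(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant feature carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 is the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity measures positive-class recognition, specificity negative-class."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical feature column with one implausible entry (95).
    ages = [18, 19, 20, 21, 22, 23, 24, 95]
    cleaned = iqr_filter(ages)
    print(cleaned)
    print(z_scores(cleaned))

    # Hypothetical labels and predictions, not results from the paper.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(sensitivity_specificity(y_true, y_pred))
```

Under this reading, a model that "performs well in positive class recognition" would show a higher sensitivity, and one that "excels in negative class recognition" a higher specificity, on the same test set.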

Published

29-01-2026

Issue

Section

Articles

How to Cite

Zhang, X. (2026). Depression Analysis Based on Machine Learning and Model Visualization. Academic Journal of Science and Technology, 19(2), 345-350. https://doi.org/10.54097/azex8k71