Identifying most important predictors for suicidal thoughts and behaviours among healthcare workers active during the Spain COVID-19 pandemic: a machine-learning approach


  • dc.contributor.author Alayo, Itxaso
  • dc.contributor.author Alonso Caballero, Jordi
  • dc.contributor.author Ferrer Forés, Maria Montserrat
  • dc.contributor.author Amigo, Franco
  • dc.contributor.author Portillo-Van Diest, Ana
  • dc.contributor.author Sanz, Ferran
  • dc.contributor.author Serra, Consol
  • dc.contributor.author Pérez Solá, Victor
  • dc.contributor.author Mortier, Philippe
  • dc.contributor.author Vilagut Saiz, Gemma, 1975-
  • dc.contributor.author MINDCOVID Working group
  • dc.date.accessioned 2025-06-13T07:45:42Z
  • dc.date.available 2025-06-13T07:45:42Z
  • dc.date.issued 2025
  • dc.description.abstract Aims: Studies conducted during the COVID-19 pandemic found a high occurrence of suicidal thoughts and behaviours (STBs) among healthcare workers (HCWs). The current study aimed to (1) develop a machine learning-based prediction model for future STBs using data from a large prospective cohort of Spanish HCWs and (2) identify the most important variables in terms of contribution to the model's predictive accuracy. Methods: This is a prospective, multicentre cohort study of Spanish HCWs active during the COVID-19 pandemic. A total of 8,996 HCWs participated in the web-based baseline survey (May-July 2020) and 4,809 in the 4-month follow-up survey. A total of 219 predictor variables were derived from the baseline survey. The outcome variable was any STB at the 4-month follow-up. Variable selection was done using an L1-regularized linear Support Vector Classifier (SVC). A random forest model with 5-fold cross-validation was developed, in which two class-balancing techniques were tested: the Synthetic Minority Oversampling Technique (SMOTE) and undersampling of the majority class. The model was evaluated by the area under the Receiver Operating Characteristic curve (AUROC) and the area under the precision-recall curve. SHapley Additive exPlanations (SHAP) values were used to evaluate the overall contribution of each variable to the prediction of future STBs. Results were obtained separately by gender. Results: The prevalence of STBs in HCWs at the 4-month follow-up was 7.9% (women = 7.8%, men = 8.2%). Thirty-four variables were selected by the L1-regularized linear SVC. The best results were obtained without data balancing techniques: AUROC = 0.87 (0.86 for women and 0.87 for men) and area under the precision-recall curve = 0.50 (0.55 for women and 0.45 for men). Based on SHAP values, the most important baseline predictors for any STB at the 4-month follow-up were the presence of passive suicidal ideation, the number of days in the past 30 days with passive or active suicidal ideation, the number of days in the past 30 days with binge eating episodes, the number of panic attacks (women only) and the frequency of intrusive thoughts (men only). Conclusions: Machine learning-based prediction models for STBs in HCWs during the COVID-19 pandemic, trained on web-based survey data, show high discrimination and classification capacity. Future clinical implementations of this model could enable the early detection of HCWs at the highest risk for developing adverse mental health outcomes. Study registration: NCT04556565.
  • dc.format.mimetype application/pdf
  • dc.identifier.citation Alayo I, Pujol O, Alonso J, Ferrer M, Amigo F, Portillo-Van Diest A, et al. Identifying most important predictors for suicidal thoughts and behaviours among healthcare workers active during the Spain COVID-19 pandemic: a machine-learning approach. Epidemiol Psychiatr Sci. 2025 May 8;34:e28. DOI: 10.1017/S2045796025000198
  • dc.identifier.doi http://dx.doi.org/10.1017/S2045796025000198
  • dc.identifier.issn 2045-7960
  • dc.identifier.uri http://hdl.handle.net/10230/70676
  • dc.language.iso eng
  • dc.publisher Cambridge University Press
  • dc.relation.ispartof Epidemiol Psychiatr Sci. 2025 May 8;34:e28
  • dc.rights © The Author(s), 2025. Published by Cambridge University Press. This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.rights.uri http://creativecommons.org/licenses/by/4.0
  • dc.subject.keyword Attempted suicide
  • dc.subject.keyword Interpretability
  • dc.subject.keyword Machine learning
  • dc.subject.keyword Mental health
  • dc.subject.keyword Suicidal ideation
  • dc.title Identifying most important predictors for suicidal thoughts and behaviours among healthcare workers active during the Spain COVID-19 pandemic: a machine-learning approach
  • dc.type info:eu-repo/semantics/article
  • dc.type.version info:eu-repo/semantics/publishedVersion
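
The modelling steps named in the abstract (L1-regularized linear SVC for variable selection, a random forest evaluated with 5-fold cross-validation, AUROC and area under the precision-recall curve) can be illustrated with a minimal scikit-learn sketch. This is not the authors' code: the data are synthetic, and the sample size, class ratio, `C` value, number of trees and random seeds are all assumptions chosen only so the example runs. SMOTE, undersampling, SHAP values and the gender-stratified analysis are omitted to keep the example dependency-free.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in for survey data (assumption): 2,000 rows, 50 predictors,
# roughly 8% positive class to mimic a rare outcome.
X, y = make_classification(
    n_samples=2000, n_features=50, n_informative=8,
    weights=[0.92, 0.08], random_state=0,
)

# Selection sits inside the pipeline so it is refit on each training fold,
# which avoids leaking held-out information into the variable selection.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    # L1 penalty drives some coefficients to zero; SelectFromModel keeps the rest.
    ("select", SelectFromModel(
        LinearSVC(penalty="l1", dual=False, C=0.05, max_iter=5000))),
    ("forest", RandomForestClassifier(n_estimators=200, random_state=0)),
])

# 5-fold stratified cross-validation; collect out-of-fold predicted probabilities.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
proba = cross_val_predict(pipeline, X, y, cv=cv, method="predict_proba")[:, 1]

auroc = roc_auc_score(y, proba)
# Average precision is a common summary of the precision-recall curve.
auprc = average_precision_score(y, proba)
print(f"AUROC = {auroc:.2f}, average precision = {auprc:.2f}")
```

The printed values describe only the synthetic data and say nothing about the study's results. With a rare outcome, the area under the precision-recall curve is usually the more informative of the two metrics, because its chance-level baseline equals the outcome prevalence rather than 0.5.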