A machine learning approach to identify groups of patients with hematological malignant disorders


  • dc.contributor.author Rodríguez-Belenguer, Pablo
  • dc.contributor.author Piñana, José Luis
  • dc.contributor.author Sánchez Montañés, Manuel
  • dc.contributor.author Soria Olivas, Emilio
  • dc.contributor.author Martínez Sober, Marcelino
  • dc.contributor.author Serrano López, Antonio J.
  • dc.date.accessioned 2024-10-11T06:19:32Z
  • dc.date.available 2024-10-11T06:19:32Z
  • dc.date.issued 2024
  • dc.description.abstract Background and objective: Vaccination against SARS-CoV-2 in immunocompromised patients with hematologic malignancies (HM) is crucial to reduce the severity of COVID-19. Despite vaccination efforts, over a third of HM patients remain unresponsive, increasing their risk of severe breakthrough infections. This study aims to leverage machine learning's adaptability to COVID-19 dynamics, efficiently selecting patient-specific features to enhance predictions and improve healthcare strategies. Highlighting the complex COVID-hematology connection, the focus is on interpretable machine learning to provide valuable insights to clinicians and biologists. Methods: The study evaluated a dataset with 1166 patients with hematological diseases. The output was the achievement or non-achievement of a serological response after full COVID-19 vaccination. Various machine learning methods were applied, with the best model selected based on metrics such as the Area Under the Curve (AUC), Sensitivity, Specificity, and Matthews Correlation Coefficient (MCC). Individual SHAP values were obtained for the best model, and Principal Component Analysis (PCA) was applied to these values. The patient profiles were then analyzed within identified clusters. Results: Support vector machine (SVM) emerged as the best-performing model. PCA applied to SVM-derived SHAP values resulted in four perfectly separated clusters. These clusters are characterized by the proportion of patients that generate antibodies (PPGA). Cluster 1, with the second-highest PPGA (69.91%), included patients with aggressive diseases and factors contributing to increased immunodeficiency. Cluster 2 had the lowest PPGA (33.3%), but the small sample size limited conclusive findings. Cluster 3, representing the majority of the population, exhibited a high rate of antibody generation (84.39%) and a better prognosis compared to cluster 1. Cluster 4, with a PPGA of 66.33%, included patients with B-cell non-Hodgkin's lymphoma on corticosteroid therapy. Conclusions: The methodology successfully identified four separate patient clusters using Machine Learning and Explainable AI (XAI). We then analyzed each cluster based on the percentage of HM patients who generated antibodies after COVID-19 vaccination. The study suggests the methodology's potential applicability to other diseases, highlighting the importance of interpretable ML in healthcare research and decision-making.
  • dc.description.sponsorship We thank the Spanish Hematopoietic Transplant and Cell Therapy group (GETH-TC) for sharing the data. The co-author Manuel Sánchez-Montañés has been supported by grants PID2021-122347NB-I00 and PID2021-127946OB-I00 (funded by MCIN/AEI//10.13039/501100011033 and ERDF “A way of making Europe”), and by project FACINGLCOVID-CM under grant PD2022-004-REACT-EU.
  • dc.format.mimetype application/pdf
  • dc.identifier.citation Rodríguez-Belenguer P, Piñana JL, Sánchez-Montañés M, Soria-Olivas E, Martínez-Sober M, Serrano-López AJ. A machine learning approach to identify groups of patients with hematological malignant disorders. Comput Methods Programs Biomed. 2024 Apr;246:108011. DOI: 10.1016/j.cmpb.2024.108011
  • dc.identifier.doi http://dx.doi.org/10.1016/j.cmpb.2024.108011
  • dc.identifier.issn 0169-2607
  • dc.identifier.uri http://hdl.handle.net/10230/61371
  • dc.language.iso eng
  • dc.publisher Elsevier
  • dc.relation.ispartof Comput Methods Programs Biomed. 2024 Apr;246:108011
  • dc.relation.projectID info:eu-repo/grantAgreement/ES/3PE/PID2021-122347NB-I00
  • dc.relation.projectID info:eu-repo/grantAgreement/ES/3PE/PID2021-127946OB-I00
  • dc.rights © 2024 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.rights.uri http://creativecommons.org/licenses/by-nc/4.0/
  • dc.subject.keyword COVID-19
  • dc.subject.keyword Explainable AI (XAI)
  • dc.subject.keyword Hematological disease
  • dc.subject.keyword High risk groups identification
  • dc.subject.keyword Machine learning
  • dc.subject.keyword SARS-CoV-2 mRNA vaccines
  • dc.subject.keyword Serological response
  • dc.title A machine learning approach to identify groups of patients with hematological malignant disorders
  • dc.type info:eu-repo/semantics/article
  • dc.type.version info:eu-repo/semantics/publishedVersion
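
The abstract names Sensitivity, Specificity and the Matthews Correlation Coefficient (MCC) as model-selection metrics, and characterizes each cluster by its proportion of patients that generate antibodies (PPGA). The sketch below shows how these quantities could be computed from binary labels and cluster assignments. It is an illustration only, not the authors' code: the function names `binary_metrics` and `ppga_by_cluster` are hypothetical, the label coding (1 = serological response, 0 = no response) is an assumption, and the toy data at the end is made up and unrelated to the study's 1166 patients.

```python
from collections import defaultdict
from math import sqrt


def binary_metrics(y_true, y_pred):
    """Return sensitivity, specificity and MCC for binary labels.

    Assumption: 1 means the patient achieved a serological response,
    0 means no response.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard against empty classes to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"sensitivity": sensitivity, "specificity": specificity, "mcc": mcc}


def ppga_by_cluster(cluster_ids, responded):
    """Return the percentage of responders (PPGA) within each cluster."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for cluster, response in zip(cluster_ids, responded):
        totals[cluster] += 1
        positives[cluster] += response
    return {c: 100.0 * positives[c] / totals[c] for c in totals}


# Made-up toy data, for illustration only.
toy_true = [1, 1, 1, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 1]
toy_clusters = [1, 1, 1, 2, 2, 2]

metrics = binary_metrics(toy_true, toy_pred)
ppga = ppga_by_cluster(toy_clusters, toy_true)
```

In the study itself the cluster assignments come from PCA applied to SHAP values of the selected SVM model; here they are simply supplied as a list, since the abstract does not give enough detail to reproduce that step.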