Predicting gene disease associations with knowledge graph embeddings for diseases with curtailed information
- dc.contributor.author Gualdi, Francesco
- dc.contributor.author Oliva Miguel, Baldomero
- dc.contributor.author Piñero González, Janet, 1977-
- dc.date.accessioned 2024-06-13T06:21:51Z
- dc.date.available 2024-06-13T06:21:51Z
- dc.date.issued 2024
- dc.description.abstract Knowledge graph embeddings (KGE) are a powerful technique used in the biomedical domain to represent biological knowledge in a low-dimensional space. However, a deep understanding of these methods is still missing, particularly regarding their application to prioritizing genes associated with complex diseases for which genetic information is limited. In this contribution, we built a knowledge graph (KG) by integrating heterogeneous biomedical data and generated KGE using state-of-the-art methods and two novel algorithms: Dlemb and BioKG2vec. Extensive testing of the embeddings with unsupervised clustering and supervised methods showed that KGE can be successfully used to predict genes associated with diseases and that our novel approaches outperform most existing algorithms in both scenarios. Our findings underscore the significance of data quality, preprocessing, and integration in achieving accurate predictions. Additionally, we applied KGE to predict genes linked to Intervertebral Disc Degeneration (IDD) and showed that functions pertinent to the disease are enriched within the prioritized gene set.
- dc.description.sponsorship B.O. acknowledges support from MCIN and the AEI (DOI: 10.13039/501100011033) by grants PID2020-113203RB-I00 and ‘Unidad de Excelencia María de Maeztu’ (ref: CEX2018-000792-M)
- dc.format.mimetype application/pdf
- dc.identifier.citation Gualdi F, Oliva B, Piñero J. Predicting gene disease associations with knowledge graph embeddings for diseases with curtailed information. NAR Genom Bioinform. 2024 May 14;6(2):lqae049. DOI: 10.1093/nargab/lqae049
- dc.identifier.doi http://dx.doi.org/10.1093/nargab/lqae049
- dc.identifier.issn 2631-9268
- dc.identifier.uri http://hdl.handle.net/10230/60450
- dc.language.iso eng
- dc.publisher Oxford University Press
- dc.relation.ispartof NAR Genom Bioinform. 2024 May 14;6(2):lqae049
- dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/955735
- dc.relation.projectID info:eu-repo/grantAgreement/ES/2PE/PID2020-113203RB-I00
- dc.relation.projectID info:eu-repo/grantAgreement/ES/2PE/CEX2018-000792-M
- dc.rights © The Author(s) 2024. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.rights.uri http://creativecommons.org/licenses/by/4.0/
- dc.subject.other Congenital diseases
- dc.subject.other Medical genetics
- dc.title Predicting gene disease associations with knowledge graph embeddings for diseases with curtailed information
- dc.type info:eu-repo/semantics/article
- dc.type.version info:eu-repo/semantics/publishedVersion