Cross-lingual semantic annotation of biomedical literature: experiments in Spanish and English


  • dc.contributor.author Perez, Naiara
  • dc.contributor.author Accuosto, Pablo
  • dc.contributor.author Bravo Serrano, Àlex, 1984-
  • dc.contributor.author Cuadros Oller, Montse
  • dc.contributor.author Martínez-Garcia, Eva
  • dc.contributor.author Saggion, Horacio
  • dc.contributor.author Rigau Claramunt, German
  • dc.date.accessioned 2019-12-10T09:38:20Z
  • dc.date.available 2019-12-10T09:38:20Z
  • dc.date.issued 2019
  • dc.description.abstract Motivation: Biomedical literature is one of the most relevant sources of information for knowledge mining in the field of Bioinformatics. Although English is the most widely addressed language in the field, in recent years there has been growing interest from the natural language processing community in dealing with languages other than English. However, the availability of language resources and tools for the appropriate treatment of non-English texts is lagging behind. Our research is concerned with the semantic annotation of biomedical texts in Spanish, which can be considered an under-resourced language where biomedical text processing is concerned. Results: We have carried out experiments to assess the effectiveness of several methods for the automatic annotation of biomedical texts in Spanish. One approach is based on the linguistic analysis of Spanish texts and their annotation using an information retrieval and concept disambiguation approach. A second method takes advantage of a Spanish–English machine translation process to annotate English documents and transfer the annotations back to Spanish. A third method combines both procedures. Our evaluation shows that a combined system has competitive advantages over the two individual procedures. Availability and implementation: UMLSmapper (https://snlt.vicomtech.org/umlsmapper) and the annotation transfer tool (http://scientmin.taln.upf.edu/anntransfer/) are freely available for research purposes as web services and/or demos.
  • dc.description.sponsorship Our work is partly supported by the Spanish Ministry of Economy and Competitiveness under the Maria de Maeztu Units of Excellence Programme (MDM-2015-0502) and the projects CROSSTEXT (TIN2015-72646-EXP, MINECO/FEDER, UE) and DeepReading (RTI2018-096846-B-C21 MCIU/AEI/FEDER, UE).
  • dc.format.mimetype application/pdf
  • dc.identifier.citation Perez N, Accuosto P, Bravo A, Cuadros M, Martínez-Garcia E, Saggion H, Rigau G. Cross-lingual semantic annotation of biomedical literature: experiments in Spanish and English. Bioinformatics. 2019 Nov 15;36(6):1872-80. DOI: 10.1093/bioinformatics/btz853
  • dc.identifier.doi http://dx.doi.org/10.1093/bioinformatics/btz853
  • dc.identifier.issn 1367-4803
  • dc.identifier.uri http://hdl.handle.net/10230/43132
  • dc.language.iso eng
  • dc.publisher Oxford University Press
  • dc.relation.ispartof Bioinformatics. 2019 Nov 15;36(6):1872-80.
  • dc.relation.projectID info:eu-repo/grantAgreement/ES/1PE/TIN2015-72646
  • dc.relation.projectID info:eu-repo/grantAgreement/ES/2PE/RTI2018-096846-B-C21
  • dc.rights © Oxford University Press. This is a pre-copyedited, author-produced version of an article accepted for publication in Bioinformatics following peer review. The version of record Perez N, Accuosto P, Bravo A, Cuadros M, Martínez-Garcia E, Saggion H, Rigau G. Cross-lingual semantic annotation of biomedical literature: experiments in Spanish and English. Bioinformatics. 2019 Nov 15. is available online at: https://doi.org/10.1093/bioinformatics/btz853
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.title Cross-lingual semantic annotation of biomedical literature: experiments in Spanish and English
  • dc.type info:eu-repo/semantics/article
  • dc.type.version info:eu-repo/semantics/acceptedVersion