A crowdsourcing workflow for extracting chemical-induced disease relations from free text.
- dc.contributor.author Li, Tong Shu
- dc.contributor.author Bravo Serrano, Àlex, 1984-
- dc.contributor.author Furlong, Laura I., 1971-
- dc.contributor.author Good, Benjamin M.
- dc.contributor.author Su, Andrew I.
- dc.date.accessioned 2016-07-14T09:28:29Z
- dc.date.available 2016-07-14T09:28:29Z
- dc.date.issued 2016
- dc.description.abstract Relations between chemicals and diseases are one of the most queried biomedical interactions. Although expert manual curation is the standard method for extracting these relations from the literature, it is expensive and impractical to apply to large numbers of documents, and therefore alternative methods are required. We describe here a crowdsourcing workflow for extracting chemical-induced disease relations from free text as part of the BioCreative V Chemical Disease Relation challenge. Five non-expert workers on the CrowdFlower platform were shown each potential chemical-induced disease relation highlighted in the original source text and asked to make binary judgments about whether the text supported the relation. Worker responses were aggregated through voting, and relations receiving four or more votes were predicted as true. On the official evaluation dataset of 500 PubMed abstracts, the crowd attained a 0.505 F-score (0.475 precision, 0.540 recall), with a maximum theoretical recall of 0.751 due to errors with named entity recognition. The total crowdsourcing cost was $1290.67 ($2.58 per abstract) and took a total of 7 h. A qualitative error analysis revealed that 46.66% of sampled errors were due to task limitations and gold standard errors, indicating that performance can still be improved. All code and results are publicly available at https://github.com/SuLab/crowd_cid_relex
- dc.description.sponsorship This work was supported by grants from the National Institutes of Health (GM114833, GM089820, TR001114); the Instituto de Salud Carlos III-Fondo Europeo de Desarrollo Regional (PI13/00082 and CP10/00524 to A.B. and L.I.F.); the Innovative Medicines Initiative-Joint Undertaking (eTOX No. 115002, Open PHACTs No. 115191, EMIF No. 115372, iPiE No. 115735 to A.B. and L.I.F.), resources of which are composed of financial contributions from the European Union’s Seventh Framework Programme (FP7/2007-2013) and European Federation of Pharmaceutical Industries and Associations; and the European Union Horizon 2020 Programme 2014–2020 (MedBioinformatics No. 634143 and Elixir-Excelerate No. 676559 to A.B. and L.I.F.). The Research Programme on Biomedical Informatics (GRIB) is a node of the Spanish National Institute of Bioinformatics (INB). Open access charges will be paid by grant number TR001114 from the National Institutes of Health.
- dc.format.mimetype application/pdf
- dc.identifier.citation Li TS, Bravo À, Furlong LI, Good BM, Su AI. A crowdsourcing workflow for extracting chemical-induced disease relations from free text. Database (Oxford). 2016 Apr 17;2016:baw051. DOI: 10.1093/database/baw051
- dc.identifier.doi http://dx.doi.org/10.1093/database/baw051
- dc.identifier.issn 1758-0463
- dc.identifier.uri http://hdl.handle.net/10230/27044
- dc.language.iso eng
- dc.publisher Oxford University Press
- dc.relation.ispartof Database (Oxford). 2016 Apr 17;2016:baw051
- dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/115002
- dc.rights © The Author(s) 2016. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.rights.uri https://creativecommons.org/licenses/by/4.0/
- dc.subject.other Technology -- Innovations
- dc.subject.other Chemical products -- Health
- dc.title A crowdsourcing workflow for extracting chemical-induced disease relations from free text.
- dc.type info:eu-repo/semantics/article
- dc.type.version info:eu-repo/semantics/publishedVersion
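
The vote aggregation and the headline figures in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code (their implementation is in the linked repository). The function names, the relation identifiers, and the example votes are hypothetical, and the F-score is assumed to be the balanced F1 measure (harmonic mean of precision and recall), which the abstract does not state explicitly.

```python
def aggregate_votes(judgments, threshold=4):
    """Predict a relation as true when at least `threshold` workers voted yes.

    `judgments` maps a relation identifier to a list of binary worker votes
    (True means the worker judged that the text supports the relation).
    The default threshold of 4 (out of 5 workers) follows the abstract.
    """
    return {rel: sum(votes) >= threshold for rel, votes in judgments.items()}


def f_score(precision, recall):
    """Balanced F1 score (an assumption; the abstract only says 'F-score')."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical judgments from five workers for three candidate relations.
example_judgments = {
    "chemical_A|disease_X": [True, True, True, True, False],   # 4 votes
    "chemical_B|disease_Y": [True, True, True, False, False],  # 3 votes
    "chemical_C|disease_Z": [True, True, True, True, True],    # 5 votes
}
predictions = aggregate_votes(example_judgments)

# Figures reported in the abstract, recomputed as a consistency check.
reported_f = round(f_score(0.475, 0.540), 3)
cost_per_abstract = round(1290.67 / 500, 2)
```

Under these assumptions, the reported precision and recall reproduce the 0.505 F-score, and the total cost divided by 500 abstracts reproduces the $2.58 per-abstract figure.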