The data use ontology to streamline responsible access to human biomedical datasets

dc.date.issued 2021
dc.identifier.citation Lawson J, Cabili MN, Kerry G, Boughtwood T, Thorogood A, Alper P et al. The data use ontology to streamline responsible access to human biomedical datasets. Cell Genom. 2021 Nov 10;1(2):100028. DOI: 10.1016/j.xgen.2021.100028
dc.description.abstract Human biomedical datasets that are critical for research and clinical studies to benefit human health also often contain sensitive or potentially identifying information of individual participants. Thus, care must be taken when they are processed and made available to comply with ethical and regulatory frameworks and informed consent data conditions. To enable and streamline data access for these biomedical datasets, the Global Alliance for Genomics and Health (GA4GH) Data Use and Researcher Identities (DURI) work stream developed and approved the Data Use Ontology (DUO) standard. DUO is a hierarchical vocabulary of human- and machine-readable data use terms that consistently and unambiguously represents a dataset's allowable data uses. DUO has been implemented by major international stakeholders such as the Broad and Sanger Institutes and is currently used in annotation of over 200,000 datasets worldwide. Using DUO in data management and access facilitates researchers' discovery of, and access to, relevant datasets. DUO annotations increase the FAIRness of datasets and support data linkages using common data use profiles when integrating the data for secondary analyses. DUO is implemented in the Web Ontology Language (OWL) and, to increase community awareness and engagement, hosted in an open, centralized GitHub repository. DUO, together with the GA4GH Passport standard, offers a new, efficient, and streamlined data authorization and access framework that has enabled increased sharing of biomedical datasets worldwide.
dc.description.sponsorship G.K., M.A.F., H.P., J.D.S., and M.C. were funded by EMBL-EBI Core Funds and Wellcome Trust GA4GH award number 201535/Z/16/Z. T. T. Boughtwood was funded by NHMRC GNT111353, GNT200001, and the Australian MRFF. P.A. was funded by ELIXIR Luxembourg. S.D. and K.R. were funded by the Broad Institute. M.A.H. and M.B. received funding from NIH #5R24OD011883. M.L. and M.C. were funded by the CINECA project (H2020 No 825775). N.M. and L.Z. were funded by H3ABioNet, NIH grant number U24HG006941. S.O. and C.Y. received funding from the Japan Agency for Medical Research and Development (AMED) under grant numbers JP19kk020501 and JP18kk0205012. A.A.P. was funded by NHGRI AnVIL, award number U24HG010262. F.P. was supported, in part, by the European Union’s Horizon 2020 research and innovation program under the EJP RD COFUND-EJP #825575. M.A.S. and E.J.v.E. were funded by FAIR genomes (ZonMW #846003201) and EOSC-Life (H2020 #824087).
dc.title The data use ontology to streamline responsible access to human biomedical datasets
