The DART classification of unannotated transcription within the ENCODE regions: associating transcription with known and novel loci

Rozowsky, Joel S.; Newburger D, Sayward F, Wu J, Jordan G, Korbel JO, Nagalakshmi U, Yang J, Zheng D, Guigó R, Gingeras TR, Weissman S, Miller P, Snyder M, Gerstein MB. The DART classification of unannotated transcription within the ENCODE regions: associating transcription with known and novel loci. Genome Res. 2007; 17(6): 732-45. DOI: 10.1101/gr.5696007
To cite or link this document: http://hdl.handle.net/10230/16475
dc.contributor.author Rozowsky, Joel S.
dc.contributor.author Newburger, Daniel
dc.contributor.author Sayward, Fred
dc.contributor.author Wu, Jiaqian
dc.contributor.author Jordan, Greg
dc.contributor.author Korbel, Jan O.
dc.contributor.author Nagalakshmi, Ugrappa
dc.contributor.author Yang, Jing
dc.contributor.author Zheng, Deyou
dc.contributor.author Guigó Serra, Roderic
dc.contributor.author Gingeras, Thomas R.
dc.contributor.author Weissman, Sherman
dc.contributor.author Miller, Perry
dc.contributor.author Snyder, Michael
dc.contributor.author Gerstein, Mark B.
dc.date.accessioned 2012-05-21T09:45:04Z
dc.date.available 2012-05-21T09:45:04Z
dc.date.issued 2007
dc.identifier.citation Rozowsky, Joel S.; Newburger D, Sayward F, Wu J, Jordan G, Korbel JO, Nagalakshmi U, Yang J, Zheng D, Guigó R, Gingeras TR, Weissman S, Miller P, Snyder M, Gerstein MB. The DART classification of unannotated transcription within the ENCODE regions: associating transcription with known and novel loci. Genome Res. 2007; 17(6): 732-45. DOI: 10.1101/gr.5696007
dc.identifier.issn 1088-9051
dc.identifier.uri http://hdl.handle.net/10230/16475
dc.description.abstract For the ∼1% of the human genome in the ENCODE regions, only about half of the transcriptionally active regions (TARs) identified with tiling microarrays correspond to annotated exons. Here we categorize this large amount of “unannotated transcription.” We use a number of disparate features to classify the 6988 novel TARs—array expression profiles across cell lines and conditions, sequence composition, phylogenetic profiles (presence/absence of syntenic conservation across 17 species), and locations relative to genes. In the classification, we first filter out TARs with unusual sequence composition and those likely resulting from cross-hybridization. We then associate some of those remaining with proximal exons having correlated expression profiles. Finally, we cluster unclassified TARs into putative novel loci, based on similar expression and phylogenetic profiles. To encapsulate our classification, we construct a Database of Active Regions and Tools (DART.gersteinlab.org). DART has special facilities for rapidly handling and comparing many sets of TARs and their heterogeneous features, synchronizing across builds, and interfacing with other resources. Overall, we find that ∼14% of the novel TARs can be associated with known genes, while ∼21% can be clustered into ∼200 novel loci. We observe that TARs associated with genes are enriched in the potential to form structural RNAs and many novel TAR clusters are associated with nearby promoters. To benchmark our classification, we design a set of experiments for testing the connectivity of novel TARs. Overall, we find that 18 of the 46 connections tested validate by RT-PCR and four of five sequenced PCR products confirm connectivity unambiguously.
dc.language.iso eng
dc.publisher Cold Spring Harbor Laboratory Press-CSHL Press
dc.relation.ispartof Genome Res. 2007; 17(6): 732-45
dc.rights © 2007 Genome Research by Cold Spring Harbor Laboratory Press. Published version available at http://genome.cshlp.org. This document is subject to a Creative Commons license (Attribution-NonCommercial 3.0 Unported License)
dc.rights.uri http://creativecommons.org/licenses/by-nc/3.0/
dc.subject.other Human genetics
dc.subject.other Transcription factors
dc.subject.other Human genetics -- Data processing
dc.title The DART classification of unannotated transcription within the ENCODE regions: associating transcription with known and novel loci
dc.type info:eu-repo/semantics/article
dc.identifier.doi http://dx.doi.org/10.1101/gr.5696007
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.type.version info:eu-repo/semantics/publishedVersion
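
The abstract describes associating novel TARs with proximal exons that have correlated expression profiles. The following is a minimal sketch of that single step, not the authors' implementation. The data layout, the function names (pearson, associate_tars), the distance window (max_distance), and the correlation cutoff (min_corr) are all assumptions made for illustration; none of these values come from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # flat profile: treat as uncorrelated
    return cov / sqrt(vx * vy)


def associate_tars(tars, exons, max_distance=5000, min_corr=0.9):
    """Map each TAR id to the gene of its best-correlated nearby exon, or None.

    tars:  {tar_id: (position, expression_profile)}
    exons: list of (gene, position, expression_profile)
    max_distance and min_corr are hypothetical thresholds, not from the paper.
    """
    result = {}
    for tar_id, (pos, profile) in tars.items():
        best_gene, best_corr = None, min_corr
        for gene, exon_pos, exon_profile in exons:
            if abs(exon_pos - pos) > max_distance:
                continue  # exon is not proximal to this TAR
            r = pearson(profile, exon_profile)
            if r >= best_corr:
                best_gene, best_corr = gene, r
        result[tar_id] = best_gene
    return result


# Toy data: expression profiles across four hypothetical conditions.
tars = {
    "tar1": (1000, [1.0, 2.0, 3.0, 4.0]),   # nearby and correlated
    "tar2": (1200, [4.0, 1.0, 3.0, 2.0]),   # nearby but not correlated
    "tar3": (90000, [1.0, 2.0, 3.0, 4.0]),  # correlated but too far away
}
exons = [("GENE_A", 3000, [2.0, 4.0, 6.0, 8.0])]
associations = associate_tars(tars, exons)
```

In this toy input only tar1 is both within the window and strongly correlated with the exon, so it is the only TAR assigned to GENE_A. TARs left unassigned would, in the workflow the abstract outlines, be candidates for the later clustering into putative novel loci.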

