Prominent use of distal 5’ transcription start sites and discovery of a large number of additional exons in ENCODE regions

Denoeud, France; Kapranov, Philipp; Ucla, Catherine; Frankish, Adam; Castelo Valdueza, Robert; Drenkow, Jorg; Lagarde, Julien; Alioto, Tyler; Manzano, Caroline; Chrast, Jacqueline; Dike, Sujit; Wyss, Carine; Henrichsen, Charlotte N.; Holroyd, Nancy; Dickson, Mark C.; Taylor, Ruth; Hance, Zahra; Foissac, Sylvain; Myers, Richard M.; Rogers, Jane; Hubbard, Tim J.; Harrow, Jennifer; Guigó Serra, Roderic; Gingeras, Thomas R.; Antonarakis, Stylianos E.; Reymond, Alexandre 2012-05-21T09:45:08Z 2012-05-21T09:45:08Z 2007
dc.identifier.citation Denoeud F, Kapranov P, Ucla C, Frankish A, Castelo R, Drenkow J, Lagarde J, Alioto T, Manzano C, Chrast J, Dike S, Wyss C, Henrichsen CN, Holroyd N, Dickson MC, Taylor R, Hance Z, Foissac S, Myers RM, Rogers J, Hubbard T, Harrow J, Guigó R, Gingeras TR, Antonarakis SE, Reymond A. Prominent use of distal 5' transcription start sites and discovery of a large number of additional exons in ENCODE regions. Genome Res. 2007; 17(6): 746-59. DOI: 10.1101/gr.5660607
dc.identifier.issn 1088-9051
dc.description.abstract This report presents systematic empirical annotation of transcript products from 399 annotated protein-coding loci across the 1% of the human genome targeted by the Encyclopedia of DNA elements (ENCODE) pilot project using a combination of 5' rapid amplification of cDNA ends (RACE) and high-density resolution tiling arrays. We identified previously unannotated and often tissue- or cell-line-specific transcribed fragments (RACEfrags), both 5' distal to the annotated 5' terminus and internal to the annotated gene bounds for the vast majority (81.5%) of the tested genes. Half of the distal RACEfrags span large segments of genomic sequences away from the main portion of the coding transcript and often overlap with the upstream-annotated gene(s). Notably, at least 20% of the resultant novel transcripts have changes in their open reading frames (ORFs), most of them fusing ORFs of adjacent transcripts. A significant fraction of distal RACEfrags show expression levels comparable to those of known exons of the same locus, suggesting that they are not part of very minority splice forms. These results have significant implications concerning (1) our current understanding of the architecture of protein-coding genes; (2) our views on locations of regulatory regions in the genome; and (3) the interpretation of sequence polymorphisms mapping to regions hitherto considered to be "noncoding," ultimately relating to the identification of disease-related sequence alterations.
dc.language.iso eng
dc.publisher Cold Spring Harbor Laboratory Press-CSHL Press
dc.relation.ispartof Genome Res. 2007; 17(6): 746-59
dc.rights © 2007 Genome Research by Cold Spring Harbor Laboratory Press. Published version available at This document is subject to a Creative Commons license (Attribution-NonCommercial 3.0 Unported License)
dc.subject.other Transcription factors
dc.subject.other Human genome
dc.type info:eu-repo/semantics/article
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.type.version info:eu-repo/semantics/publishedVersion

