EGASP: the human ENCODE Genome Annotation Assessment Project

Guigó R, Flicek P, Abril J F, Reymond A, Lagarde J, Denoeud F, Antonarakis S, Ashburner M, Bajic V B, Birney E, Castelo R, Eyras E, Ucla C, Gingeras T R, Harrow J, Hubbard T, Lewis S E, Reese M G. EGASP: the human ENCODE Genome Annotation Assessment Project. Genome Biology. 2006; 7 Suppl 1: S2. DOI 10.1186/gb-2006-7-s1-s2
To cite or link this document: http://hdl.handle.net/10230/13146
dc.contributor.author Guigó Serra, Roderic
dc.contributor.author Flicek, Paul
dc.contributor.author Abril Ferrando, Josep Francesc
dc.contributor.author Reymond, Alexandre
dc.contributor.author Lagarde, Julien
dc.contributor.author Denoeud, France
dc.contributor.author Antonarakis, Stylianos E.
dc.contributor.author Ashburner, Michael
dc.contributor.author Bajic, Vladimir B.
dc.contributor.author Birney, Ewan
dc.contributor.author Castelo, Robert
dc.contributor.author Eyras Jiménez, Eduardo
dc.contributor.author Ucla, Catherine
dc.contributor.author Gingeras, Thomas R.
dc.contributor.author Harrow, Jennifer
dc.contributor.author Hubbard, Tim J.
dc.contributor.author Lewis, Suzanna E.
dc.contributor.author Reese, Martin G.
dc.contributor.other Universitat Pompeu Fabra
dc.date.accessioned 2011-11-28T11:07:59Z
dc.date.available 2011-11-28T11:07:59Z
dc.date.issued 2006
dc.identifier.citation Guigó R, Flicek P, Abril J F, Reymond A, Lagarde J, Denoeud F, Antonarakis S, Ashburner M, Bajic V B, Birney E, Castelo R, Eyras E, Ucla C, Gingeras T R, Harrow J, Hubbard T, Lewis S E, Reese M G. EGASP: the human ENCODE Genome Annotation Assessment Project. Genome Biology. 2006; 7 Suppl 1: S2. DOI 10.1186/gb-2006-7-s1-s2
dc.identifier.issn 1465-6906
dc.identifier.uri http://hdl.handle.net/10230/13146
dc.description.abstract Background: We present the results of EGASP, a community experiment to assess the state of the art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence. The experiment had two major goals: to assess the accuracy of computational methods for predicting protein-coding genes, and to assess the overall completeness of the current human genome annotations as represented in the ENCODE regions. For the computational prediction assessment, eighteen groups contributed gene predictions. We evaluated these submissions against each other based on a ‘reference set’ of annotations generated as part of the GENCODE project. These annotations were not available to the prediction groups prior to the submission deadline, so that their predictions were blind and an external advisory committee could perform a fair assessment. Results: The best methods had at least one gene transcript correctly predicted for close to 70% of the annotated genes. Nevertheless, the multiple-transcript accuracy, which takes alternative splicing into account, reached only approximately 40% to 50%. At the coding nucleotide level, the best programs reached an accuracy of 90% in both sensitivity and specificity. Programs relying on mRNA and protein sequences were the most accurate in reproducing the manually curated annotations. Experimental validation shows that only a very small percentage (3.2%) of the selected 221 computationally predicted exons outside of the existing annotation could be verified. Conclusions: This is the first such experiment in human DNA, and we have followed the standards established in a similar experiment, GASP1, in Drosophila melanogaster. We believe the results presented here contribute to the value of ongoing large-scale annotation projects and should guide further experimental methods when being scaled up to the entire human genome sequence.
dc.language.iso eng
dc.publisher BioMed Central
dc.relation.ispartof Guigó R, Reese M G, editors. EGASP '05: ENCODE Genome Annotation Assessment Project. Genome Biology. 2006; 7 Suppl 1.
dc.rights © 2006 BioMed Central Ltd. The electronic version of this article is the complete one and can be found online at http://genomebiology.com/2006/7/S1/S2
dc.subject.other Bioinformatics
dc.subject.other Genomes
dc.subject.other Molecular biology
dc.title EGASP: the human ENCODE Genome Annotation Assessment Project
dc.type info:eu-repo/semantics/article
dc.identifier.doi http://dx.doi.org/10.1186/gb-2006-7-s1-s2
dc.subject.keyword ENCODE GASP
dc.subject.keyword Alternative Splicing
dc.subject.keyword Animals
dc.subject.keyword Computational Biology
dc.subject.keyword Genetic Databases
dc.subject.keyword Genes
dc.subject.keyword Human Genome
dc.subject.keyword Genomics
dc.subject.keyword Humans
dc.subject.keyword Mice
dc.subject.keyword RNA Sequence Analysis
dc.subject.keyword DNA
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.type.version info:eu-repo/semantics/publishedVersion

