EGASP: the human ENCODE Genome Annotation Assessment Project

dc.contributor.author: Guigó Serra, Roderic [ca]
dc.contributor.author: Flicek, Paul [ca]
dc.contributor.author: Abril Ferrando, Josep Francesc [ca]
dc.contributor.author: Reymond, Alexandre [ca]
dc.contributor.author: Lagarde, Julien [ca]
dc.contributor.author: Denoeud, France [ca]
dc.contributor.author: Antonarakis, Stylianos E. [ca]
dc.contributor.author: Ashburner, Michael [ca]
dc.contributor.author: Bajic, Vladimir B. [ca]
dc.contributor.author: Birney, Ewan [ca]
dc.contributor.author: Castelo Valdueza, Robert [ca]
dc.contributor.author: Eyras Jiménez, Eduardo [ca]
dc.contributor.author: Ucla, Catherine [ca]
dc.contributor.author: Gingeras, Thomas R. [ca]
dc.contributor.author: Harrow, Jennifer [ca]
dc.contributor.author: Hubbard, Tim J. [ca]
dc.contributor.author: Lewis, Suzanna E. [ca]
dc.contributor.author: Reese, Martin G. [ca]
dc.date.accessioned: 2011-11-28T11:07:59Z
dc.date.available: 2011-11-28T11:07:59Z
dc.date.issued: 2006 [ca]
dc.description.abstract: Background: We present the results of EGASP, a community experiment to assess the state-of-the-art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence. The experiment had two major goals: the assessment of the accuracy of computational methods to predict protein coding genes; and the overall assessment of the completeness of the current human genome annotations as represented in the ENCODE regions. For the computational prediction assessment, eighteen groups contributed gene predictions. We evaluated these submissions against each other based on a ‘reference set’ of annotations generated as part of the GENCODE project. These annotations were not available to the prediction groups prior to the submission deadline, so that their predictions were blind and an external advisory committee could perform a fair assessment. Results: The best methods had at least one gene transcript correctly predicted for close to 70% of the annotated genes. Nevertheless, the multiple transcript accuracy, taking into account alternative splicing, reached only approximately 40% to 50% accuracy. At the coding nucleotide level, the best programs reached an accuracy of 90% in both sensitivity and specificity. Programs relying on mRNA and protein sequences were the most accurate in reproducing the manually curated annotations. Experimental validation shows that only a very small percentage (3.2%) of the selected 221 computationally predicted exons outside of the existing annotation could be verified. Conclusions: This is the first such experiment in human DNA, and we have followed the standards established in a similar experiment, GASP1, in Drosophila melanogaster. We believe the results presented here contribute to the value of ongoing large-scale annotation projects and should guide further experimental methods when being scaled up to the entire human genome sequence. [en]
dc.description.sponsorship: RG is supported by grants from the NHGRI ENCODE Project, the European Biosapiens Project, and from the Spanish Ministry of Education and Science. PF is supported by EMBL. AR acknowledges the Swiss National Science Foundation for financial support. MR is partially supported by the NHGRI.
dc.format.mimetype: application/pdf [ca]
dc.identifier.citation: Guigó R, Flicek P, Abril J F, Reymond A, Lagarde J, Denoeud F, Antonarakis S, Ashburner M, Bajic V B, Birney E, Castelo R, Eyras E, Ucla C, Gingeras T R, Harrow J, Hubbard T, Lewis S E, Reese M G. EGASP: the human ENCODE Genome Annotation Assessment Project. Genome Biology. 2006;7 Suppl 1:S2. DOI: 10.1186/gb-2006-7-s1-s2
dc.identifier.doi: http://dx.doi.org/10.1186/gb-2006-7-s1-s2
dc.identifier.issn: 1465-6906
dc.identifier.uri: http://hdl.handle.net/10230/13146
dc.language.iso: eng [ca]
dc.publisher: BioMed Central [ca]
dc.relation.ispartof: Guigó R, Reese M G, editors. EGASP '05: ENCODE Genome Annotation Assessment Project. Genome Biology. 2006;7 Suppl 1.
dc.rights: © 2006 BioMed Central Ltd. The electronic version of this article is the complete one and can be found online at http://genomebiology.com/2006/7/S1/S2 [ca]
dc.rights.accessRights: info:eu-repo/semantics/openAccess [en]
dc.subject.keyword: ENCODE GASP [en]
dc.subject.keyword: Alternative Splicing [en]
dc.subject.keyword: Animals [en]
dc.subject.keyword: Computational Biology [en]
dc.subject.keyword: Genetic Databases [en]
dc.subject.keyword: Genes [en]
dc.subject.keyword: Human Genome [en]
dc.subject.keyword: Genomics [en]
dc.subject.keyword: Humans [en]
dc.subject.keyword: Mice [en]
dc.subject.keyword: RNA Sequence Analysis [en]
dc.subject.keyword: DNA [en]
dc.subject.other: Bioinformàtica [ca]
dc.subject.other: Genomes [ca]
dc.subject.other: Biologia molecular [ca]
dc.title: EGASP: the human ENCODE Genome Annotation Assessment Project [ca]
dc.type: info:eu-repo/semantics/article [ca]
dc.type.version: info:eu-repo/semantics/publishedVersion [en]

Files

Original bundle

Name: Guigo_gb_4.pdf
Size: 2.09 MB
Format: Adobe Portable Document Format