EGASP: the human ENCODE Genome Annotation Assessment Project

  • dc.contributor.author Guigó Serra, Roderic
  • dc.contributor.author Flicek, Paul
  • dc.contributor.author Abril Ferrando, Josep Francesc
  • dc.contributor.author Reymond, Alexandre
  • dc.contributor.author Lagarde, Julien
  • dc.contributor.author Denoeud, France
  • dc.contributor.author Antonarakis, Stylianos E.
  • dc.contributor.author Ashburner, Michael
  • dc.contributor.author Bajic, Vladimir B.
  • dc.contributor.author Birney, Ewan
  • dc.contributor.author Castelo Valdueza, Robert
  • dc.contributor.author Eyras Jiménez, Eduardo
  • dc.contributor.author Ucla, Catherine
  • dc.contributor.author Gingeras, Thomas R.
  • dc.contributor.author Harrow, Jennifer
  • dc.contributor.author Hubbard, Tim J.
  • dc.contributor.author Lewis, Suzanna E.
  • dc.contributor.author Reese, Martin G.
  • dc.date.accessioned 2011-11-28T11:07:59Z
  • dc.date.available 2011-11-28T11:07:59Z
  • dc.date.issued 2006
  • dc.description.abstract Background: We present the results of EGASP, a community experiment to assess the state-of-the-art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence. The experiment had two major goals: the assessment of the accuracy of computational methods to predict protein coding genes; and the overall assessment of the completeness of the current human genome annotations as represented in the ENCODE regions. For the computational prediction assessment, eighteen groups contributed gene predictions. We evaluated these submissions against each other based on a ‘reference set’ of annotations generated as part of the GENCODE project. These annotations were not available to the prediction groups prior to the submission deadline, so that their predictions were blind and an external advisory committee could perform a fair assessment. Results: The best methods had at least one gene transcript correctly predicted for close to 70% of the annotated genes. Nevertheless, the multiple transcript accuracy, taking into account alternative splicing, reached only approximately 40% to 50% accuracy. At the coding nucleotide level, the best programs reached an accuracy of 90% in both sensitivity and specificity. Programs relying on mRNA and protein sequences were the most accurate in reproducing the manually curated annotations. Experimental validation shows that only a very small percentage (3.2%) of the selected 221 computationally predicted exons outside of the existing annotation could be verified. Conclusions: This is the first such experiment in human DNA, and we have followed the standards established in a similar experiment, GASP1, in Drosophila melanogaster. We believe the results presented here contribute to the value of ongoing large-scale annotation projects and should guide further experimental methods when being scaled up to the entire human genome sequence.
  • dc.description.sponsorship RG is supported by grants from the NHGRI ENCODE Project, the European Biosapiens Project, and from the Spanish Ministry of Education and Science. PF is supported by EMBL. AR acknowledges the Swiss National Science Foundation for financial support. MR is partially supported by the NHGRI.
  • dc.format.mimetype application/pdf
  • dc.identifier.citation Guigó R, Flicek P, Abril J F, Reymond A, Lagarde J, Denoeud F, Antonarakis S, Ashburner M, Bajic V B, Birney E, Castelo R, Eyras E, Ucla C, Gingeras T R, Harrow J, Hubbard T, Lewis S E, Reese M G. EGASP: the human ENCODE Genome Annotation Assessment Project. Genome Biology. 2006;7 Suppl 1:S2. DOI: 10.1186/gb-2006-7-s1-s2
  • dc.identifier.doi http://dx.doi.org/10.1186/gb-2006-7-s1-s2
  • dc.identifier.issn 1465-6906
  • dc.identifier.uri http://hdl.handle.net/10230/13146
  • dc.language.iso eng
  • dc.publisher BioMed Central
  • dc.relation.ispartof Guigó R, Reese M G, editors. EGASP '05: ENCODE Genome Annotation Assessment Project. Genome Biology. 2006;7 Suppl 1.
  • dc.rights © 2006 BioMed Central Ltd. The electronic version of this article is the complete one and can be found online at http://genomebiology.com/2006/7/S1/S2
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.subject.keyword ENCODE GASP
  • dc.subject.keyword Alternative Splicing
  • dc.subject.keyword Animals
  • dc.subject.keyword Computational Biology
  • dc.subject.keyword Genetic Databases
  • dc.subject.keyword Genes
  • dc.subject.keyword Human Genome
  • dc.subject.keyword Genomics
  • dc.subject.keyword Humans
  • dc.subject.keyword Mice
  • dc.subject.keyword RNA Sequence Analysis
  • dc.subject.keyword DNA
  • dc.subject.other Bioinformatics
  • dc.subject.other Genomes
  • dc.subject.other Molecular biology
  • dc.title EGASP: the human ENCODE Genome Annotation Assessment Project
  • dc.type info:eu-repo/semantics/article
  • dc.type.version info:eu-repo/semantics/publishedVersion