EGASP: the human ENCODE Genome Annotation Assessment Project
Citation
- Guigó R, Flicek P, Abril J F, Reymond A, Lagarde J, Denoeud F, Antonarakis S, Ashburner M, Bajic V B, Birney E, Castelo R, Eyras E, Ucla C, Gingeras T R, Harrow J, Hubbard T, Lewis S E, Reese M G. EGASP: the human ENCODE Genome Annotation Assessment Project. Genome Biology. 2006;7 Suppl 1:S2. DOI: 10.1186/gb-2006-7-s1-s2
Description
Abstract
Background: We present the results of EGASP, a community experiment to assess the state-of-the-art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence. The experiment had two major goals: the assessment of the accuracy of computational methods to predict protein coding genes; and the overall assessment of the completeness of the current human genome annotations as represented in the ENCODE regions. For the computational prediction assessment, eighteen groups contributed gene predictions. We evaluated these submissions against each other based on a ‘reference set’ of annotations generated as part of the GENCODE project. These annotations were not available to the prediction groups prior to the submission deadline, so that their predictions were blind and an external advisory committee could perform a fair assessment.
Results: The best methods had at least one gene transcript correctly predicted for close to 70% of the annotated genes. Nevertheless, the multiple transcript accuracy, taking into account alternative splicing, reached only approximately 40% to 50% accuracy. At the coding nucleotide level, the best programs reached an accuracy of 90% in both sensitivity and specificity. Programs relying on mRNA and protein sequences were the most accurate in reproducing the manually curated annotations. Experimental validation shows that only a very small percentage (3.2%) of the selected 221 computationally predicted exons outside of the existing annotation could be verified.
Conclusions: This is the first such experiment in human DNA, and we have followed the standards established in a similar experiment, GASP1, in Drosophila melanogaster. We believe the results presented here contribute to the value of ongoing large-scale annotation projects and should guide further experimental methods when being scaled up to the entire human genome sequence.