Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence

dc.contributor.author Cheung, Joseph
dc.contributor.author Estivill, Xavier, 1955-
dc.contributor.author Khaja, Razi
dc.contributor.author MacDonald, Jeffrey R
dc.contributor.author Lau, Ken
dc.contributor.author Tsui, Lap-Chee
dc.contributor.author Scherer, Stephen W.
dc.date.accessioned 2014-12-18T10:46:04Z
dc.date.available 2014-12-18T10:46:04Z
dc.date.issued 2003
dc.identifier.citation Cheung J, Estivill X, Khaja R, MacDonald JR, Lau K, Tsui L et al. Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence. Genome Biology. 2003 Mar;4(4):R25. DOI: 10.1186/gb-2003-4-4-r25
dc.identifier.issn 1465-6906
dc.identifier.uri http://hdl.handle.net/10230/22998
dc.description.abstract Background: Previous studies have suggested that recent segmental duplications, which are often involved in chromosome rearrangements underlying genomic disease, account for some 5% of the human genome. We have developed rapid computational heuristics based on BLAST analysis to detect segmental duplications, as well as regions containing potential sequence misassignments in the human genome assemblies. Results: Our analysis of the June 2002 public human genome assembly revealed that 107.4 of 3,043.1 megabases (Mb) (3.53%) of sequence contained segmental duplications, each at least 5 kb in size and with at least 90% identity. We also found that 38.9 Mb (1.28%) of sequence within this assembly is likely to be involved in sequence misassignment errors. Furthermore, we have identified a significant subset (199,965 of 2,327,473, or 8.6%) of single-nucleotide polymorphisms (SNPs) in the public databases that are not true SNPs but are potential paralogous sequence variants. Conclusion: Using two distinct computational approaches, we have identified most of the sequences in the human genome that have undergone recent segmental duplications. Near-identical segmental duplications present a major challenge to the completion of the human genome sequence. Potential sequence misassignments detected in this study would require additional efforts to resolve.
dc.format.mimetype application/pdf
dc.language.iso eng
dc.publisher BioMed Central
dc.relation.ispartof Genome Biology. 2003 Mar;4(4):R25
dc.rights © 2003 Cheung et al.; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL: http://dx.doi.org/10.1186/gb-2003-4-4-r25.
dc.subject.other Human genome
dc.subject.other Diseases
dc.title Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence
dc.type info:eu-repo/semantics/article
dc.identifier.doi http://dx.doi.org/10.1186/gb-2003-4-4-r25
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.type.version info:eu-repo/semantics/publishedVersion

