Structural variation in 1,019 diverse humans based on long-read sequencing
- dc.contributor.author Schloissnig, Siegfried
- dc.contributor.author Sotelo-Fonseca, Jesus Emiliano
- dc.contributor.author Moreira Pinhal, Ricardo
- dc.contributor.author Cáceres Aguilar, Mario
- dc.contributor.author Rodríguez Martín, Bernardo
- dc.contributor.author Korbel, Jan O.
- dc.date.accessioned 2025-10-09T05:48:58Z
- dc.date.available 2025-10-09T05:48:58Z
- dc.date.issued 2025
- dc.description.abstract Genomic structural variants (SVs) contribute substantially to genetic diversity and human diseases [1-4], yet remain under-characterized in population-scale cohorts [5]. Here we conducted long-read sequencing [6] in 1,019 humans to construct an intermediate-coverage resource covering 26 populations from the 1000 Genomes Project. Integrating linear and graph genome-based analyses, we uncover over 100,000 sequence-resolved biallelic SVs and we genotype 300,000 multiallelic variable number of tandem repeats [7], advancing SV characterization over short-read-based population-scale surveys [3,4]. We characterize deletions, duplications, insertions and inversions in distinct populations. Long interspersed nuclear element-1 (L1) and SINE-VNTR-Alu (SVA) retrotransposition activities mediate the transduction [8,9] of unique sequence stretches in 5' or 3', depending on source mobile element class and locus. SV breakpoint analyses point to a spectrum of homology-mediated processes contributing to SV formation and recurrent deletion events. Our open-access resource underscores the value of long-read sequencing in advancing SV characterization and enables guiding variant prioritization in patient genomes.
- dc.description.sponsorship We thank all participants of the 1000 Genomes Project, without whom this project would not have been possible. We thank M. Zody, X. Zhao and other members of the HGSVC for valuable feedback, P. Ebert for assistance with data management, and J. Charest and the Vienna BioCenter Core Facilities for assistance with long-read DNA sequencing. We thank P. Contzen for preparing DNA for long-read sequencing of rare disease samples. We thank L. Vissers and A. Hoischen for generously allowing us to use published patient data and providing oversight for their interpretation. Moreover, we acknowledge the EMBL IT services, the Centre for Information and Media Technology at Heinrich Heine University Düsseldorf, the IT services at the IMP, as well as the CRG Core Technologies Programme for providing resources for data processing and analysis. Finally, we thank members of the International Genome Sample Resource (IGSR) for assistance with providing the open data releases for this study. Funding for sequence data production was provided by the MARVL initiative, a collaboration between the IMP, BI X and Boehringer Ingelheim. Additional funding came from the following sources: National Institutes of Health (NIH) (to J.O.K., T.M., grant no. U24HG007497), the Ministry of Culture and Science of the State of North Rhine-Westphalia (to T.M., grant no. PROFILNRW-2020-107-A), the German Research Foundation (to T.M., grant no. 525152594) and the GraphGenomes project funded by the BMBF (to T.M., grant no. 031L0184A, and J.O.K., grant no. 031L0184C). W.H. received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no. 101150006. B.R.-M. was supported by a Bridging Excellence Fellowship provided by the Life Science Alliance. The project also received the support of a fellowship from the ‘la Caixa’ Foundation (ID 100010434), with fellowship code ‘LCF/BQ/DI24/12070028’ (to J.E.S.-F.). 
Additional funding came from the EU Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no. 713673. We acknowledge support of the Spanish Ministry of Science and Innovation through the Centro de Excelencia Severo Ochoa (grant no. CEX2020-001049-S, MCIN/AEI /10.13039/501100011033), and the Generalitat de Catalunya through the CERCA programme. We received support with data management from the EMBL throughout this project. We thank the DFG Research Infrastructure West German Genome Center (grant no. 407493903) as part of the Next Generation Sequencing Competence Network (project 423957469) for the LRS support.
- dc.format.mimetype application/pdf
- dc.identifier.citation Schloissnig S, Pani S, Ebler J, Hain C, Tsapalou V, Söylev A, et al. Structural variation in 1,019 diverse humans based on long-read sequencing. Nature. 2025 Aug;644(8076):442-52. DOI: 10.1038/s41586-025-09290-7
- dc.identifier.doi http://dx.doi.org/10.1038/s41586-025-09290-7
- dc.identifier.issn 0028-0836
- dc.identifier.uri http://hdl.handle.net/10230/71442
- dc.language.iso eng
- dc.publisher Nature Research
- dc.relation.ispartof Nature. 2025 Aug;644(8076):442-52
- dc.relation.projectID info:eu-repo/grantAgreement/EC/HE/101150006
- dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/713673
- dc.rights © The Author(s) 2025. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.rights.uri http://creativecommons.org/licenses/by/4.0/
- dc.subject.keyword Genome informatics
- dc.subject.keyword Genomics
- dc.subject.keyword Medical genetics
- dc.subject.keyword Structural variation
- dc.title Structural variation in 1,019 diverse humans based on long-read sequencing
- dc.type info:eu-repo/semantics/article
- dc.type.version info:eu-repo/semantics/publishedVersion