Site-saturation mutagenesis of 500 human protein domains
- dc.contributor.author Beltran, Antoni
- dc.contributor.author Jiang, Xiang'er
- dc.contributor.author Shen, Yue
- dc.contributor.author Lehner, Ben, 1978-
- dc.date.accessioned 2025-03-25T07:25:42Z
- dc.date.available 2025-03-25T07:25:42Z
- dc.date.issued 2025
- dc.description.abstract Missense variants that change the amino acid sequences of proteins cause one-third of human genetic diseases [1]. Tens of millions of missense variants exist in the current human population, and the vast majority of these have unknown functional consequences. Here we present a large-scale experimental analysis of human missense variants across many different proteins. Using DNA synthesis and cellular selection experiments we quantify the effect of more than 500,000 variants on the abundance of more than 500 human protein domains. This dataset reveals that 60% of pathogenic missense variants reduce protein stability. The contribution of stability to protein fitness varies across proteins and diseases and is particularly important in recessive disorders. We combine stability measurements with protein language models to annotate functional sites across proteins. Mutational effects on stability are largely conserved in homologous domains, enabling accurate stability prediction across entire protein families using energy models. Our data demonstrate the feasibility of assaying human protein variants at scale and provide a large consistent reference dataset for clinical variant interpretation and training and benchmarking of computational methods.
- dc.description.sponsorship A.B. and B.L. were funded by a European Research Council (ERC) Advanced Grant (883742), the Spanish Ministry of Science and Innovation (LCF/PR/HR21/52410004, EMBL Partnership, Severo Ochoa Centre of Excellence), the Bettencourt Schueller Foundation, the AXA Research Fund, Agència de Gestió d’Ajuts Universitaris i de Recerca (AGAUR, 2017 SGR 1322), and the CERCA Program/Generalitat de Catalunya. A.B. was funded by an EMBO fellowship (ALTF 183-2020) and by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement 101030961. X.J. and Y.S. were funded by the National Natural Science Foundation of China (32322047) and Jiangsu Provincial Department of Science and Technology (BM2023009). We thank all members of the Lehner laboratory for helpful discussions and suggestions.
- dc.format.mimetype application/pdf
- dc.identifier.citation Beltran A, Jiang X, Shen Y, Lehner B. Site-saturation mutagenesis of 500 human protein domains. Nature. 2025 Jan;637(8047):885-94. DOI: 10.1038/s41586-024-08370-4
- dc.identifier.doi http://dx.doi.org/10.1038/s41586-024-08370-4
- dc.identifier.issn 0028-0836
- dc.identifier.uri http://hdl.handle.net/10230/70003
- dc.language.iso eng
- dc.publisher Nature Research
- dc.relation.ispartof Nature. 2025 Jan;637(8047):885-94
- dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/883742
- dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/101030961
- dc.rights © The Author(s) 2025. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.rights.uri http://creativecommons.org/licenses/by/4.0/
- dc.subject.keyword Clinical genetics
- dc.subject.keyword Computational biology and bioinformatics
- dc.subject.keyword Genomics
- dc.subject.keyword High-throughput screening
- dc.subject.keyword Protein folding
- dc.title Site-saturation mutagenesis of 500 human protein domains
- dc.type info:eu-repo/semantics/article
- dc.type.version info:eu-repo/semantics/publishedVersion