Massive experimental quantification allows interpretable deep learning of protein aggregation

  • dc.contributor.author Thompson, Mike
  • dc.contributor.author Martín, Mariano
  • dc.contributor.author Sanmartín Olmo, Trinidad
  • dc.contributor.author Rajesh, Chandana
  • dc.contributor.author Koo, Peter K.
  • dc.contributor.author Bolognesi, Benedetta
  • dc.contributor.author Lehner, Ben, 1978-
  • dc.date.accessioned 2025-09-08T11:48:03Z
  • dc.date.available 2025-09-08T11:48:03Z
  • dc.date.issued 2025
  • dc.description.abstract Protein aggregation is a pathological hallmark of more than 50 human diseases and a major problem for biotechnology. Methods have been proposed to predict aggregation from sequence, but these have been trained and evaluated on small and biased experimental datasets. Here we directly address this data shortage by experimentally quantifying the aggregation of >100,000 protein sequences. This unprecedented dataset reveals the limited performance of existing computational methods and allows us to train CANYA, a convolution-attention hybrid neural network that accurately predicts aggregation from sequence. We adapt genomic neural network interpretability analyses to reveal CANYA's decision-making process and learned grammar. Our results illustrate the power of massive experimental analysis of random sequence-spaces and provide an interpretable and robust neural network model to predict aggregation.
  • dc.description.sponsorship This work received support from the following: “La Caixa” Foundation (ID 100010434) under grant agreement LCF/PR/HR21/52410004 (B.L.); European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme grant agreement 883742 (B.L.); AXA Research Fund AXA Chair in Risk prediction in age-related diseases (B.L.); Secretariat of Universities and Research, Ministry of Enterprise and Knowledge of the Government of Catalonia and the European Social Funds 2017 SGR 1322 (B.L.); Bettencourt Schueller Foundation (B.L.); PID2023-146685NB-I00 funded by MCIN/AEI/10.13039/501100011033/FEDER, UE; Wellcome 220540/Z/20/A, “Wellcome Sanger Institute Quinquennial Review 2021-2026” (B.L.); Spanish Ministry of Science, Innovation and Universities PID2021-127761OB-I00 (B.B.); RYC2020-028861-I funded by MCIN/AEI/10.13039/501100011033 “ERDF A way of making Europe” and “ESF Investing in your future” (B.B.); European Union (ERC Consolidator, Glam-MAP, 101125484) (B.B.); EMBO Fellowship ALTF 266-2023 (M.T.); NIH grant R01HG012131 (P.K. and C.R.); and NIH grant R01GM149921 (P.K. and C.R.).
  • dc.format.mimetype application/pdf
  • dc.identifier.citation Thompson M, Martín M, Olmo TS, Rajesh C, Koo PK, Bolognesi B, et al. Massive experimental quantification allows interpretable deep learning of protein aggregation. Sci Adv. 2025 May 2;11(18):eadt5111. DOI: 10.1126/sciadv.adt5111
  • dc.identifier.doi http://dx.doi.org/10.1126/sciadv.adt5111
  • dc.identifier.issn 2375-2548
  • dc.identifier.uri http://hdl.handle.net/10230/71148
  • dc.language.iso eng
  • dc.publisher American Association for the Advancement of Science (AAAS)
  • dc.relation.ispartof Sci Adv. 2025 May 2;11(18):eadt5111
  • dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/883742
  • dc.relation.projectID info:eu-repo/grantAgreement/ES/3PE/PID2023-146685NB-I00
  • dc.relation.projectID info:eu-repo/grantAgreement/ES/3PE/PID2021-127761OB-I00
  • dc.relation.projectID info:eu-repo/grantAgreement/EC/HE/101125484
  • dc.rights This is an open-access article distributed under the terms of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.rights.uri http://creativecommons.org/licenses/by/4.0/
  • dc.subject.other Proteins--Aggregation
  • dc.title Massive experimental quantification allows interpretable deep learning of protein aggregation
  • dc.type info:eu-repo/semantics/article
  • dc.type.version info:eu-repo/semantics/publishedVersion