On the use of direct-coupling analysis with a reduced alphabet of amino acids combined with super-secondary structure motifs for protein fold prediction

Anton, Bernat; Besalú, Mireia; Fornés Crespo, Oriol, 1983-; Bonet Martínez, Jaume, 1982-; Molina, Alexis; Molina Fernández, Rubén; Cuevas, Gemma de las; Fernández Fuentes, Narcís; Oliva Miguel, Baldomero

dc.contributor.author Anton, Bernat
dc.contributor.author Besalú, Mireia
dc.contributor.author Fornés Crespo, Oriol, 1983-
dc.contributor.author Bonet Martínez, Jaume, 1982-
dc.contributor.author Molina, Alexis
dc.contributor.author Molina Fernández, Rubén
dc.contributor.author Cuevas, Gemma de las
dc.contributor.author Fernández Fuentes, Narcís
dc.contributor.author Oliva Miguel, Baldomero
dc.date.accessioned 2021-10-26T05:58:10Z
dc.date.available 2021-10-26T05:58:10Z
dc.date.issued 2021
dc.description.abstract Direct-coupling analysis (DCA) for studying the coevolution of residues in proteins has been widely used to predict the three-dimensional structure of a protein from its sequence. We present RADI/raDIMod, a variation of the original DCA algorithm that groups chemically equivalent residues and combines them with super-secondary structure motifs to model protein structures. Interestingly, the simplification produced by grouping amino acids into only two groups (polar and non-polar) is still representative of the physicochemical nature that characterizes the protein structure, and it is in line with the role of hydrophobic forces in protein-folding funneling. As a result of the compressed alphabet, the number of sequences required for the multiple sequence alignment is reduced. The number of long-range contacts predicted is limited; therefore, our approach requires the use of neighboring sequence positions. We use the prediction of secondary structure and motifs of super-secondary structures to predict local contacts. We use RADI together with raDIMod, a fragment-based protein structure modelling approach, achieving near-native conformations when the super-secondary motifs cover >30-50% of the sequence. Interestingly, although different contacts are predicted with different alphabets, they produce similar structures.
dc.description.sponsorship Spanish Ministry of Economy MINECO [BIO2014-57518-R, BIO2017-83591-R (FEDER, UE), BIO2017-85329-R (FEDER, UE)]; Generalitat de Catalunya [SGR17-1020].
dc.format.mimetype application/pdf
dc.identifier.citation Anton B, Besalú M, Fornes O, Bonet J, Molina A, Molina-Fernandez R, De Las Cuevas G, Fernandez-Fuentes N, Oliva B. On the use of direct-coupling analysis with a reduced alphabet of amino acids combined with super-secondary structure motifs for protein fold prediction. NAR Genom Bioinform. 2021;3(2):lqab027. DOI: 10.1093/nargab/lqab027
dc.identifier.doi http://dx.doi.org/10.1093/nargab/lqab027
dc.identifier.issn 2631-9268
dc.identifier.uri http://hdl.handle.net/10230/48806
dc.language.iso eng
dc.publisher Oxford University Press
dc.relation.ispartof NAR Genom Bioinform. 2021;3(2):lqab027
dc.relation.projectID info:eu-repo/grantAgreement/ES/1PE/BIO2014-57518-R
dc.relation.projectID info:eu-repo/grantAgreement/ES/2PE/BIO2017-85329-R
dc.relation.projectID info:eu-repo/grantAgreement/ES/2PE/BIO2017-83591-R
dc.rights © The Author(s) 2021. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.rights.uri http://creativecommons.org/licenses/by-nc/4.0/
dc.title On the use of direct-coupling analysis with a reduced alphabet of amino acids combined with super-secondary structure motifs for protein fold prediction
dc.type info:eu-repo/semantics/article
dc.type.version info:eu-repo/semantics/publishedVersion
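
The two-group alphabet described in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: the record does not give the paper's exact polar/non-polar grouping, so the `NON_POLAR` set is a common textbook choice, and the toy alignment is hypothetical. The coupling score is plain mutual information between alignment columns, a simpler stand-in for the direct information that DCA computes, not the RADI algorithm itself.

```python
from collections import Counter
from math import log

# Assumed grouping for illustration only; the paper's exact grouping
# is not stated in this record.
NON_POLAR = set("AVLIMFWCGP")


def reduce_sequence(seq):
    """Map each residue to 'H' (non-polar) or 'P' (polar); gaps are kept."""
    return "".join(
        "-" if aa == "-" else ("H" if aa in NON_POLAR else "P")
        for aa in seq.upper()
    )


def mutual_information(msa, i, j):
    """Mutual information (in nats) between columns i and j of an alignment.

    This is a simple covariation score, not the direct information of DCA.
    """
    n = len(msa)
    pair_counts = Counter((s[i], s[j]) for s in msa)
    col_i = Counter(s[i] for s in msa)
    col_j = Counter(s[j] for s in msa)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        mi += p_ab * log(p_ab / ((col_i[a] / n) * (col_j[b] / n)))
    return mi


# Hypothetical aligned sequences, all of the same length.
toy_msa = ["AKLE", "VRIE", "SKTD", "TRSD"]
reduced = [reduce_sequence(s) for s in toy_msa]

print(reduced)
print(mutual_information(reduced, 0, 2))
print(mutual_information(reduced, 0, 1))
```

With only two symbols per column, each column pair has at most four joint states to estimate instead of 400, which is one way to see why a compressed alphabet needs fewer aligned sequences.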

Collections

Articles (Hospital del Mar Research Institute)
Articles (Departament de Medicina i Ciències de la Vida)