Multiple sequence alignment computation using the T-Coffee regressive algorithm implementation


  • dc.contributor.author Garriga, Edgar
  • dc.contributor.author Di Tommaso, Paolo
  • dc.contributor.author Magis, Cedrik
  • dc.contributor.author Erb, Ionas
  • dc.contributor.author Mansouri, Leila
  • dc.contributor.author Baltzis, Athanasios
  • dc.contributor.author Floden, Evan, 1985-
  • dc.contributor.author Notredame, Cedric
  • dc.date.accessioned 2021-03-15T11:48:44Z
  • dc.date.issued 2021
  • dc.description.abstract Many fields of biology rely on the inference of accurate multiple sequence alignments (MSAs) of biological sequences. Unfortunately, the problem of assembling an MSA is NP-complete, which limits computation to approximate solutions obtained with heuristics. The progressive algorithm is one of the most popular frameworks for the computation of MSAs. It involves pre-clustering the sequences and aligning them starting with the most similar ones. The scalability of this framework is limited, especially with respect to accuracy. We present here an alternative approach named the regressive algorithm. In this framework, sequences are first clustered and then aligned starting with the most distantly related ones. This approach has been shown to greatly improve accuracy during scale-up, especially on datasets featuring 10,000 sequences or more. Another benefit is the possibility of integrating third-party clustering methods and third-party MSA aligners. The regressive algorithm has been tested on up to 1.5 million sequences; its implementation is available in the T-Coffee package.
  • dc.format.mimetype application/pdf
  • dc.identifier.citation Garriga E, Di Tommaso P, Magis C, Erb I, Mansouri L, Baltzis A, et al. Multiple sequence alignment computation using the T-Coffee regressive algorithm implementation. Methods Mol Biol. 2021;2231:89-97. DOI: 10.1007/978-1-0716-1036-7_6
  • dc.identifier.doi http://dx.doi.org/10.1007/978-1-0716-1036-7_6
  • dc.identifier.issn 1064-3745
  • dc.identifier.uri http://hdl.handle.net/10230/46777
  • dc.language.iso eng
  • dc.publisher Humana Press (Springer Imprint)
  • dc.relation.ispartof Methods in Molecular Biology. 2021;2231:89-97
  • dc.rights © Springer. The final publication is available at Springer via http://dx.doi.org/10.1007/978-1-0716-1036-7_6
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.subject.other Sequence alignment (Bioinformatics)
  • dc.title Multiple sequence alignment computation using the T-Coffee regressive algorithm implementation
  • dc.type info:eu-repo/semantics/article
  • dc.type.version info:eu-repo/semantics/acceptedVersion
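
The abstract contrasts two orders of work on a clustered set of sequences: progressive (most similar first) and regressive (most distantly related first). The toy sketch below illustrates only that ordering; it is not the T-Coffee implementation. The sequences, the guide tree, the subset size `n`, the helper names, and the choice of the longest sequence as a cluster's representative are all assumptions made for illustration, and no actual alignment is computed.

```python
from collections import deque

# Hypothetical toy data: sequence names mapped to sequences.
SEQS = {
    "a": "MKTAYIAK",
    "b": "MKTAYIA",
    "c": "MKSAYLAKQR",
    "d": "MKSAYL",
    "e": "MQTPYIGKQRLL",
    "f": "MQTPYIG",
}

# Hypothetical binary guide tree (the clustering); leaves are sequence names.
TREE = ((("a", "b"), ("c", "d")), ("e", "f"))


def leaves(node):
    """Return the leaf names under a node, left to right."""
    if isinstance(node, str):
        return [node]
    return leaves(node[0]) + leaves(node[1])


def representative(node, seqs):
    """Pick the longest sequence under a node (an assumption of this sketch)."""
    return max(leaves(node), key=lambda name: len(seqs[name]))


def split_into_children(node, n):
    """Expand a node breadth-first until it has been split into up to n parts."""
    frontier = deque([node])
    done = []  # leaves that cannot be split further
    while frontier and len(frontier) + len(done) < n:
        current = frontier.popleft()
        if isinstance(current, str):
            done.append(current)
        else:
            frontier.extend(current)
    return done + list(frontier)


def regressive_subsets(tree, seqs, n):
    """List the representative subsets to align, starting at the root."""
    subsets = []
    queue = deque([tree])
    while queue:
        node = queue.popleft()
        children = split_into_children(node, n)
        if len(children) < 2:
            continue  # a single leaf needs no alignment
        subsets.append([representative(child, seqs) for child in children])
        queue.extend(child for child in children if not isinstance(child, str))
    return subsets


def progressive_merges(tree):
    """List the merges of a progressive pass, from the leaves up to the root."""
    if isinstance(tree, str):
        return []
    left, right = tree
    merges = progressive_merges(left) + progressive_merges(right)
    return merges + [(leaves(left), leaves(right))]


if __name__ == "__main__":
    print("regressive:", regressive_subsets(TREE, SEQS, n=2))
    print("progressive:", progressive_merges(TREE))
```

In this sketch the regressive pass starts with the representatives of the two most distant clusters (`c` and `e`) and works down towards the leaves, whereas the progressive pass starts with the closest pair (`a` and `b`) and ends at the root. Each regressive subset shares one representative with its parent subset, which is what would let sub-alignments produced by any third-party aligner be stitched together.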