Welcome to the UPF Digital Repository

Analytic combinatorics for computing seeding probabilities

Show simple item record

dc.contributor.author Filion, Guillaume
dc.date.accessioned 2019-02-28T08:45:08Z
dc.date.available 2019-02-28T08:45:08Z
dc.date.issued 2018
dc.identifier.citation Filion GJ. Analytic combinatorics for computing seeding probabilities. Algorithms. 2018;11(1):3. DOI: 10.3390/a11010003
dc.identifier.issn 1999-4893
dc.identifier.uri http://hdl.handle.net/10230/36693
dc.description.abstract Seeding heuristics are the most widely used strategies to speed up sequence alignment in bioinformatics. Such strategies are most successful if they are calibrated, so that the speed-versus-accuracy trade-off can be properly tuned. In the widely used case of read mapping, it has been so far impossible to predict the success rate of competing seeding strategies for lack of a theoretical framework. Here, we present an approach to estimate such quantities based on the theory of analytic combinatorics. The strategy is to specify a combinatorial construction of reads where the seeding heuristic fails, translate this specification into a generating function using formal rules, and finally extract the probabilities of interest from the singularities of the generating function. The generating function can also be used to set up a simple recurrence to compute the probabilities with greater precision. We use this approach to construct simple estimators of the success rate of the seeding heuristic under different types of sequencing errors, and we show that the estimates are accurate in practical situations. More generally, this work shows novel strategies based on analytic combinatorics to compute probabilities of interest in bioinformatics.
dc.description.sponsorship I acknowledge the financial support of the Spanish Ministry of Economy and Competitiveness (‘Centro de Excelencia Severo Ochoa 2013-2017’, Plan Nacional BFU2012-37168), of the CERCA (Centres de Recerca de Catalunya) Programme / Generalitat de Catalunya, and of the European Research Council (Synergy Grant 609989).
dc.format.mimetype application/pdf
dc.language.iso eng
dc.publisher MDPI
dc.relation.ispartof Algorithms. 2018;11(1):3
dc.rights © 2018 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/)
dc.rights.uri http://creativecommons.org/licenses/by/4.0/
dc.title Analytic combinatorics for computing seeding probabilities
dc.type info:eu-repo/semantics/article
dc.identifier.doi http://dx.doi.org/10.3390/a11010003
dc.subject.keyword Analytic combinatorics
dc.subject.keyword Bioinformatics
dc.subject.keyword Seeding sequence alignment
dc.subject.keyword Generating functions
dc.relation.projectID info:eu-repo/grantAgreement/ES/3PN/BFU2012-37168
dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/609989
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.type.version info:eu-repo/semantics/publishedVersion

Thumbnail

This item appears in the following Collection(s)

Show simple item record

Search DSpace


Advanced Search

Browse

My Account

Statistics

In collaboration with Compliant to Partaking