Welcome to the UPF Digital Repository

CROSSMAPPER: estimating cross-mapping rates and optimizing experimental design in multi-species sequencing studies

dc.contributor.author Hovhannisyan, Hrant, 1992-
dc.contributor.author Hafez Khafaga, Ahmed Ibrahem, 1987-
dc.contributor.author Llorens, Carlos, 1968-
dc.contributor.author Gabaldón Estevan, Juan Antonio, 1973-
dc.date.accessioned 2020-03-17T07:33:35Z
dc.date.available 2020-03-17T07:33:35Z
dc.date.issued 2020
dc.identifier.citation Hovhannisyan H, Hafez A, Llorens C, Gabaldón T. CROSSMAPPER: estimating cross-mapping rates and optimizing experimental design in multi-species sequencing studies. Bioinformatics. 2020 Feb 1; 36(3): 925-927. DOI: 10.1093/bioinformatics/btz626
dc.identifier.issn 1367-4803
dc.identifier.uri http://hdl.handle.net/10230/43908
dc.description.abstract MOTIVATION: Numerous sequencing studies, including transcriptomics of host-pathogen systems, sequencing of hybrid genomes, xenografts, mixed species systems, metagenomics and meta-transcriptomics, involve samples containing genetic material from divergent organisms. A crucial step in these studies is identifying the organism from which each sequencing read originated, and the experimental design should aim to minimize biases caused by cross-mapping of reads to incorrect source genomes. Additionally, pooling sufficiently different genetic material into a single sequencing library could significantly reduce experimental costs, but requires careful planning and assessment of the impact of cross-mapping. With these applications in mind, we designed Crossmapper, to our knowledge the first tool able to assess cross-mapping prior to sequencing, thereby allowing optimization of experimental design. RESULTS: Using any combination of reference genomes, Crossmapper simulates reads, maps those reads back to the pool of references, and then quantifies and reports the cross-mapping rates for each organism. Crossmapper performs these analyses with numerous user-specified parameters, including, among others, read length, read layout, coverage, mapping parameters, and data type (genomic or transcriptomic). Additionally, it outputs the results in highly interactive and publication-ready reports. This allows the user to perform multiple comparisons at once and choose the experimental setup that minimizes cross-mapping rates. Moreover, Crossmapper can be used for resource optimization in sequencing facilities by pooling different samples into one sequencing library. AVAILABILITY AND IMPLEMENTATION: Crossmapper is a command line tool implemented in Python 3.6 and available as a conda package, allowing effortless installation. The source code, detailed information and a step-by-step tutorial are available on our GitHub page https://github.com/Gabaldonlab/crossmapper. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
dc.description.sponsorship This work was supported by the Spanish Ministry of Economy, Industry and Competitiveness (MEIC) for the EMBL partnership and the grant ‘Centro de Excelencia Severo Ochoa’ SEV-2012-0208 co-funded by the European Regional Development Fund (ERDF); from the CERCA Programme/Generalitat de Catalunya; from the Catalan Research Agency (AGAUR) SGR857 and grants from the European Union’s Horizon 2020 research and innovation programme under the grant agreement ERC-2016-724173 and the Marie Sklodowska-Curie grant agreement No H2020-MSCA-ITN-2014-642095. The group also receives support from an INB Grant (PT17/0009/0023–ISCIII-SGEFI/ERDF).
dc.format.mimetype application/pdf
dc.language.iso eng
dc.publisher Oxford University Press
dc.relation.ispartof Bioinformatics. 2020 Feb 1; 36(3): 925-7
dc.rights © 2019 Hrant Hovhannisyan et al. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
dc.rights.uri http://creativecommons.org/licenses/by-nc/4.0/
dc.subject.other Anàlisis de seqüències
dc.subject.other Genètica
dc.subject.other Genomes
dc.title CROSSMAPPER: estimating cross-mapping rates and optimizing experimental design in multi-species sequencing studies
dc.type info:eu-repo/semantics/article
dc.identifier.doi http://dx.doi.org/10.1093/bioinformatics/btz626
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/724173
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/642095
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.type.version info:eu-repo/semantics/publishedVersion
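
Illustrative note: the abstract describes quantifying cross-mapping by simulating reads from known source genomes and mapping them back to the pooled references. The sketch below shows one way such a rate could be tallied once each simulated read has a known source and a mapped reference. It is not Crossmapper's code or interface; the function name `cross_mapping_rates`, the input layout, and the "host"/"pathogen" labels are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def cross_mapping_rates(read_assignments):
    """Tally, per source organism, the fraction of its simulated reads
    that mapped to a different reference.

    read_assignments: iterable of (source, mapped_to) pairs, where
    mapped_to is None for an unmapped read (hypothetical input layout).
    """
    totals = Counter()              # simulated reads per source organism
    crossed = defaultdict(Counter)  # source -> wrong reference -> read count
    for source, mapped_to in read_assignments:
        totals[source] += 1
        # Unmapped reads count toward the total but are not cross-mapped.
        if mapped_to is not None and mapped_to != source:
            crossed[source][mapped_to] += 1
    return {
        source: {
            target: count / totals[source]
            for target, count in crossed[source].items()
        }
        for source in totals
    }


# Toy data with made-up labels: 10 reads simulated per organism.
reads = (
    [("host", "host")] * 8
    + [("host", "pathogen")] * 2
    + [("pathogen", "pathogen")] * 9
    + [("pathogen", None)]
)
rates = cross_mapping_rates(reads)
print(rates)
```

Repeating such a tally for different simulated read lengths or layouts is the kind of comparison the abstract describes for choosing an experimental setup with low cross-mapping.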

