CARGO: effective format-free compressed storage of genomic information
- dc.contributor.author Roguski, Łukasz
- dc.contributor.author Ribeca, Paolo
- dc.date.accessioned 2024-12-09T07:24:55Z
- dc.date.available 2024-12-09T07:24:55Z
- dc.date.issued 2016
- dc.description Includes supplementary materials for the online appendix.
- dc.description.abstract The recent super-exponential growth in the amount of sequencing data generated worldwide has brought techniques for compressed storage into focus. Most available solutions, however, are strictly tied to specific bioinformatics formats, sometimes inheriting suboptimal design choices from them; this hinders flexible and effective data sharing. Here, we present CARGO (Compressed ARchiving for GenOmics), a high-level framework to automatically generate software systems optimized for the compressed storage of arbitrary types of large genomic data collections. Straightforward applications of our approach to FASTQ and SAM archives require a few lines of code, produce solutions that match and sometimes outperform specialized format-tailored compressors, and scale well to multi-TB datasets. All CARGO software components can be freely downloaded for academic and non-commercial use from http://bio-cargo.sourceforge.net.
- dc.description.sponsorship Biotechnology and Biological Sciences Research Council (BBSRC) [grant BBS/E/I/00001942 to P.R.]; European Union's Seventh Framework Programme (FP7/2007-2013) [grant agreement No. 305444 (RD-Connect) to Ł.R.]; European Union's European Social Fund [INTERKADRA project UDAPOKL-04.01.01-00-014/10-00 to Ł.R.]. Funding for open access charge: CNAG (own funds) and BBSRC (grant for article processing charges to The Pirbright Institute).
- dc.format.mimetype application/pdf
- dc.identifier.citation Roguski Ł, Ribeca P. CARGO: effective format-free compressed storage of genomic information. Nucleic Acids Res. 2016 Jul 8;44(12):e114. DOI: 10.1093/nar/gkw318
- dc.identifier.doi http://dx.doi.org/10.1093/nar/gkw318
- dc.identifier.issn 0305-1048
- dc.identifier.uri http://hdl.handle.net/10230/68943
- dc.language.iso eng
- dc.publisher Oxford University Press
- dc.relation.ispartof Nucleic Acids Research. 2016 Jul 8;44(12):e114
- dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/305444
- dc.rights © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.rights.uri http://creativecommons.org/licenses/by/4.0/
- dc.subject.other Information storage and retrieval systems
- dc.subject.other Genomics
- dc.subject.other Data compression (Computer science)
- dc.title CARGO: effective format-free compressed storage of genomic information
- dc.type info:eu-repo/semantics/article
- dc.type.version info:eu-repo/semantics/publishedVersion