FA-nf: a functional annotation pipeline for proteins from non-model organisms implemented in nextflow
- dc.contributor.author Vlasova, Anna
- dc.contributor.author Hermoso Pulido, Antonio
- dc.contributor.author Camara, Francisco
- dc.contributor.author Ponomarenko, Julia
- dc.contributor.author Guigó Serra, Roderic
- dc.date.accessioned 2021-12-09T07:01:52Z
- dc.date.available 2021-12-09T07:01:52Z
- dc.date.issued 2021
- dc.description.abstract Functional annotation adds biologically relevant information to predicted features in genomic sequences and is therefore an important step in any de novo genome sequencing project. It is also useful for proofreading and improving gene structural annotation. Here, we introduce FA-nf, a pipeline implemented in Nextflow, a versatile computational workflow management engine. The pipeline integrates different annotation approaches, such as NCBI BLAST+, DIAMOND, InterProScan, and KEGG. It starts from a protein sequence FASTA file and, optionally, a structural annotation file in GFF format, and produces several files, such as GO assignments, output summaries of the above-mentioned programs, and final annotation reports. The pipeline can easily be split into smaller processes for parallelization and, thanks to software containerization, easily deployed in a Linux computational environment, which helps to ensure full reproducibility.
- dc.description.sponsorship The research leading to these results has received funding from the Plataforma de Recursos Biomoleculares y Bioinformáticos PT 13/0001/0021 from ISCIII, a platform co-funded by the European Regional Development Fund (FEDER) and from the Plan Estatal project funded under grant number PGC2018-094017-B-I00 from the Spanish Ministry of Science and Innovation (AEI/FEDER). We acknowledge support of the Spanish Ministry of Science and Innovation to the EMBL partnership, the Centro de Excelencia Severo Ochoa and the CERCA Programme/Generalitat de Catalunya.
- dc.format.mimetype application/pdf
- dc.identifier.citation Vlasova A, Hermoso Pulido T, Camara F, Ponomarenko J, Guigó R. FA-nf: a functional annotation pipeline for proteins from non-model organisms implemented in nextflow. Genes (Basel). 2021;12(10):1645. DOI: 10.3390/genes12101645
- dc.identifier.doi http://dx.doi.org/10.3390/genes12101645
- dc.identifier.issn 2073-4425
- dc.identifier.uri http://hdl.handle.net/10230/49158
- dc.language.iso eng
- dc.publisher MDPI
- dc.relation.ispartof Genes (Basel). 2021;12(10):1645
- dc.relation.projectID info:eu-repo/grantAgreement/ES/2PE/PGC2018-094017-B-I00
- dc.rights © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.rights.uri http://creativecommons.org/licenses/by/4.0/
- dc.subject.keyword Containerization
- dc.subject.keyword Functional annotation
- dc.subject.keyword Pipeline
- dc.subject.keyword Reproducibility
- dc.title FA-nf: a functional annotation pipeline for proteins from non-model organisms implemented in nextflow
- dc.type info:eu-repo/semantics/article
- dc.type.version info:eu-repo/semantics/publishedVersion