Use of ChIP-Seq data for the design of a multiple promoter-alignment method

  • dc.contributor.author Erb, Iona [ca]
  • dc.contributor.author González-Vallinas Rostes, Juan, 1983- [ca]
  • dc.contributor.author Bussotti, Giovanni, 1983- [ca]
  • dc.contributor.author Blanco, Enrique [ca]
  • dc.contributor.author Eyras Jiménez, Eduardo [ca]
  • dc.contributor.author Notredame, Cedric [ca]
  • dc.date.accessioned 2015-04-07T09:36:10Z
  • dc.date.available 2015-04-07T09:36:10Z
  • dc.date.issued 2012 [ca]
  • dc.description.abstract We address the challenge of regulatory sequence alignment with a new method, Pro-Coffee, a multiple aligner specifically designed for homologous promoter regions. Pro-Coffee uses a dinucleotide substitution matrix estimated on alignments of functional binding sites from TRANSFAC. We designed a validation framework using several thousand families of orthologous promoters. This dataset was used to evaluate the accuracy for predicting true human orthologs among their paralogs. We found that whereas other methods achieve on average 73.5% accuracy, and 77.6% when trained on that same dataset, the figure goes up to 80.4% for Pro-Coffee. We then applied a novel validation procedure based on multi-species ChIP-seq data. Trained and untrained methods were tested for their capacity to correctly align experimentally detected binding sites. Whereas the average number of correctly aligned sites for two transcription factors is 284 for default methods and 316 for trained methods, Pro-Coffee achieves 331, 16.5% above the default average. We find a high correlation between a method's performance when classifying orthologs and its ability to correctly align proven binding sites. Not only does this have interesting biological consequences, it also allows us to conclude that any method that is trained on the ortholog data set will result in functionally more informative alignments. [en]
  • dc.description.sponsorship Funding for open access charge: The Centre for Genomic Regulation (CRG) (to C.N.); the Plan Nacional (BFU2008-00419) (to I.E. and E.R.); ‘La Caixa’ international PhD program fellowships (to G.B.). This work was also co-financed by the European Commission, within the 7th Framework Programme (Grant Agreement KBBE-2A-222664) (‘Quantomics’). [en]
  • dc.format.mimetype application/pdf [ca]
  • dc.identifier.citation Erb I, Gonzalez-Vallinas JR, Bussotti G, Blanco E, Eyras E, Notredame C. Use of ChIP-Seq data for the design of a multiple promoter-alignment method. Nucleic Acids Research. 2012; 40(7): e52. DOI 10.1093/nar/gkr1292 [ca]
  • dc.identifier.doi http://dx.doi.org/10.1093/nar/gkr1292
  • dc.identifier.issn 0305-1048 [ca]
  • dc.identifier.uri http://hdl.handle.net/10230/23341
  • dc.language.iso eng [ca]
  • dc.publisher Oxford University Press [ca]
  • dc.relation.ispartof Nucleic Acids Research. 2012;40(7):e52
  • dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/222664 [ca]
  • dc.relation.projectID info:eu-repo/grantAgreement/ES/3PN/BFU2008-00419
  • dc.rights © Erb I, Gonzalez-Vallinas JR, Bussotti G, Blanco E, Eyras E, Notredame C [2012]. Published by Oxford University Press. This is an Open Access article distributed under the terms of a Creative Commons Attribution License [ca]
  • dc.rights.accessRights info:eu-repo/semantics/openAccess [ca]
  • dc.rights.uri http://creativecommons.org/licenses/by-nc/3.0
  • dc.subject.other Molecular evolution (Evolució molecular) [ca]
  • dc.subject.other Nucleotide sequence (Seqüència de nucleòtids) [ca]
  • dc.subject.other Genomics (Genòmica) [ca]
  • dc.title Use of ChIP-Seq data for the design of a multiple promoter-alignment method [en]
  • dc.type info:eu-repo/semantics/article [ca]
  • dc.type.version info:eu-repo/semantics/publishedVersion [ca]
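
The abstract reports that Pro-Coffee correctly aligns 331 binding sites, "16.5% above the default average" of 284. The short sketch below only reproduces that arithmetic from the figures quoted in the abstract; the function and variable names are illustrative and do not come from the paper or from Pro-Coffee itself.

```python
def relative_gain(new, baseline):
    """Return the percentage by which `new` exceeds `baseline`."""
    return 100.0 * (new - baseline) / baseline


# Average numbers of correctly aligned sites, as quoted in the abstract.
default_sites = 284    # default (untrained) methods
trained_sites = 316    # methods trained on the ortholog data set
procoffee_sites = 331  # Pro-Coffee

gain_vs_default = relative_gain(procoffee_sites, default_sites)
gain_vs_trained = relative_gain(procoffee_sites, trained_sites)

print(f"Gain over default methods: {gain_vs_default:.1f}%")
print(f"Gain over trained methods: {gain_vs_trained:.1f}%")
```

The first figure matches the 16.5% stated in the abstract; the second is not stated there and is derived here only from the quoted counts.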