A neural parametric singing synthesizer modeling timbre and expression from natural songs

Blaauw, Merlijn; Bonada, Jordi, 1973-

dc.contributor.author Blaauw, Merlijn
dc.contributor.author Bonada, Jordi, 1973-
dc.date.accessioned 2019-05-23T14:33:36Z
dc.date.available 2019-05-23T14:33:36Z
dc.date.issued 2017
dc.description.abstract We recently presented a new model for singing synthesis based on a modified version of the WaveNet architecture. Instead of modeling raw waveform, we model features produced by a parametric vocoder that separates the influence of pitch and timbre. This allows conveniently modifying pitch to match any target melody, facilitates training on more modest dataset sizes, and significantly reduces training and generation times. Nonetheless, compared to modeling waveform directly, ways of effectively handling higher-dimensional outputs, multiple feature streams and regularization become more important with our approach. In this work, we extend our proposed system to include additional components for predicting F0 and phonetic timings from a musical score with lyrics. These expression-related features are learned together with timbrical features from a single set of natural songs. We compare our method to existing statistical parametric, concatenative, and neural network-based approaches using quantitative metrics as well as listening tests.
dc.description.sponsorship This work is partially supported by the Spanish Ministry of Economy and Competitiveness under the CASAS project (TIN2015-70816-R).
dc.format.mimetype application/pdf
dc.identifier.citation Blaauw M, Bonada J. A neural parametric singing synthesizer modeling timbre and expression from natural songs. Appl Sci. 2017;7(12):1313. 23 p. DOI: 10.3390/app7121313
dc.identifier.doi http://dx.doi.org/10.3390/app7121313
dc.identifier.issn 2076-3417
dc.identifier.uri http://hdl.handle.net/10230/37284
dc.language.iso eng
dc.publisher MDPI
dc.relation.ispartof Applied Sciences. 2017;7(12):1313. 23 p.
dc.relation.projectID info:eu-repo/grantAgreement/ES/1PE/TIN2015-70816-R
dc.rights © 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.rights.uri http://creativecommons.org/licenses/by/4.0/
dc.subject.keyword Singing synthesis
dc.subject.keyword Machine learning
dc.subject.keyword Deep learning
dc.subject.keyword Conditional generative models
dc.subject.keyword Autoregressive models
dc.title A neural parametric singing synthesizer modeling timbre and expression from natural songs
dc.type info:eu-repo/semantics/article
dc.type.version info:eu-repo/semantics/publishedVersion

Collections

Articles (Department of Information and Communication Technologies)