Data efficient voice cloning for neural singing synthesis

Citation

Blaauw M, Bonada J, Daido R. Data efficient voice cloning for neural singing synthesis. In: 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2019 May 12-17; Brighton, United Kingdom. New Jersey: Institute of Electrical and Electronics Engineers; 2019. p. 6840-4. DOI: 10.1109/ICASSP.2019.8682656

Abstract
There are many use cases in singing synthesis where creating voices from small amounts of data is desirable. In text-to-speech there have been several promising results that apply voice cloning techniques to modern deep learning based models. In this work, we adapt one such technique to the case of singing synthesis. By leveraging data from many speakers to first create a multispeaker model, small amounts of target data can then efficiently adapt the model to new unseen voices. We evaluate the system using listening tests across a number of different use cases, languages and kinds of data.
Description
Paper presented at the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), held May 12-17, 2019 in Brighton, England.
DOI
http://dx.doi.org/10.1109/ICASSP.2019.8682656
Collections
Conferences (Department of Information and Communication Technologies)
