Welcome to the UPF Digital Repository

Thematicity-based prosody enrichment for text-to-speech applications

dc.contributor.author Domínguez Bajo, Mónica
dc.contributor.author Burga Díaz, Alicia
dc.contributor.author Farrús, Mireia
dc.contributor.author Wanner, Leo
dc.date.accessioned 2018-06-14T16:52:59Z
dc.date.available 2018-06-14T16:52:59Z
dc.date.issued 2018
dc.identifier.citation Domínguez M, Burga A, Farrús M, Wanner L. Thematicity-based prosody enrichment for text-to-speech applications. In: Klessa K, Bachan J, Wagner A, Karpiński M, Śledziński D. Proceedings of the 9th International Conference on Speech Prosody; 2018 June 13-16; Poznań, Poland. [Lous Tourils]: ISCA; 2018. p. 612-6. DOI: 10.21437/SpeechProsody.2018-119
dc.identifier.uri http://hdl.handle.net/10230/34905
dc.description Paper presented at: the 9th International Conference on Speech Prosody 2018, held 13-16 June in Poznań, Poland.
dc.description.abstract Theoretical studies on the information structure–prosody interface argue that the content packaged in terms of theme and rheme correlates with the intonation of the corresponding sentence with regard to rising and falling patterns (L*+H LH% and H* LL%, respectively). When such a correspondence is used to derive prosody in text-to-speech applications, ToBI labels are often statically mapped to acoustic parameters. Such an approach is insufficient to solve the problem of monotonous synthetic voices for two reasons: it is repetitive with respect to prosody enrichment, and a flat binary theme–rheme representation cannot properly describe long, complex sentences. In this paper, we introduce a methodology for a more versatile thematicity-based prosody enrichment based on: (i) a hierarchical tripartite thematicity model as proposed in the Meaning–Text Theory, and (ii) a corpus-based approach for the automatic extraction of acoustic parameters (fundamental frequency, breaks and speech rate) that are mapped to a varied range of prosody control tags of the synthesized speech. Such a prosody enrichment has been shown to yield better results in a perception test when implemented in a TTS system.
dc.description.sponsorship This work is part of the KRISTINA project, which has received funding from the European Union's Horizon 2020 Research and Innovation Programme under the Grant Agreement number H2020-RIA-645012. It has also been partly supported by the Spanish Ministry of Economy and Competitiveness under the Maria de Maeztu Units of Excellence Programme (MDM-2015-0502). The second author is partially funded by the Spanish Ministry of Economy and Competitiveness through the Ramón y Cajal program.
dc.format.mimetype application/pdf
dc.language.iso eng
dc.publisher International Speech Communication Association (ISCA)
dc.relation.ispartof Klessa K, Bachan J, Wagner A, Karpiński M, Śledziński D. Proceedings of the 9th International Conference on Speech Prosody; 2018 June 13-16; Poznań, Poland. [Lous Tourils]: ISCA; 2018. p. 612-6.
dc.rights © 2018 ISCA.
dc.title Thematicity-based prosody enrichment for text-to-speech applications
dc.type info:eu-repo/semantics/conferenceObject
dc.identifier.doi http://dx.doi.org/10.21437/SpeechProsody.2018-119
dc.subject.keyword Prosody
dc.subject.keyword Information structure
dc.subject.keyword Theme
dc.subject.keyword Rheme
dc.subject.keyword TTS
dc.subject.keyword SSML
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/645012
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.type.version info:eu-repo/semantics/publishedVersion
