Paragraph-based prosodic cues for speech synthesis applications

Farrús, Mireia; Lai, Catherine; Moore, Johanna D.

dc.contributor.author Farrús, Mireia
dc.contributor.author Lai, Catherine
dc.contributor.author Moore, Johanna D.
dc.date.accessioned 2016-12-22T18:46:33Z
dc.date.available 2016-12-22T18:46:33Z
dc.date.issued 2016
dc.description Paper presented at: Speech Prosody 2016; 2016 May 31-June 3; Boston (MA, USA)
dc.description.abstract Speech synthesis has improved in both expressiveness and voice quality in recent years. However, obtaining full expressiveness when dealing with large multi-sentential synthesized discourse is still a challenge, since speech synthesizers do not take into account the prosodic differences that have been observed in discourse units such as paragraphs. The current study validates and extends previous work by analyzing the prosody of paragraph units in a large and diverse corpus of TED Talks using automatically extracted F0, intensity and timing features. In addition, a series of classification experiments was performed in order to identify which features are consistently used to distinguish paragraph breaks. The results show significant differences in prosody related to paragraph position. Moreover, the classification experiments show that boundary features such as pause duration and differences in F0 and intensity levels are the most consistent cues in marking paragraph boundaries. This suggests that these features should be taken into account when generating spoken discourse in order to improve naturalness and expressiveness.
dc.description.sponsorship Part of this work has received funding from the EU’s Horizon 2020 Research and Innovation Programme under the GA H2020-RIA-645012. The first author is partially funded by the Spanish Ministry of Economy and Competitivity through the Juan de la Cierva program and a José Castillejo mobility grant.
dc.format.mimetype application/pdf
dc.identifier.citation Farrús M, Lai C, Moore JD. Paragraph-based prosodic cues for speech synthesis applications. In: Barnes J, Brugos A, Shattuck-Hufnagel S, Veilleux N, editors. Speech Prosody 2016; 2016 May 31-June 3; Boston (MA, USA). [place unknown]: International Speech Communication Association; 2016. p. 1143-7. DOI: 10.21437/SpeechProsody.2016-235
dc.identifier.doi http://dx.doi.org/10.21437/SpeechProsody.2016-235
dc.identifier.issn 2333-2042
dc.identifier.uri http://hdl.handle.net/10230/27837
dc.language.iso eng
dc.publisher International Speech Communication Association
dc.relation.ispartof Barnes J, Brugos A, Shattuck-Hufnagel S, Veilleux N, editors. Speech Prosody 2016; 2016 May 31-June 3; Boston (MA, USA). [place unknown]: International Speech Communication Association; 2016. p. 1143-7.
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/645012
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.subject.keyword Discourse unit
dc.subject.keyword Prosodic cue
dc.subject.keyword Paragraph boundary
dc.subject.keyword Speech synthesis
dc.title Paragraph-based prosodic cues for speech synthesis applications
dc.type info:eu-repo/semantics/conferenceObject
dc.type.version info:eu-repo/semantics/publishedVersion

Collections

Conferences (Department of Information and Communication Technologies)
OpenAIRE Documents (Open Access Infrastructure for Research in Europe)