Combining acoustic and linguistic features in phrase-oriented prosody prediction

Domínguez Bajo, Mónica; Farrús, Mireia; Wanner, Leo



dc.contributor.author Domínguez Bajo, Mónica
dc.contributor.author Farrús, Mireia
dc.contributor.author Wanner, Leo
dc.date.accessioned 2016-12-13T16:52:14Z
dc.date.available 2016-12-13T16:52:14Z
dc.date.issued 2016
dc.description Paper presented at Speech Prosody 8, 2016 May 31 - Jun 3; Boston, United States.
dc.description.abstract Intonation is traditionally considered to be the most important prosodic feature, whereupon an important research effort has been devoted to automatic segmentation and labeling of speech samples to grasp intonation cues. A number of studies also show that when duration or intensity are incorporated, automatic prosody labeling is further improved. However, the combination of word level acoustic features still attains poor results when machine learning techniques are applied on annotated corpora to derive intonation for speech synthesis applications. To address this problem, we present an experimental set-up for the development of a hierarchical prosodic structure model which combines linguistic features, including information structure, and three acoustic elements (intensity, pitch and duration). We show empirically that this combination leads to a considerably more accurate representation of prosody and, consequently, a more reliable automatic labeling of speech corpora for machine learning.
dc.description.sponsorship This work is part of a project that has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under the Grant Agreement number H2020-RIA-645012. The second author is partially funded by a grant from the Spanish Ministry of Economy and Competitivity in the framework of the Juan de la Cierva fellowship program.
dc.format.mimetype application/pdf
dc.identifier.citation Domínguez M, Farrús M, Wanner L. Combining acoustic and linguistic features in phrase-oriented prosody prediction. In: Proceedings of Speech Prosody 8; 2016 May 31 - Jun 3; Boston, United States. [Boston]: ISCA, 2016. p. 796-800. DOI: 10.21437/SPEECHPROSODY.2016-163
dc.identifier.doi http://dx.doi.org/10.21437/SPEECHPROSODY.2016-163
dc.identifier.uri http://hdl.handle.net/10230/27754
dc.language.iso eng
dc.publisher International Speech Communication Association (ISCA)
dc.relation.ispartof Proceedings of Speech Prosody 8; 2016 May 31 - Jun 3; Boston, United States. [Boston]: ISCA, 2016. p. 796-800.
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/645012
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.subject.keyword Information structure
dc.subject.keyword Thematicity
dc.subject.keyword Prosodic label
dc.subject.keyword Prosodic phrase
dc.subject.keyword Prosodic word
dc.subject.keyword ToBI
dc.subject.keyword Hierarchical prosodic structure
dc.subject.keyword Z-score
dc.subject.keyword Acoustic parameter
dc.title Combining acoustic and linguistic features in phrase-oriented prosody prediction
dc.type info:eu-repo/semantics/conferenceObject
dc.type.version info:eu-repo/semantics/publishedVersion

Collections

Conferences (Department of Information and Communication Technologies)
OpenAIRE Documents (Open Access Infrastructure for Research in Europe)