The Information structure–prosody interface in text-to-speech technologies: an empirical perspective
- dc.contributor.author Domínguez Bajo, Mónica
- dc.contributor.author Farrús, Mireia
- dc.contributor.author Wanner, Leo
- dc.date.accessioned 2021-04-01T10:07:14Z
- dc.date.issued 2021
- dc.description.abstract The correspondence between the communicative intention of a speaker in terms of Information Structure and the way this speaker reflects communicative aspects by means of prosody has been a fruitful field of study in Linguistics. However, text-to-speech applications still lack the variability and richness found in human speech in terms of how humans display their communication skills. Some attempts were made in the past to model one aspect of Information Structure, namely thematicity, for its application to intonation generation in text-to-speech technologies. Yet these applications suffer from two limitations: (i) they draw upon a small number of made-up simple question-answer pairs rather than on real (spoken or written) corpus material; and (ii) they do not explore whether any other interpretation would better suit a wider range of textual genres beyond dialogues. In this paper, two different interpretations of thematicity in the field of speech technologies are examined: the state-of-the-art binary (and flat) theme-rheme, and the hierarchical thematicity defined by Igor Mel’čuk within the Meaning-Text Theory. The outcome of the experiments on a corpus of native speakers of US English suggests that the latter interpretation of thematicity has a versatile implementation potential for text-to-speech applications of the Information Structure–prosody interface.
- dc.description.sponsorship This work has been partially funded by the European Commission in the context of its H2020 Programme under the contract numbers H2020-645012-RIA (KRISTINA) and H2020-870930-IA (WELCOME). The second author has been funded by the Agencia Estatal de Investigación (AEI), Ministerio de Ciencia, Innovación y Universidades and the Fondo Social Europeo (FSE), grant RYC2015-17239 (AEI/FSE, UE).
- dc.format.mimetype application/pdf
- dc.identifier.citation Domínguez M, Farrús M, Wanner L. The Information structure–prosody interface in text-to-speech technologies: an empirical perspective. Corpus Linguistics and Linguistic Theory. 2021 DOI: 10.1515/cllt-2020-0008
- dc.identifier.doi http://dx.doi.org/10.1515/cllt-2020-0008
- dc.identifier.issn 1613-7027
- dc.identifier.uri http://hdl.handle.net/10230/47009
- dc.language.iso eng
- dc.publisher De Gruyter
- dc.relation.ispartof Corpus Linguistics and Linguistic Theory. 2021
- dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/645012
- dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/870930
- dc.relation.projectID info:eu-repo/grantAgreement/ES/1PE/RYC2015-17239
- dc.rights © De Gruyter. Published version available at https://www.degruyter.com/document/doi/10.1515/cllt-2020-0008 (http://dx.doi.org/10.1515/cllt-2020-0008)
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.subject.keyword Communicative structure
- dc.subject.keyword Information structure
- dc.subject.keyword Intonation
- dc.subject.keyword Prosody
- dc.subject.keyword Rheme
- dc.subject.keyword Specifier
- dc.subject.keyword Thematicity
- dc.subject.keyword Theme
- dc.subject.keyword ToBI
- dc.title The Information structure–prosody interface in text-to-speech technologies: an empirical perspective
- dc.type info:eu-repo/semantics/article
- dc.type.version info:eu-repo/semantics/acceptedVersion