The Information structure–prosody interface in text-to-speech technologies: an empirical perspective

Domínguez Bajo, Mónica; Farrús, Mireia; Wanner, Leo

Citation

Domínguez M, Farrús M, Wanner L. The Information structure–prosody interface in text-to-speech technologies: an empirical perspective. Corpus Linguistics and Linguistic Theory. 2021 DOI: 10.1515/cllt-2020-0008

Permanent link

http://hdl.handle.net/10230/47009

Description

Abstract
The correspondence between the communicative intention of a speaker in terms of Information Structure and the way this speaker reflects communicative aspects by means of prosody has been a fruitful field of study in Linguistics. However, text-to-speech applications still lack the variability and richness with which humans express their communicative intentions in speech. Some attempts were made in the past to model one aspect of Information Structure, namely thematicity, for application to intonation generation in text-to-speech technologies. Yet these applications suffer from two limitations: (i) they draw upon a small number of made-up simple question-answer pairs rather than on real (spoken or written) corpus material; and (ii) they do not explore whether any other interpretation would better suit a wider range of textual genres beyond dialogues. In this paper, two different interpretations of thematicity in the field of speech technologies are examined: the state-of-the-art binary (and flat) theme-rheme, and the hierarchical thematicity defined by Igor Mel'čuk within the Meaning-Text Theory. The outcome of the experiments on a corpus of native speakers of US English suggests that the latter interpretation of thematicity has a versatile implementation potential for text-to-speech applications of the Information Structure–prosody interface.
DOI
http://dx.doi.org/10.1515/cllt-2020-0008
Collections
Articles (Departament de Tecnologies de la Informació i les Comunicacions)
Documents OpenAIRE (Open Access Infrastructure for Research in Europe)
