
dc.contributor.author Kleinhans, Janine
dc.contributor.author Farrús, Mireia
dc.contributor.author Gravano, Agustín
dc.contributor.author Pérez, Juan Manuel
dc.contributor.author Lai, Catherine
dc.contributor.author Wanner, Leo
dc.date.accessioned 2017-08-31T13:34:01Z
dc.date.available 2017-08-31T13:34:01Z
dc.date.issued 2017
dc.identifier.citation Kleinhans J, Farrús M, Gravano A, Pérez JM, Lai C, Wanner L. Using prosody to classify discourse relations. In: Proceedings of the 18th Annual Conference of the International Speech Communication Association (INTERSPEECH 2017); 2017 Aug. 20-24; Stockholm, Sweden. Baixas: ISCA; 2017. p. 778-81. DOI: 10.21437/Interspeech.2017-710
dc.identifier.issn 1990-9772
dc.identifier.uri http://hdl.handle.net/10230/32717
dc.description Paper presented at: The 18th Annual Conference of the International Speech Communication Association (INTERSPEECH 2017), held in Stockholm, Sweden, 20-24 August 2017.
dc.description.abstract This work aims to explore the correlation between the discourse structure of a spoken monologue and its prosody by predicting discourse relations from different prosodic attributes. For this purpose, a corpus of semi-spontaneous monologues in English has been automatically annotated according to the Rhetorical Structure Theory, which models coherence in text via rhetorical relations. From corresponding audio files, prosodic features such as pitch, intensity, and speech rate have been extracted from different contexts of a relation. Supervised classification tasks using Support Vector Machines have been performed to find relationships between prosodic features and rhetorical relations. Preliminary results show that intensity combined with other features extracted from intra- and intersegmental environments is the feature with the highest predictability for a discourse relation. The prediction of rhetorical relations from prosodic features and their combinations is straightforwardly applicable to several tasks such as speech understanding or generation. Moreover, the knowledge of how rhetorical relations should be marked in terms of prosody will serve as a basis to improve speech synthesis applications and make voices sound more natural and expressive.
dc.description.sponsorship This work is part of the KRISTINA project, which has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under the Grant Agreement number 645012. The second author is partially funded by the Spanish Ministry of Economy, Industry and Competitiveness through the Ramón y Cajal program. The third and fourth authors are partially funded by ANPCYT PICT 2014-1561, and the Air Force Office of Scientific Research, Air Force Materiel Command, USAF under Award No. FA9550-15-1-0055.
dc.format.mimetype application/pdf
dc.language.iso eng
dc.publisher International Speech Communication Association (ISCA)
dc.relation.ispartof Proceedings of the 18th Annual Conference of the International Speech Communication Association (INTERSPEECH 2017); 2017 Aug. 20-24; Stockholm, Sweden. [place unknown]: ISCA; 2017. p. 778-81.
dc.rights © ISCA
dc.title Using prosody to classify discourse relations
dc.type info:eu-repo/semantics/conferenceObject
dc.identifier.doi http://dx.doi.org/10.21437/Interspeech.2017-710
dc.subject.keyword Prosody
dc.subject.keyword Discourse structure
dc.subject.keyword RST
dc.subject.keyword Speech synthesis
dc.subject.keyword Support vector machines
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/645012
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.type.version info:eu-repo/semantics/publishedVersion

