Millora de la naturalitat i l’expressivitat en la síntesi de veu

Rabasseda Riba, Jordi

dc.contributor.author Rabasseda Riba, Jordi
dc.date.accessioned 2017-10-11T12:28:00Z
dc.date.available 2017-10-11T12:28:00Z
dc.date.issued 2017-10-11
dc.description Bachelor's thesis in Audiovisual Systems
dc.description Supervisor: Mireia Farrús i Cabeceran
dc.description.abstract Human-computer interaction has increased exponentially in recent years. One example of this growth can be seen in the field of speech processing, specifically in speech synthesis. This work addresses speech synthesis within the framework of KRISTINA, a project oriented to the medical sector that develops a conversational avatar using synthesized speech to facilitate communication with patients who have difficulty understanding the native language of the country where they live. Speech synthesizers, or Text-to-Speech (TTS) systems, tend to sound monotonous and lack naturalness. One of the aims of this work is to generate a better voice for the KRISTINA avatar, giving it more naturalness and expressiveness so that it comes as close as possible to a human voice. We worked on prosody at the paragraph level and at the level of communicative structure, using several speech synthesizers. By analyzing and applying different speech markup languages used to modify certain prosodic characteristics related to intonation, duration and intensity, we achieved improvements in the naturalness and expressiveness of the synthesized voice.
dc.format.mimetype application/pdf
dc.identifier.uri http://hdl.handle.net/10230/32925
dc.language.iso cat
dc.rights Attribution-NonCommercial-NoDerivs 3.0 Spain
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.subject.keyword Speech synthesis
dc.subject.keyword Text-to-speech
dc.subject.keyword Markup languages
dc.subject.other Speech processing
dc.title Millora de la naturalitat i l’expressivitat en la síntesi de veu
dc.type info:eu-repo/semantics/bachelorThesis

Collections

Bachelor's Degree in Audiovisual Systems Engineering. Bachelor's theses