Paragraph prosodic patterns to enhance text-to-speech naturalness

Citation

  • Peiró-Lilja A, Farrús M. Paragraph prosodic patterns to enhance text-to-speech naturalness. In: Klessa K, Bachan J, Wagner A, Karpiński M, Śledziński D. Proceedings of the 9th International Conference on Speech Prosody; 2018 June 13-16; Poznań, Poland. [Lous Tourils]: ISCA; 2018. p. 512-6. DOI: 10.21437/SpeechProsody.2018-124

Permanent link

Description

  • Abstract

    Speech synthesis has reached a reasonably high quality in recent years. However, there is still room for improvement in terms of naturalness and expressiveness when dealing with large multisentential discourse, since most text-to-speech synthesizers do not fully take into account the prosodic differences that have been observed in discourse units such as paragraphs. This work presents an implementation of paragraph-based prosodic patterns in the open-source MARYTTS platform, enriching its prosody output by means of intra- and inter-paragraph prosodic features. The set of characteristics includes pitch decay, pitch range and speech rate variation (as intra-paragraph features), as well as paragraph break pauses and speech rate variation (as inter-paragraph features), previously analyzed in a large set of TED Talks and read-speech sections of the Spoken Wikipedia Corpus. The perception tests, performed with both English and German parametric voices, suggest that paragraph-based features should be further studied and taken into account in future implementations for synthesizing large-discourse speech.
  • Description

    Paper presented at: the 9th International Conference on Speech Prosody 2018, held June 13-16 in Poznań, Poland.