Automatic alignment of long syllables in a cappella Beijing opera

Citation

Dzhambazov G, Yang Y, Caro R, Serra X. Automatic alignment of long syllables in a cappella Beijing opera. In: Beauguitte P, Duggan B, Kelleher J, editors. 6th International Workshop on Folk Music Analysis; 2016 Jun 15-17; Dublin, Ireland. [place unknown]: International Workshop on Folk Music Analysis; 2016. p. 88-91.

Permanent link

Description

Abstract
In this study we propose a modification of a standard text-to-speech alignment approach so that it can be applied to the alignment of lyrics and singing voice. We model phoneme durations by means of a duration-explicit hidden Markov model (DHMM) phonetic recognizer based on MFCCs. The phoneme durations are set empirically in a probabilistic way, based on prior knowledge about the lyrics structure and metric principles specific to the Beijing opera music tradition. Phoneme models are GMMs trained directly on a small corpus of annotated singing voice. The alignment is evaluated on a cappella material from Beijing opera, which is characterized by its particularly long syllable durations. Results show that incorporating music-specific knowledge yields a very high alignment accuracy, significantly outperforming a baseline HMM-based approach.
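The duration-explicit decoding idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `align_with_durations`, the Gaussian-shaped duration score, and the parameters `mean_dur` and `std_dur` are assumptions made only for illustration, and the frame log-likelihoods are taken as given (in the paper they would come from GMM phoneme models over MFCCs).

```python
import numpy as np

def align_with_durations(loglik, mean_dur, std_dur):
    """Hypothetical duration-explicit forced alignment.

    loglik:   (T, N) array, log-likelihood of each of T frames under each of
              the N phonemes, listed in the order they occur in the lyrics.
    mean_dur: expected duration of each phoneme, in frames (assumed prior).
    std_dur:  spread of each phoneme duration, in frames (assumed prior).
    Returns a list of N (start, end) frame spans, end exclusive.
    """
    T, N = loglik.shape
    # cum[t, n] holds the sum of loglik[:t, n], so any segment sum is a difference.
    cum = np.vstack([np.zeros((1, N)), np.cumsum(loglik, axis=0)])
    # best[n, t]: best score with the first n phonemes covering frames [0, t).
    best = np.full((N + 1, T + 1), -np.inf)
    back = np.zeros((N + 1, T + 1), dtype=int)
    best[0, 0] = 0.0
    for n in range(1, N + 1):
        for t in range(n, T + 1):
            # Candidate durations that leave at least one frame per earlier phoneme.
            d = np.arange(1, t - n + 2)
            segment = cum[t, n - 1] - cum[t - d, n - 1]
            # Unnormalised Gaussian log-score for the duration (an assumption).
            dur_score = -0.5 * ((d - mean_dur[n - 1]) / std_dur[n - 1]) ** 2
            score = best[n - 1, t - d] + segment + dur_score
            k = int(np.argmax(score))
            best[n, t] = score[k]
            back[n, t] = d[k]
    # Trace the chosen durations back from the last frame.
    spans, t = [], T
    for n in range(N, 0, -1):
        d = int(back[n, t])
        spans.append((t - d, t))
        t -= d
    return spans[::-1]
```

Unlike a standard HMM, whose self-transitions imply geometrically decaying durations, this formulation scores each candidate segment length directly, which is one way to keep a very long sung vowel from being cut short.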
Description
Paper presented at the 6th International Workshop on Folk Music Analysis, held on 15-17 June 2016 in Dublin, Ireland.
Collections
Conferences (Department of Information and Communication Technologies)
OpenAIRE Documents (Open Access Infrastructure for Research in Europe)
