Voice assignment in vocal quartets using deep learning models based on pitch salience


  • dc.contributor.author Cuesta, Helena
  • dc.contributor.author Gómez Gutiérrez, Emilia, 1975-
  • dc.date.accessioned 2023-01-20T07:52:15Z
  • dc.date.available 2023-01-20T07:52:15Z
  • dc.date.issued 2022
  • dc.description.abstract This paper deals with the automatic transcription of audio performances of four-part a cappella singing. In particular, we exploit an existing deep-learning-based multiple-F0 estimation method and complement it with two neural network architectures for voice assignment (VA) in order to create a music transcription system that converts an input audio mixture into four pitch contours. To train our VA models, we create a novel synthetic dataset by collecting 5381 choral music scores from public-domain music archives, which we make publicly available for further research. We compare the performance of the proposed VA models on different types of input data, as well as against a hidden Markov model-based baseline system. In addition, we assess the generalization capabilities of these models on audio recordings with differing pitch distributions and vocal music styles. Our experiments show that the two proposed models, a CNN and a ConvLSTM, perform very similarly, and both outperform the baseline HMM-based system. We also observe a high confusion rate between the alto and tenor voice parts, which commonly have overlapping pitch ranges, while the bass voice obtains the highest scores in all evaluated scenarios. (An illustrative code sketch of the voice-assignment idea follows the record below.)
  • dc.description.sponsorship This work is partially supported by the European Commission under the TROMPA project (H2020 770376), the Spanish Ministry of Science and Innovation under the Musical AI project (PID2019-111403GB-I00), and by AGAUR (Generalitat de Catalunya) through an FI Predoctoral Grant (2018FI-B01015).
  • dc.format.mimetype application/pdf
  • dc.identifier.citation Cuesta H, Gómez E. Voice assignment in vocal quartets using deep learning models based on pitch salience. Transactions of the International Society for Music Information Retrieval. 2022;5(1):99-112. DOI: 10.5334/tismir.121
  • dc.identifier.doi http://dx.doi.org/10.5334/tismir.121
  • dc.identifier.issn 2514-3298
  • dc.identifier.uri http://hdl.handle.net/10230/55358
  • dc.language.iso eng
  • dc.publisher Ubiquity Press
  • dc.relation.ispartof Transactions of the International Society for Music Information Retrieval. 2022;5(1):99-112.
  • dc.relation.isreferencedby https://github.com/helenacuesta/voas-vocal-quartets
  • dc.relation.isreferencedby https://doi.org/10.5334/tismir.121.s1
  • dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/770376
  • dc.relation.projectID info:eu-repo/grantAgreement/ES/2PE/PID2019-111403GB-I00
  • dc.rights © 2022 The Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See http://creativecommons.org/licenses/by/4.0/.
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.rights.uri http://creativecommons.org/licenses/by/4.0/
  • dc.subject.keyword voice assignment
  • dc.subject.keyword multi-pitch estimation
  • dc.subject.keyword music information retrieval
  • dc.subject.keyword vocal quartets
  • dc.subject.keyword polyphonic vocal music
  • dc.subject.keyword deep learning
  • dc.title Voice assignment in vocal quartets using deep learning models based on pitch salience
  • dc.type info:eu-repo/semantics/article
  • dc.type.version info:eu-repo/semantics/publishedVersion
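
To make the pipeline described in the abstract more concrete, below is a minimal, hypothetical Python/Keras sketch of the voice-assignment step: a small CNN that maps a multi-pitch salience representation (time x pitch bins) to four per-voice salience maps (SATB), from which one F0 contour per voice is decoded. All layer sizes, variable names, and the decoding threshold are illustrative assumptions, not the authors' actual architecture; the real implementation is in the linked GitHub repository (https://github.com/helenacuesta/voas-vocal-quartets).

```python
# Hypothetical sketch of pitch-salience-based voice assignment.
# Input: a salience map (time x pitch bins); output: four per-voice
# salience maps, one per SATB part. Architecture details are
# illustrative only, not the paper's actual CNN or ConvLSTM.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

N_TIME, N_BINS = 128, 360  # assumed patch size: 128 frames x 360 pitch bins

inp = layers.Input(shape=(N_TIME, N_BINS, 1), name="salience")
x = layers.Conv2D(32, (5, 5), padding="same", activation="relu")(inp)
x = layers.Conv2D(32, (5, 5), padding="same", activation="relu")(x)
x = layers.Conv2D(16, (3, 3), padding="same", activation="relu")(x)
# One sigmoid channel per voice part: soprano, alto, tenor, bass.
out = layers.Conv2D(4, (1, 1), padding="same", activation="sigmoid",
                    name="voices")(x)
model = Model(inp, out)
model.compile(optimizer="adam", loss="binary_crossentropy")

def contours(voice_maps, threshold=0.3):
    """Decode each voice channel to an F0 contour: per frame, keep the
    most salient pitch bin, or -1 (unvoiced) if its peak salience is
    below an assumed threshold.

    voice_maps: (time, bins, 4) array -> (4, time) array of bin indices.
    """
    best = voice_maps.argmax(axis=1)   # (time, 4): best bin per frame/voice
    peak = voice_maps.max(axis=1)      # (time, 4): salience of that bin
    best[peak < threshold] = -1        # mark weak frames as unvoiced
    return best.T                      # (4, time): one contour per voice

# Usage on a random (untrained) example, just to show the shapes.
demo = model.predict(np.random.rand(1, N_TIME, N_BINS, 1), verbose=0)[0]
print(contours(demo).shape)  # (4, 128): four pitch-bin contours
```

The design choice sketched here, predicting one output channel per voice from a shared salience input, mirrors the paper's framing of VA as turning one multi-pitch representation into four pitch contours; the paper's ConvLSTM variant would additionally model temporal continuity across frames.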