Carnatic singing voice separation using cold diffusion on training data with bleeding

Plaja-Roglans, Genís; Miron, Marius; Shankar, Adithi; Serra, Xavier

dc.contributor.author Plaja-Roglans, Genís
dc.contributor.author Miron, Marius
dc.contributor.author Shankar, Adithi
dc.contributor.author Serra, Xavier
dc.date.accessioned 2023-10-30T16:47:10Z
dc.date.available 2023-10-30T16:47:10Z
dc.date.issued 2023-10-30
dc.description This work has been accepted at the 24th International Society for Music Information Retrieval Conference (ISMIR 2023), held in Milan, Italy, November 5-9, 2023.
dc.description.abstract Supervised music source separation systems using deep learning are trained by minimizing a loss function between pairs of predicted separations and ground-truth isolated sources. However, open datasets comprising isolated sources are few, small, and restricted to a few music styles. At the same time, multi-track datasets with source bleeding are usually larger and easier to compile. In this work, we address the task of singing voice separation when the ground-truth signals have bleeding and only the target vocals and the corresponding mixture are available. We train a cold diffusion model in the frequency domain to iteratively transform a mixture into the corresponding vocals with bleeding. Next, we build the final separation masks by clustering spectrogram bins according to their evolution along the transformation steps. We test our approach on a Carnatic music scenario for which only datasets with bleeding exist, while current research on this repertoire commonly uses source separation models trained solely on Western commercial music. Our evaluation on a Carnatic test set shows that our system improves on Spleeter in interference removal and is competitive in terms of signal distortion. The code is open source.
dc.description.sponsorship This work was carried out under the project Musical AI - PID2019-111403GB-I00/AEI/10.13039/501100011033, funded by the Spanish Ministerio de Ciencia, Innovación y Universidades (MCIU) and the Agencia Estatal de Investigación (AEI).
dc.format.mimetype application/pdf
dc.identifier.uri http://hdl.handle.net/10230/58188
dc.language.iso eng
dc.relation.projectID info:eu-repo/grantAgreement/ES/2PE/PID2019-111403GB-I00
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.rights.uri https://creativecommons.org/licenses/by/4.0
dc.title Carnatic singing voice separation using cold diffusion on training data with bleeding
dc.type info:eu-repo/semantics/preprint
dc.type.version info:eu-repo/semantics/submittedVersion

Collections

Reports (Department of Information and Communication Technologies)