Deep learning based source separation applied to choir ensembles


  • dc.contributor.author Petermann, Darius
  • dc.contributor.author Chandna, Pritish
  • dc.contributor.author Cuesta, Helena
  • dc.contributor.author Bonada, Jordi, 1973-
  • dc.contributor.author Gómez Gutiérrez, Emilia, 1975-
  • dc.date.accessioned 2020-11-11T07:23:58Z
  • dc.date.available 2020-11-11T07:23:58Z
  • dc.date.issued 2020
  • dc.description Paper presented at: International Society for Music Information Retrieval Conference, held online, October 11-16, 2020.
  • dc.description.abstract Choral singing is a widely practiced form of ensemble singing in which a group of people sing simultaneously in polyphonic harmony. The most common setting for choir ensembles consists of four parts: Soprano, Alto, Tenor and Bass (SATB), each with its own range of fundamental frequencies (F0s). The task of source separation for this choral setting entails separating the SATB mixture into its constituent parts. Source separation for musical mixtures is well studied, and many deep learning based methodologies have been proposed for it. However, most of the research has focused on the typical case of separating vocal, percussion and bass sources from a mixture, each of which has a distinct spectral structure. In contrast, the simultaneous and harmonic nature of ensemble singing leads to high structural similarity and overlap between the spectral components of the sources in a choral mixture, making source separation for choirs a harder task than the typical case. This, along with the lack of an appropriate consolidated dataset, has led to a dearth of research in the field so far. In this paper we first assess how well some recently developed methodologies for musical source separation perform for the case of SATB choirs. We then propose a novel domain-specific adaptation for conditioning the recently proposed U-Net architecture for musical source separation on the fundamental frequency contour of each of the singing groups, and demonstrate that our proposed approach surpasses results from domain-agnostic architectures.
  • dc.description.sponsorship The TITANX used for this research was donated by the NVIDIA Corporation. This work is partially supported by the Towards Richer Online Music Public-domain Archives (TROMPA H2020 770376) project. Helena Cuesta is supported by the FI Predoctoral Grant from AGAUR (Generalitat de Catalunya). The authors would like to thank Rodrigo Schramm and Emmanouil Benetos for sharing their singing voice datasets for this research.
  • dc.format.mimetype application/pdf
  • dc.identifier.citation Petermann D, Chandna P, Cuesta H, Bonada J, Gómez E. Deep learning based source separation applied to choir ensembles. In: Cumming J, Ha Lee J, McFee B, Schedl M, Devaney J, McKay C, Zangerle E, de Reuse T, editors. Proceedings of the 21st International Society for Music Information Retrieval Conference; 2020 Oct 11-16; Montréal, Canada. [Canada]: ISMIR; 2020. p. 733-9.
  • dc.identifier.uri http://hdl.handle.net/10230/45713
  • dc.language.iso eng
  • dc.publisher International Society for Music Information Retrieval (ISMIR)
  • dc.relation.ispartof Cumming J, Ha Lee J, McFee B, Schedl M, Devaney J, McKay C, Zangerle E, de Reuse T, editors. Proceedings of the 21st International Society for Music Information Retrieval Conference; 2020 Oct 11-16; Montréal, Canada. [Canada]: ISMIR; 2020. p. 733-9
  • dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/770376
  • dc.rights © D. Petermann, P. Chandna, H. Cuesta, J. Bonada, and E. Gómez. Licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). Attribution: D. Petermann, P. Chandna, H. Cuesta, J. Bonada, and E. Gómez, “Deep Learning Based Source Separation Applied To Choir Ensembles”, in Proc. of the 21st Int. Society for Music Information Retrieval Conf., Montréal, Canada, 2020.
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.rights.uri https://creativecommons.org/licenses/by/4.0/
  • dc.title Deep learning based source separation applied to choir ensembles
  • dc.type info:eu-repo/semantics/conferenceObject
  • dc.type.version info:eu-repo/semantics/publishedVersion
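The conditioning idea described in the abstract can be illustrated with a minimal sketch: encode a voice part's F0 contour as a soft time-frequency saliency map and stack it with the mixture magnitude spectrogram as an extra input channel for a U-Net-style separator. The function names, the Gaussian encoding, and the channel-stacking scheme below are illustrative assumptions, not the paper's actual conditioning mechanism.

```python
import numpy as np

def f0_to_saliency(f0_hz, n_bins=256, fmin=55.0, bins_per_octave=48, width=2.0):
    """Map an F0 contour (one Hz value per frame) to a soft time-frequency
    saliency map on a log-frequency axis.

    Frames with f0 <= 0 are treated as unvoiced and yield an all-zero column.
    Illustrative encoding only; the paper's exact scheme may differ.
    """
    n_frames = len(f0_hz)
    saliency = np.zeros((n_bins, n_frames))
    bins = np.arange(n_bins)
    for t, f0 in enumerate(f0_hz):
        if f0 <= 0:
            continue  # unvoiced frame: no conditioning energy
        center = bins_per_octave * np.log2(f0 / fmin)
        # Gaussian bump centered on the F0's log-frequency bin
        saliency[:, t] = np.exp(-0.5 * ((bins - center) / width) ** 2)
    return saliency

def conditioned_input(mix_mag, f0_hz, **kwargs):
    """Stack the mixture magnitude spectrogram with the F0 saliency map,
    giving a 2-channel (spectrogram, conditioning) network input."""
    sal = f0_to_saliency(f0_hz, n_bins=mix_mag.shape[0], **kwargs)
    return np.stack([mix_mag, sal], axis=0)  # shape: (2, n_bins, n_frames)
```

In a full system, one such conditioned input would be built per voice part (S, A, T, B) from its estimated F0 contour, letting a single separator network target whichever part the conditioning channel indicates.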