Discogs-VI: a musical version identification dataset based on public editorial metadata

  • dc.contributor.author Araz, Recep Oguz
  • dc.contributor.author Serra, Xavier
  • dc.contributor.author Bogdanov, Dmitry
  • dc.date.accessioned 2025-02-25T07:25:21Z
  • dc.date.available 2025-02-25T07:25:21Z
  • dc.date.issued 2024
  • dc.description.abstract Current version identification (VI) datasets often lack sufficient size and musical diversity to train robust neural networks (NNs). Additionally, their non-representative clique size distributions prevent realistic system evaluations. To address these challenges, we explore the untapped potential of the rich editorial metadata in the Discogs music database and create a large dataset of musical versions containing about 1,900,000 versions across 348,000 cliques. Utilizing a high-precision search algorithm, we map this dataset to official music uploads on YouTube, resulting in a dataset of approximately 493,000 versions across 98,000 cliques. This dataset offers over nine times as many cliques and over four times as many versions as existing datasets. We demonstrate the utility of our dataset by training a baseline NN without extensive model complexities or data augmentations, which achieves competitive results on the SHS100K and Da-TACOS datasets. Our dataset, the tools used for its creation, the extracted audio features, and a trained model are all publicly available online.
  • dc.description.sponsorship This work is supported by “IA y Música: Cátedra en Inteligencia Artificial y Música” (TSI-100929-2023-1) funded by the Secretaría de Estado de Digitalización e Inteligencia Artificial and the European Union-Next Generation EU, under the program Cátedras ENIA.
  • dc.format.mimetype application/pdf
  • dc.identifier.citation Araz RO, Bogdanov D, Serra X. Discogs-VI: a musical version identification dataset based on public editorial metadata. In: Kaneshiro B, Mysore G, Nieto O, Donahue C, Huang CZA, Lee JH, McFee B, McCallum M, editors. Proceedings of the 25th International Society for Music Information Retrieval Conference (ISMIR2024); 2024 November 10-14; San Francisco, USA. p. 478-85. DOI: https://doi.org/10.5281/zenodo.14877379
  • dc.identifier.doi https://doi.org/10.5281/zenodo.14877379
  • dc.identifier.uri http://hdl.handle.net/10230/69723
  • dc.language.iso eng
  • dc.publisher International Society for Music Information Retrieval (ISMIR)
  • dc.rights © R. O. Araz, X. Serra, and D. Bogdanov. Licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). Attribution: R. O. Araz, X. Serra, and D. Bogdanov, “Discogs-VI: A Musical Version Identification Dataset Based on Public Editorial Metadata”, in Proc. of the 25th Int. Society for Music Information Retrieval Conf., San Francisco, United States, 2024
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.rights.uri http://creativecommons.org/licenses/by/4.0/
  • dc.subject.keyword Discogs-VI
  • dc.subject.keyword musical dataset
  • dc.subject.keyword Metadata
  • dc.title Discogs-VI: a musical version identification dataset based on public editorial metadata
  • dc.type info:eu-repo/semantics/conferenceObject
  • dc.type.version info:eu-repo/semantics/acceptedVersion