Multi-label music genre classification from audio, text and images using deep features

Oramas, Sergio; Nieto Caballero, Oriol; Barbieri, Francesco; Serra, Xavier

dc.contributor.author Oramas, Sergio
dc.contributor.author Nieto Caballero, Oriol
dc.contributor.author Barbieri, Francesco
dc.contributor.author Serra, Xavier
dc.date.accessioned 2018-02-26T11:17:51Z
dc.date.available 2018-02-26T11:17:51Z
dc.date.issued 2017
dc.description Paper presented at ISMIR 2017: 18th International Society for Music Information Retrieval Conference, held 23-27 October 2017 in Suzhou, China.
dc.description.abstract Music genres allow to categorize musical items that share common characteristics. Although these categories are not mutually exclusive, most related research is traditionally focused on classifying tracks into a single class. Furthermore, these categories (e.g., Pop, Rock) tend to be too broad for certain applications. In this work we aim to expand this task by categorizing musical items into multiple and fine-grained labels, using three different data modalities: audio, text, and images. To this end we present MuMu, a new dataset of more than 31k albums classified into 250 genre classes. For every album we have collected the cover image, text reviews, and audio tracks. Additionally, we propose an approach for multi-label genre classification based on the combination of feature embeddings learned with state-of-the-art deep learning methodologies. Experiments show major differences between modalities, which not only introduce new baselines for multi-label genre classification, but also suggest that combining them yields improved results.
dc.description.sponsorship This work was partially funded by the Spanish Ministry of Economy and Competitiveness under the Maria de Maeztu Units of Excellence Programme (MDM-2015-0502).
dc.format.mimetype application/pdf
dc.identifier.citation Oramas S, Nieto O, Barbieri F, Serra X. Multi-label music genre classification from audio, text and images using deep features. In: Hu X, Cunningham SJ, Turnbull D, Duan Z. ISMIR 2017. 18th International Society for Music Information Retrieval Conference; 2017 Oct 23-27; Suzhou, China. [Canada]: ISMIR; 2017. p. 23-30.
dc.identifier.uri http://hdl.handle.net/10230/33999
dc.language.iso eng
dc.publisher International Society for Music Information Retrieval (ISMIR)
dc.relation.ispartof Hu X, Cunningham SJ, Turnbull D, Duan Z. ISMIR 2017. 18th International Society for Music Information Retrieval Conference; 2017 Oct 23-27; Suzhou, China. [Canada]: ISMIR; 2017. p. 23-30.
dc.rights © Sergio Oramas, Oriol Nieto, Francesco Barbieri, Xavier Serra. Licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). Attribution: Sergio Oramas, Oriol Nieto, Francesco Barbieri, Xavier Serra. “Multi-label Music Genre Classification from audio, text, and images using Deep Features”, 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017.
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.rights.uri http://creativecommons.org/licenses/by/4.0/
dc.subject.other Musical forms
dc.subject.other Automatic classification
dc.title Multi-label music genre classification from audio, text and images using deep features
dc.type info:eu-repo/semantics/conferenceObject
dc.type.version info:eu-repo/semantics/publishedVersion

Collections

Conferences (Department of Information and Communication Technologies)