End-to-end learning for music audio tagging at scale

Pons Puig, Jordi; Nieto Caballero, Oriol; Prockup, Matthew; Schmidt, Erik M.; Ehmann, Andreas F.; Serra, Xavier


dc.contributor.author Pons Puig, Jordi
dc.contributor.author Nieto Caballero, Oriol
dc.contributor.author Prockup, Matthew
dc.contributor.author Schmidt, Erik M.
dc.contributor.author Ehmann, Andreas F.
dc.contributor.author Serra, Xavier
dc.date.accessioned 2019-05-13T11:49:50Z
dc.date.available 2019-05-13T11:49:50Z
dc.date.issued 2017
dc.description Paper presented at: Workshop Machine Learning for Audio Signal Processing at NIPS 2017 (ML4Audio@NIPS17), held 4-9 December 2017 in Long Beach, California.
dc.description.abstract The lack of data tends to limit the outcomes of deep learning research, especially when dealing with end-to-end learning stacks that process raw data such as waveforms. In this study we make use of musical labels annotated for 1.2 million tracks. This large amount of data allows us to explore different front-end paradigms without restriction: from assumption-free models, which take waveforms as input and use very small convolutional filters, to models that rely on domain knowledge, which take log-mel spectrograms as input and use a convolutional neural network designed to learn temporal and timbral features. Results suggest that while spectrogram-based models surpass their waveform-based counterparts, the difference in performance shrinks as more data are employed.
dc.description.sponsorship This work is partially supported by the Maria de Maeztu Programme (MDM-2015-0502).
dc.format.mimetype application/pdf
dc.identifier.citation Pons J, Nieto O, Prockup M, Schmidt EM, Ehmann AF, Serra X. End-to-end learning for music audio tagging at scale. Paper presented at: Workshop Machine Learning for Audio Signal Processing at NIPS 2017 (ML4Audio@NIPS17); 2017 Dec 4-9; Long Beach, CA. [Copenhagen]: Sound & Music Computing; 2017. 5 p.
dc.identifier.uri http://hdl.handle.net/10230/37217
dc.language.iso eng
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.title End-to-end learning for music audio tagging at scale
dc.type info:eu-repo/semantics/conferenceObject
dc.type.version info:eu-repo/semantics/publishedVersion

Collections

Conferences (Department of Information and Communication Technologies)
