Authors: Won, Minz; Chun, Sanghyuk; Nieto Caballero, Oriol
Dates: 2020-04-20; 2020
Citation: Won M, Chun S, Nieto O, Serra X. Data-driven harmonic filters for audio representation learning. In: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing Proceedings; 2020 May 4-8; Barcelona, Spain. [New York]: IEEE; 2020. p. 536-40. DOI: 10.1109/ICASSP40776.2020.9053669
ISBN: 978-1-5090-6631-5
ISSN: 2379-190X
Handle: http://hdl.handle.net/10230/44278
Note: Paper presented at ICASSP 2020, the IEEE International Conference on Acoustics, Speech, and Signal Processing, held online May 4-8, 2020.
Abstract: We introduce a trainable front-end module for audio representation learning that exploits the inherent harmonic structure of audio signals. The proposed architecture, composed of a set of filters, compels the subsequent network to capture harmonic relations while preserving spectro-temporal locality. Since the harmonic structure is known to have a key role in human auditory perception, one can expect these harmonic filters to yield more efficient audio representations. Experimental results show that a simple convolutional neural network back-end with the proposed front-end outperforms state-of-the-art baseline methods in automatic music tagging, keyword spotting, and sound event tagging tasks.
Format: application/pdf
Language: eng
Rights: © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. http://dx.doi.org/10.1109/ICASSP40776.2020.9053669
Title: Data-driven harmonic filters for audio representation learning
Type: info:eu-repo/semantics/conferenceObject
Related resource: http://dx.doi.org/10.1109/ICASSP40776.2020.9053669
Keywords: Harmonic filters; Audio representation learning; Deep learning
Access rights: info:eu-repo/semantics/openAccess