Data-driven harmonic filters for audio representation learning
Citation
- Won M, Chun S, Nieto O, Serra X. Data-driven harmonic filters for audio representation learning. In: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing Proceedings; 2020 May 4-8; Barcelona, Spain. [New York]: IEEE; 2020. p. 536-40. DOI: 10.1109/ICASSP40776.2020.9053669
Permanent link
Description
Abstract
We introduce a trainable front-end module for audio representation learning that exploits the inherent harmonic structure of audio signals. The proposed architecture, composed of a set of filters, compels the subsequent network to capture harmonic relations while preserving spectro-temporal locality. Since the harmonic structure is known to play a key role in human auditory perception, one can expect these harmonic filters to yield more efficient audio representations. Experimental results show that a simple convolutional neural network back-end with the proposed front-end outperforms state-of-the-art baseline methods in automatic music tagging, keyword spotting, and sound event tagging tasks.
Description
Paper presented at: ICASSP 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, held online May 4-8, 2020.
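The front-end described in the abstract stacks band-pass filter outputs at harmonically related center frequencies so that the back-end network sees harmonic relations explicitly. The following is a minimal NumPy sketch of that idea, not the paper's implementation: here the fundamentals `f0s` and the triangular filter shape are fixed, whereas in the paper the filter parameters are learnable.

```python
import numpy as np

def triangular_filter(freqs, center, bandwidth):
    # Triangular band-pass response centered at `center` (Hz),
    # reaching zero at `center +/- bandwidth`.
    return np.clip(1.0 - np.abs(freqs - center) / bandwidth, 0.0, None)

def harmonic_filterbank(spec, sr=16000, n_harmonics=4, f0s=None, Q=4.0):
    """Project a magnitude spectrogram (freq_bins x time) onto triangular
    band-pass filters placed at harmonics h * f0, producing a
    (harmonic, band, time) tensor that preserves spectro-temporal locality.

    `f0s`, `Q`, and the triangular shape are illustrative assumptions;
    in the paper these filter parameters are trained with the network."""
    n_bins, n_frames = spec.shape
    freqs = np.linspace(0.0, sr / 2, n_bins)  # linear FFT bin frequencies
    if f0s is None:
        # Log-spaced fundamentals covering roughly A1 to A6 (assumed values).
        f0s = np.geomspace(55.0, 1760.0, num=32)
    out = np.zeros((n_harmonics, len(f0s), n_frames))
    for h in range(1, n_harmonics + 1):
        for i, f0 in enumerate(f0s):
            center = h * f0
            if center >= sr / 2:
                continue  # harmonic falls above Nyquist; leave channel at zero
            weights = triangular_filter(freqs, center, center / Q)
            out[h - 1, i] = weights @ spec  # filter response per time frame
    return out

# Toy usage: a random 257-bin, 10-frame magnitude spectrogram.
spec = np.abs(np.random.randn(257, 10))
H = harmonic_filterbank(spec)
print(H.shape)  # (n_harmonics, n_bands, n_frames) = (4, 32, 10)
```

The resulting tensor treats the harmonic index as a channel dimension, so an ordinary 2-D convolutional back-end can learn relations across harmonics of the same fundamental.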