Designing efficient architectures for modeling temporal features with convolutional neural networks
Citation
- Pons J, Serra X. Designing efficient architectures for modeling temporal features with convolutional neural networks. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2017 Mar 5-9; New Orleans, LA. Piscataway (NJ): IEEE; 2017. p. 2472-6. DOI: 10.1109/ICASSP.2017.7952601
Abstract
Many researchers use convolutional neural networks with small rectangular filters for music (spectrogram) classification. First, we discuss why there is no reason to use this filter setup by default; second, we point out that more efficient architectures could be implemented if the characteristics of the music features are considered during the design process. Specifically, we propose a novel design strategy that might promote more expressive and intuitive deep learning architectures by efficiently exploiting the representational capacity of the first layer, using different filter shapes adapted to fit musical concepts within the first layer. The proposed architectures are assessed by measuring their accuracy in predicting the classes of the Ballroom dataset. We also make the code used available (together with the audio data) so that this research is fully reproducible.
Description
Paper presented at the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), held March 5-9, 2017, in New Orleans, Louisiana (USA).
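As a rough illustration of the design idea summarized in the abstract, the sketch below applies first-layer filters of different shapes to a toy spectrogram and compares their output sizes and weight counts. It is a minimal sketch under stated assumptions, not the authors' implementation: the helper names (`conv2d_valid`, `first_layer`), the toy input, the averaging kernels, and the specific filter sizes (3x3, 1x9, 8x1) are all hypothetical and chosen only to contrast a small rectangular filter with a temporal (wide) and a frequency (tall) filter.

```python
def conv2d_valid(spec, kernel):
    """Valid 2-D cross-correlation of a (frequency x time) list of lists with a kernel."""
    n_freq, n_time = len(spec), len(spec[0])
    k_freq, k_time = len(kernel), len(kernel[0])
    out = []
    for f in range(n_freq - k_freq + 1):
        row = []
        for t in range(n_time - k_time + 1):
            acc = 0.0
            for i in range(k_freq):
                for j in range(k_time):
                    acc += spec[f + i][t + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


def first_layer(spec, filter_shapes):
    """Apply one averaging filter per shape and report output shape and weight count.

    Averaging kernels stand in for learned weights (an assumption for illustration).
    """
    summary = {}
    for k_freq, k_time in filter_shapes:
        kernel = [[1.0 / (k_freq * k_time)] * k_time for _ in range(k_freq)]
        fmap = conv2d_valid(spec, kernel)
        summary[(k_freq, k_time)] = {
            "output_shape": (len(fmap), len(fmap[0])),
            "weights": k_freq * k_time,
        }
    return summary


# Toy spectrogram: 8 frequency bins x 16 time frames (hypothetical data).
spec = [[float((f + t) % 4) for t in range(16)] for f in range(8)]

# Hypothetical filter shapes as (frequency, time):
# small square, temporal (spans time only), frequency (spans all bins).
shapes = [(3, 3), (1, 9), (8, 1)]
summary = first_layer(spec, shapes)

for shape, info in summary.items():
    print(shape, info)
```

With a comparable number of weights, the 1x9 filter covers nine consecutive time frames in a single frequency bin, while the 8x1 filter covers every frequency bin in a single frame. This is the kind of trade-off the abstract refers to when it proposes adapting first-layer filter shapes to musical concepts.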