Neural percussive synthesis parameterised by high-level timbral features

Citation

Ramires A, Chandna P, Favory X, Gómez E, Serra X. Neural percussive synthesis parameterised by high-level timbral features. In: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP); 2020 May 4-8; Barcelona, Spain. New Jersey: The Institute of Electrical and Electronics Engineers; 2020. p. 786-90. DOI: 10.1109/ICASSP40776.2020.9053128

Permanent link

Description

Abstract
We present a deep neural network-based methodology for synthesising percussive sounds with control over high-level timbral characteristics of the sounds. This approach allows for intuitive control of a synthesiser, enabling the user to shape sounds without extensive knowledge of signal processing. We use a feedforward convolutional neural network-based architecture, which is able to map input parameters to the corresponding waveform. We propose two datasets to evaluate our approach both in a restrictive context and in one covering a broader spectrum of sounds. The timbral features used as parameters are taken from recent literature in signal processing. We also use these features for evaluation and validation of the presented model, to ensure that changing the input parameters produces a congruent waveform with the desired characteristics. Finally, we evaluate the quality of the output sound using a subjective listening test. We provide sound examples and the system's source code for reproducibility.
Description
Paper presented at: ICASSP 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, held online from 4 to 8 May 2020.
DOI
http://dx.doi.org/10.1109/ICASSP40776.2020.9053128
Collections
Conferences (Department of Information and Communication Technologies)
OpenAIRE Documents (Open Access Infrastructure for Research in Europe)

Files