End-to-end music source separation: is it possible in the waveform domain?

Citation

  • Lluís F, Pons J, Serra X. End-to-end music source separation: is it possible in the waveform domain? In: INTERSPEECH 2019: Proceedings of the Annual Conference of the International Speech Communication Association; 2019 Sep 15-19; Graz, Austria. Baixas: ISCA; 2019. p. 4619-23. DOI: 10.21437/Interspeech.2019-1177

Permanent link

Description

  • Abstract

    Most currently successful source separation techniques take the magnitude spectrogram as input and therefore omit part of the signal by default: the phase. To avoid omitting potentially useful information, we study the viability of end-to-end models for music source separation, which take into account all the information available in the raw audio signal, including the phase. Although end-to-end music source separation has been considered almost unattainable over the last decades, our results confirm that waveform-based models can perform similarly to (if not better than) a spectrogram-based deep learning model. Namely, a Wavenet-based model we propose and Wave-U-Net can outperform DeepConvSep, a recent spectrogram-based deep learning model.
  • Description

    Paper presented at INTERSPEECH 2019: The Annual Conference of the International Speech Communication Association, held September 15-19, 2019, in Graz, Austria.