A Wavenet for speech denoising
- dc.contributor.author Rethage, Dario
- dc.contributor.author Pons Puig, Jordi
- dc.contributor.author Serra, Xavier
- dc.date.accessioned 2018-10-26T08:56:58Z
- dc.date.available 2018-10-26T08:56:58Z
- dc.date.issued 2018
- dc.description Paper presented at the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), held in Calgary (Canada), 15-20 April 2018.
- dc.description.abstract Most speech processing techniques use magnitude spectrograms as front-end and are therefore by default discarding part of the signal: the phase. In order to overcome this limitation, we propose an end-to-end learning method for speech denoising based on Wavenet. The proposed model adaptation retains Wavenet's powerful acoustic modeling capabilities, while significantly reducing its time complexity by eliminating its autoregressive nature. Specifically, the model makes use of non-causal, dilated convolutions and predicts target fields instead of a single target sample. The discriminative adaptation of the model we propose learns in a supervised fashion by minimizing a regression loss. These modifications make the model highly parallelizable during both training and inference. Both quantitative and qualitative evaluations indicate that the proposed method is preferred over Wiener filtering, a common method based on processing the magnitude spectrogram.
- dc.description.sponsorship This work is partially supported by the Maria de Maeztu Programme (MDM-2015-0502).
- dc.format.mimetype application/pdf
- dc.identifier.citation Rethage D, Pons J, Serra X. A Wavenet for speech denoising. In: Proceedings of the 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing; 2018 Apr 15-20; Calgary, Canada. Piscataway: IEEE; 2018. p. 5069-73. DOI: 10.1109/ICASSP.2018.8462417
- dc.identifier.doi http://dx.doi.org/10.1109/ICASSP.2018.8462417
- dc.identifier.issn 2379-190X
- dc.identifier.uri http://hdl.handle.net/10230/35669
- dc.language.iso eng
- dc.publisher Institute of Electrical and Electronics Engineers (IEEE)
- dc.relation.ispartof Proceedings of the 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing; 2018 Apr 15-20; Calgary, Canada. Piscataway: IEEE; 2018.
- dc.relation.isreferencedby https://github.com/drethage/speech-denoising-wavenet
- dc.rights © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The final published article can be found at https://ieeexplore.ieee.org/document/8462417
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.title A Wavenet for speech denoising
- dc.type info:eu-repo/semantics/conferenceObject
- dc.type.version info:eu-repo/semantics/acceptedVersion
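
The abstract mentions non-causal dilated convolutions and the prediction of target fields rather than single samples. The following is a minimal pure-Python sketch of those two ideas, not the authors' implementation (which is linked under dc.relation.isreferencedby). The function names, the kernel size of 3, and the dilation pattern 1, 2, 4 are assumptions chosen only for illustration.

```python
def receptive_field(kernel_size, dilations):
    """Number of input samples that influence one output sample
    after stacking one dilated convolution per entry in `dilations`."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def input_length_for_target_field(target_field, kernel_size, dilations):
    """Input length needed so that `target_field` output samples
    are produced in a single forward pass (no padding)."""
    return target_field + receptive_field(kernel_size, dilations) - 1


def noncausal_dilated_conv(signal, kernel, dilation):
    """Centered (non-causal) dilated 1-D convolution without padding.

    Each output sample depends on past AND future input samples,
    spaced `dilation` apart. Assumes an odd kernel length.
    """
    k = len(kernel)
    half = (k - 1) // 2
    span = half * dilation  # samples lost at each edge
    return [
        sum(kernel[j] * signal[t + (j - half) * dilation] for j in range(k))
        for t in range(span, len(signal) - span)
    ]


def mean_absolute_error(prediction, target):
    """A simple regression loss over a whole target field."""
    return sum(abs(p - q) for p, q in zip(prediction, target)) / len(target)


# Hypothetical configuration: kernel size 3, dilations 1, 2, 4.
dilations = [1, 2, 4]
n = input_length_for_target_field(4, 3, dilations)
x = list(range(n))

# Stack the layers; an identity kernel makes the cropping easy to see.
y = x
for d in dilations:
    y = noncausal_dilated_conv(y, [0, 1, 0], d)
```

Because every sample in the target field is computed from the same input window in one pass, no output has to be fed back into the model, which is what makes this kind of non-autoregressive design parallelizable.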