Transfer learning from speech to music: towards language-sensitive emotion recognition models


  • dc.contributor.author Gómez Cañón, Juan Sebastián
  • dc.contributor.author Cano, Estefanía
  • dc.contributor.author Herrera Boyer, Perfecto, 1964-
  • dc.contributor.author Gómez Gutiérrez, Emilia, 1975-
  • dc.date.accessioned 2021-03-08T10:49:24Z
  • dc.date.issued 2021
  • dc.description Paper presented at: 28th European Signal Processing Conference (EUSIPCO 2020), held 18-22 January 2021 in Amsterdam, The Netherlands.
  • dc.description.abstract In this study, we address emotion recognition using unsupervised feature learning from speech data, and test its transferability to music. Our approach is to pre-train models using speech in English and Mandarin, and then fine-tune them with excerpts of music labeled with categories of emotion. Our initial hypothesis is that features automatically learned from speech should be transferable to music. Namely, we expect that the intra-linguistic setting (e.g., pre-training on speech in English and fine-tuning on music in English) will result in improved performance over the cross-linguistic setting (e.g., pre-training on speech in English and fine-tuning on music in Mandarin). Our results confirm previous research on cross-domain transferability, and encourage research towards language-sensitive Music Emotion Recognition (MER) models.
  • dc.description.sponsorship The research work conducted in the Music Technology Group at the Universitat Pompeu Fabra is partially supported by the European Commission under the TROMPA project (H2020 770376).
  • dc.format.mimetype application/pdf
  • dc.identifier.citation Gómez-Cañón JS, Cano E, Herrera P, Gómez E. Transfer learning from speech to music: towards language-sensitive emotion recognition models. In: 28th European Signal Processing Conference (EUSIPCO 2020), Proceedings; 2021 Jan 18-22; Amsterdam, The Netherlands; 2021. p. 136-40. DOI: 10.23919/Eusipco47968.2020.9287548
  • dc.identifier.doi http://dx.doi.org/10.23919/Eusipco47968.2020.9287548
  • dc.identifier.isbn 978-9-0827-9705-3
  • dc.identifier.issn 2076-1465
  • dc.identifier.uri http://hdl.handle.net/10230/46695
  • dc.language.iso eng
  • dc.publisher Institute of Electrical and Electronics Engineers (IEEE)
  • dc.relation.ispartof 28th European Signal Processing Conference (EUSIPCO 2020), Proceedings; 2021 Jan 18-22; Amsterdam, The Netherlands; 2021.
  • dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/770376
  • dc.rights © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. http://dx.doi.org/10.23919/Eusipco47968.2020.9287548
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.subject.keyword Sparse convolutional autoencoder
  • dc.subject.keyword Speech emotion recognition
  • dc.subject.keyword Music emotion recognition
  • dc.subject.keyword Unsupervised learning
  • dc.subject.keyword Transfer learning
  • dc.subject.keyword Multi-task learning
  • dc.title Transfer learning from speech to music: towards language-sensitive emotion recognition models
  • dc.type info:eu-repo/semantics/conferenceObject
  • dc.type.version info:eu-repo/semantics/acceptedVersion
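The pre-train/fine-tune workflow described in the abstract can be sketched in miniature. The toy below is only an illustration of the general transfer-learning idea, assuming nothing about the paper's actual setup: it uses simple linear models and synthetic data in place of the authors' sparse convolutional autoencoder, speech corpora, and music excerpts. A model is first trained on a large "source" task (standing in for speech), and its weights are then used to warm-start training on a small, related "target" task (standing in for music).

```python
import numpy as np

rng = np.random.default_rng(0)

def train_linear(X, y, W=None, epochs=200, lr=0.1):
    """Fit a linear model by gradient descent on mean squared error,
    optionally warm-starting from pre-trained weights W."""
    if W is None:
        W = np.zeros(X.shape[1])  # training from scratch
    for _ in range(epochs):
        grad = X.T @ (X @ W - y) / len(y)
        W -= lr * grad
    return W

# Source domain (analogous to speech): plenty of labeled data.
true_W = rng.normal(size=5)
X_src = rng.normal(size=(500, 5))
y_src = X_src @ true_W

# Target domain (analogous to music): a related task with few examples.
X_tgt = rng.normal(size=(20, 5))
y_tgt = X_tgt @ (true_W + 0.1 * rng.normal(size=5))

W_pre = train_linear(X_src, y_src)                   # pre-train on source
W_tuned = train_linear(X_tgt, y_tgt,                 # fine-tune on target,
                       W=W_pre.copy(), epochs=20)    # few epochs suffice

def mse(W):
    """Mean squared error on the target task."""
    return float(np.mean((X_tgt @ W - y_tgt) ** 2))
```

Because the target task is close to the source task, only a short fine-tuning run is needed to adapt the pre-trained weights; this mirrors, in spirit, why features learned from speech might transfer to music with limited labeled music data.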