Audio-visual gated-sequenced neural networks for affect recognition
- dc.contributor.author Aspandi, Decky
- dc.contributor.author Sukno, Federico Mateo
- dc.contributor.author Schuller, Björn
- dc.contributor.author Binefa i Valls, Xavier
- dc.date.accessioned 2024-06-05T11:06:32Z
- dc.date.available 2024-06-05T11:06:32Z
- dc.date.issued 2022
- dc.description.abstract Interest in automatic emotion recognition and the broader field of affective computing has recently gained momentum. The emergence of large, video-based affect datasets offering rich multi-modal inputs has facilitated the development of deep learning-based models for automatic affect analysis, which currently hold the state of the art. However, recent approaches cannot fully exploit these modalities because they rely on oversimplified fusion schemes. Furthermore, the efficient use of the temporal information inherent to these large datasets remains largely unexplored, hindering further progress. In this work, we propose a multi-modal, sequence-based neural network with gating mechanisms for valence- and arousal-based affect recognition. Our model consists of three major networks: firstly, a latent-feature generator that extracts compact representations of both modalities from inputs that have been artificially degraded to add robustness; secondly, a multi-task discriminator that estimates both the input identity and a first-step emotion quadrant; and thirdly, a sequence-based predictor with attention and gating mechanisms that effectively merges both modalities and exploits this information through sequence modelling. In our experiments on the SEMAINE and SEWA affect datasets, we observe the impact of both proposed mechanisms through progressive increases in accuracy. Our ablation studies further show how the internal attention weights and gating coefficients impact the quality of our models’ estimates. Finally, we demonstrate state-of-the-art accuracy through comparisons with current alternatives on both datasets.
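The abstract describes a fusion scheme that merges per-frame audio and visual features through learned gating coefficients before sequence modelling. As a rough illustration of that idea only, here is a minimal PyTorch sketch of gated audio-visual fusion followed by a recurrent predictor over valence and arousal; the layer sizes, the GRU (standing in for the paper's sequence-based predictor), and the class name are assumptions for illustration, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class GatedAudioVisualFusion(nn.Module):
    """Illustrative gated fusion of frame-level audio and visual features.

    All dimensions are hypothetical; this sketches the general idea of a
    gating mechanism, not the exact model described in the paper.
    """

    def __init__(self, audio_dim=128, visual_dim=256, hidden_dim=128):
        super().__init__()
        self.audio_proj = nn.Linear(audio_dim, hidden_dim)
        self.visual_proj = nn.Linear(visual_dim, hidden_dim)
        # Gating coefficients in (0, 1): how much each modality contributes
        # to each fused dimension, per frame.
        self.gate = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.Sigmoid(),
        )
        # A GRU stands in for the paper's sequence-based predictor.
        self.rnn = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        # Two continuous outputs per frame: valence and arousal.
        self.head = nn.Linear(hidden_dim, 2)

    def forward(self, audio, visual):
        # audio: (batch, time, audio_dim); visual: (batch, time, visual_dim)
        a = torch.tanh(self.audio_proj(audio))
        v = torch.tanh(self.visual_proj(visual))
        g = self.gate(torch.cat([a, v], dim=-1))  # (batch, time, hidden_dim)
        fused = g * a + (1.0 - g) * v             # convex per-dimension mix
        out, _ = self.rnn(fused)
        return self.head(out)                     # (batch, time, 2)

# Example: 4 clips of 100 frames each.
model = GatedAudioVisualFusion()
preds = model(torch.randn(4, 100, 128), torch.randn(4, 100, 256))
print(preds.shape)  # torch.Size([4, 100, 2])
```

Because the sigmoid gate forms a per-dimension convex combination of the two modalities, such a network can down-weight a degraded modality frame by frame, which is the kind of behaviour the paper's gating-coefficient ablation examines.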
- dc.description.sponsorship This work is partly supported by the Spanish Ministry of Science and Innovation under project grant PID2020-114083GB-I00 and by the donation bahi2018-19 to the CMTech group at UPF. Further funding has been received from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 826506 (sustAGE), along with the UDeco project under the German BMBF KMU-innovativ programme.
- dc.format.mimetype application/pdf
- dc.identifier.citation Aspandi D, Sukno F, Schuller BW, Binefa X. Audio-visual gated-sequenced neural networks for affect recognition. IEEE Trans Affect Comput. 2023;14(3):2193-2208. DOI: 10.1109/TAFFC.2022.3156026
- dc.identifier.doi http://dx.doi.org/10.1109/TAFFC.2022.3156026
- dc.identifier.issn 1949-3045
- dc.identifier.uri http://hdl.handle.net/10230/60358
- dc.language.iso eng
- dc.publisher Institute of Electrical and Electronics Engineers (IEEE)
- dc.relation.ispartof IEEE Trans Affect Comput. 2023;14(3):2193-2208.
- dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/826506
- dc.relation.projectID info:eu-repo/grantAgreement/ES/2PE/PID2020-114083GB-I00
- dc.rights © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. http://dx.doi.org/10.1109/TAFFC.2022.3156026
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.subject.keyword Affective computing
- dc.subject.keyword Deep learning
- dc.subject.keyword Multi-modal fusion
- dc.subject.keyword Sequence modelling
- dc.title Audio-visual gated-sequenced neural networks for affect recognition
- dc.type info:eu-repo/semantics/article
- dc.type.version info:eu-repo/semantics/acceptedVersion