Challenges in including extra-linguistic context in pre-trained language models

Citation

  • Sorodoc IT, Aina L, Boleda G. Challenges in including extra-linguistic context in pre-trained language models. In: Tafreshi S, Sedoc J, Rogers A, Drozd A, Rumshisky A, Akula A, editors. The Third Workshop on Insights from Negative Results in NLP (Insights 2022): proceedings of the Workshop; 2022 May 26; Dublin, Ireland. [Stroudsburg]: Association for Computational Linguistics; 2022. p. 134-8. DOI: 10.18653/v1/2022.insights-1.18

Description

  • Abstract

    To successfully account for language, computational models need to take into account both the linguistic context (the content of the utterances) and the extra-linguistic context (for instance, the participants in a dialogue). We focus on a referential task that asks models to link entity mentions in a TV show to the corresponding characters, and design an architecture that attempts to account for both kinds of context. In particular, our architecture combines a previously proposed specialized module (an “entity library”) for character representation with transfer learning from a pre-trained language model. We find that, although the model does improve linguistic contextualization, it fails to successfully integrate extra-linguistic information about the participants in the dialogue. Our work shows that it is very challenging to incorporate extra-linguistic information into pre-trained language models.
  • Description

    Paper presented at the Third Workshop on Insights from Negative Results in NLP (Insights 2022), held on 26 May 2022 in Dublin, Ireland.