Evaluating language models for the retrieval and categorization of lexical collocations
- dc.contributor.author Espinosa-Anke, Luis
- dc.contributor.author Codina Filbà, Joan
- dc.contributor.author Wanner, Leo
- dc.date.accessioned 2023-02-07T07:10:28Z
- dc.date.available 2023-02-07T07:10:28Z
- dc.date.issued 2021
- dc.description Paper presented at: EACL 2021, held online from 19 to 23 April 2021.
- dc.description.abstract Lexical collocations are idiosyncratic combinations of two syntactically bound lexical items (e.g., “heavy rain”, “take a step” or “undergo surgery”). Understanding their degree of compositionality and idiosyncrasy, as well as their underlying semantics, is crucial for language learners, lexicographers and downstream NLP applications alike. In this paper we analyse a suite of language models for collocation understanding. We first construct a dataset of occurrences of lexical collocations in context, categorized into 16 representative semantic categories. Then, we perform two experiments: (1) unsupervised collocate retrieval, and (2) supervised collocation classification in context. We find that most models perform well in distinguishing light verb constructions, especially if the collocation’s first argument acts as a subject, but often fail to distinguish, first, different syntactic structures within the same semantic category, and second, finer-grained categories which restrict the set of correct collocates.
- dc.description.sponsorship This work was partially supported by the European Commission via its H2020 Program under the contract number 870930.
- dc.format.mimetype application/pdf
- dc.identifier.citation Espinosa-Anke L, Codina-Filbà J, Wanner L. Evaluating language models for the retrieval and categorization of lexical collocations. In: Merlo P, Tiedemann J, Tsarfaty R, editors. The 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021): proceedings of the conference; 2021 Apr 19-23; [online]. Stroudsburg: Association for Computational Linguistics; 2021. p. 1406-17. DOI: 10.18653/v1/2021.eacl-main.120
- dc.identifier.doi http://dx.doi.org/10.18653/v1/2021.eacl-main.120
- dc.identifier.uri http://hdl.handle.net/10230/55650
- dc.language.iso eng
- dc.publisher ACL (Association for Computational Linguistics)
- dc.relation.ispartof Merlo P, Tiedemann J, Tsarfaty R, editors. The 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021): proceedings of the conference; 2021 Apr 19-23; [online]. Stroudsburg: Association for Computational Linguistics; 2021. p. 1406-17.
- dc.relation.isreferencedby https://github.com/luisespinosaanke/lexicalcollocations
- dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/870930
- dc.rights © ACL, Creative Commons Attribution 4.0 License
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.rights.uri http://creativecommons.org/licenses/by/4.0/
- dc.subject.other Lexicology
- dc.subject.other Semantics
- dc.title Evaluating language models for the retrieval and categorization of lexical collocations
- dc.type info:eu-repo/semantics/conferenceObject
- dc.type.version info:eu-repo/semantics/publishedVersion