The relation dimension in the identification and classification of lexically restricted word co-occurrences in text corpora
- dc.contributor.author Shvets, Alexander
- dc.contributor.author Wanner, Leo
- dc.date.accessioned 2023-01-20T07:52:11Z
- dc.date.available 2023-01-20T07:52:11Z
- dc.date.issued 2022
- dc.description.abstract The speech of native speakers is full of idiosyncrasies. Especially prominent are lexically restricted binary word co-occurrences of the type high esteem, strong tea, run [an] experiment, war break(s) out, etc. In lexicography, such co-occurrences are referred to as collocations. Due to their semi-decompositional nature, collocations are highly relevant to a large number of natural language processing applications as well as to second language learning. A substantial body of work exists on the automatic recognition of collocations in textual material and, increasingly, also on their semantic classification, even if the latter is not yet part of mainstream research. In particular, classification with respect to the lexical function (LF) taxonomy, which is the most detailed semantically oriented taxonomy of collocations available to date, has proved to be of real use to human speakers and machines alike. The most recent approaches in the field are based on multilingual neural graph transformer models that use explicit syntactic dependencies. Our goal is to explore whether extending such a model with a semantic relation extraction network improves its classification performance, or whether the model already learns the corresponding semantic relations from the dependencies and the sentential contexts, so that an additional relation extraction network does not improve the overall performance. The experiments show that the semantic relation extraction layer indeed improves the overall performance of a graph transformer. However, the improvement is modest, so we can conclude that graph transformers already learn, to a certain extent, the semantics of the dependencies between the collocation elements.
- dc.description.sponsorship This research was funded by the European Commission in the context of its H2020 Research and Development Program under contract number 870930.
- dc.format.mimetype application/pdf
- dc.identifier.citation Shvets A, Wanner L. The relation dimension in the identification and classification of lexically restricted word co-occurrences in text corpora. Mathematics. 2022;10(20):3831. DOI: 10.3390/math10203831
- dc.identifier.doi http://dx.doi.org/10.3390/math10203831
- dc.identifier.issn 2227-7390
- dc.identifier.uri http://hdl.handle.net/10230/55357
- dc.language.iso eng
- dc.publisher MDPI
- dc.relation.ispartof Mathematics. 2022;10(20):3831.
- dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/870930
- dc.rights © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.rights.uri https://creativecommons.org/licenses/by/4.0/
- dc.subject.keyword idiosyncratic word co-occurrences
- dc.subject.keyword collocations
- dc.subject.keyword lexical functions
- dc.subject.keyword multilingual
- dc.subject.keyword graph transformers
- dc.subject.keyword multitask learning
- dc.subject.keyword semantic relation extraction
- dc.title The relation dimension in the identification and classification of lexically restricted word co-occurrences in text corpora
- dc.type info:eu-repo/semantics/article
- dc.type.version info:eu-repo/semantics/publishedVersion