dc.contributor.author |
Fisas Elizalde, Beatriz |
dc.contributor.author |
Espinosa-Anke, Luis |
dc.contributor.author |
Codina Filbà, Joan |
dc.contributor.author |
Wanner, Leo |
dc.date.accessioned |
2021-02-09T08:51:05Z |
dc.date.available |
2021-02-09T08:51:05Z |
dc.date.issued |
2020 |
dc.identifier.citation |
Fisas B, Espinosa Anke L, Codina-Filbà J, Wanner L. CollFrEn: rich bilingual English–French collocation resource. In: Markantonatou S, McCrae J, Mitrovic J, Tiberius C, Ramisch C, Vaidya A, Osenova P, Savary A, editors. Joint Workshop on Multiword Expressions and Electronic Lexicons; 2020 Dec 13; Barcelona, Spain. Stroudsburg PA: ACL; 2020. p. 1-12. |
dc.identifier.isbn |
978-1-952148-50-7 |
dc.identifier.uri |
http://hdl.handle.net/10230/46398 |
dc.description |
Paper presented at: Joint Workshop on Multiword Expressions and Electronic Lexicons, held virtually on 13 December 2020. |
dc.description.abstract |
Collocations in the sense of idiosyncratic lexical co-occurrences of two syntactically bound
words traditionally pose a challenge to language learners and many Natural Language Processing
(NLP) applications alike. Reliable ground truth (i.e., ideally manually compiled) resources are
thus of high value. We present a manually compiled bilingual English–French collocation resource
with 7,480 collocations in English and 6,733 in French. Each collocation is enriched with
information that facilitates its downstream exploitation in NLP tasks such as machine translation,
word sense disambiguation, natural language generation, relation classification, and so forth. Our
proposed enrichment covers: the semantic category of the collocation (its lexical function), its
vector space representation (for each individual word as well as their joint collocation embedding),
a subcategorization pattern of both its elements, as well as their corresponding BabelNet
id, and finally, indices of their occurrences in large-scale reference corpora. |
dc.description.sponsorship |
This work has been supported by the European Commission in the context
of its H2020 Program under the contract numbers 870930-RIA, 825079-STARTS, and 779962-RIA. |
dc.format.mimetype |
application/pdf |
dc.language.iso |
eng |
dc.publisher |
ACL (Association for Computational Linguistics) |
dc.relation.ispartof |
Markantonatou S, McCrae J, Mitrovic J, Tiberius C, Ramisch C, Vaidya A, Osenova P, Savary A, editors. Joint Workshop on Multiword Expressions and Electronic Lexicons; 2020 Dec 13; Barcelona, Spain. Stroudsburg PA: ACL; 2020. |
dc.relation.isreferencedby |
https://github.com/TalnUPF/CollFrEn |
dc.rights |
© ACL, Creative Commons Attribution 4.0 License |
dc.rights.uri |
https://creativecommons.org/licenses/by/4.0/ |
dc.title |
CollFrEn: rich bilingual English–French collocation resource |
dc.type |
info:eu-repo/semantics/conferenceObject |
dc.relation.projectID |
info:eu-repo/grantAgreement/EC/H2020/870930 |
dc.relation.projectID |
info:eu-repo/grantAgreement/EC/H2020/825079 |
dc.relation.projectID |
info:eu-repo/grantAgreement/EC/H2020/779962 |
dc.rights.accessRights |
info:eu-repo/semantics/openAccess |
dc.type.version |
info:eu-repo/semantics/publishedVersion |