CollFrEn: rich bilingual English–French collocation resource

  • dc.contributor.author Fisas Elizalde, Beatriz
  • dc.contributor.author Espinosa-Anke, Luis
  • dc.contributor.author Codina Filbà, Joan
  • dc.contributor.author Wanner, Leo
  • dc.date.accessioned 2021-02-09T08:51:05Z
  • dc.date.available 2021-02-09T08:51:05Z
  • dc.date.issued 2020
  • dc.description Paper presented at: Joint Workshop on Multiword Expressions and Electronic Lexicons, held virtually on 13 December 2020.
  • dc.description.abstract Collocations in the sense of idiosyncratic lexical co-occurrences of two syntactically bound words traditionally pose a challenge to language learners and many Natural Language Processing (NLP) applications alike. Reliable ground truth (i.e., ideally manually compiled) resources are thus of high value. We present a manually compiled bilingual English–French collocation resource with 7,480 collocations in English and 6,733 in French. Each collocation is enriched with information that facilitates its downstream exploitation in NLP tasks such as machine translation, word sense disambiguation, natural language generation, relation classification, and so forth. Our proposed enrichment covers: the semantic category of the collocation (its lexical function), its vector space representation (for each individual word as well as their joint collocation embedding), a subcategorization pattern of both its elements, as well as their corresponding BabelNet id, and finally, indices of their occurrences in large scale reference corpora.
  • dc.description.sponsorship This work has been supported by the European Commission in the context of its H2020 Program under the contract numbers 870930-RIA, 825079-STARTS, and 779962-RIA.
  • dc.format.mimetype application/pdf
  • dc.identifier.citation Fisas B, Espinosa Anke L, Codina-Filbà J, Wanner L. CollFrEn: rich bilingual English–French collocation resource. In: Markantonatou S, McCrae J, Mitrovic J, Tiberius C, Ramisch C, Vaidya A, Osenova P, Savary A, editors. Joint Workshop on Multiword Expressions and Electronic Lexicons; 2020 Dec 13; Barcelona, Spain. Stroudsburg PA: ACL; 2020. p. 1-12.
  • dc.identifier.isbn 978-1-952148-50-7
  • dc.identifier.uri http://hdl.handle.net/10230/46398
  • dc.language.iso eng
  • dc.publisher ACL (Association for Computational Linguistics)
  • dc.relation.ispartof Markantonatou S, McCrae J, Mitrovic J, Tiberius C, Ramisch C, Vaidya A, Osenova P, Savary A, editors. Joint Workshop on Multiword Expressions and Electronic Lexicons; 2020 Dec 13; Barcelona, Spain. Stroudsburg PA: ACL; 2020.
  • dc.relation.isreferencedby https://github.com/TalnUPF/CollFrEn
  • dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/870930
  • dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/825079
  • dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/779962
  • dc.rights © ACL, Creative Commons Attribution 4.0 License
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.rights.uri https://creativecommons.org/licenses/by/4.0/
  • dc.title CollFrEn: rich bilingual English–French collocation resource
  • dc.type info:eu-repo/semantics/conferenceObject
  • dc.type.version info:eu-repo/semantics/publishedVersion