TALN at SemEval-2016 Task 11: modelling complex words by contextual, lexical and semantic features

dc.contributor.author Ronzano, Francesco
dc.contributor.author AbuRa’ed, Ahmed
dc.contributor.author Espinosa-Anke, Luis
dc.contributor.author Saggion, Horacio
dc.date.accessioned 2018-07-24T07:54:27Z
dc.date.available 2018-07-24T07:54:27Z
dc.date.issued 2016
dc.identifier.citation Ronzano F, Abura'ed A, Espinosa-Anke L, Saggion H. TALN at SemEval-2016 Task 11: modelling complex words by contextual, lexical and semantic features. In: SemEval-2016. The 10th International Workshop on Semantic Evaluation; 2016 Jun 16-17; San Diego, CA. Stroudsburg (PA): ACL; 2016. p. 1011–6.
dc.identifier.uri http://hdl.handle.net/10230/35242
dc.description Paper presented at the 10th International Workshop on Semantic Evaluation (SemEval 2016), held on 16 and 17 June 2016 in San Diego, USA.
dc.description.abstract This paper presents the participation of the TALN team in the Complex Word Identification task of SemEval-2016 (Task 11). The purpose of the task was to determine whether a word in a given sentence can be judged as complex by a certain target audience. To support experiments with word complexity identification approaches, the task organizers provided a training set of 2,237 words, each judged as complex or not by 20 human evaluators, together with the sentence in which each word occurs. In our contribution, we modelled each word to be evaluated as a numeric vector populated with a set of lexical, semantic and contextual features that may help assess the complexity of a word. We trained a Random Forest classifier to automatically decide whether each word is complex. We submitted two runs, trained respectively on unweighted and weighted instances of complex words, where the weight of each instance is proportional to the number of evaluators that judged the word as complex. Our system ranked third best.
dc.description.sponsorship This work is partly supported by the Spanish Ministry of Economy and Competitiveness under the Maria de Maeztu Units of Excellence Programme (MDM-2015-0502) and the ABLE-TO-INCLUDE Project (Competitivity and Innovation Programme of the European Commission, CIP-ICT-PSP-2013-7/621055).
dc.format.mimetype application/pdf
dc.language.iso eng
dc.publisher ACL (Association for Computational Linguistics)
dc.relation.ispartof SemEval-2016. The 10th International Workshop on Semantic Evaluation; 2016 Jun 16-17; San Diego, CA. Stroudsburg (PA): ACL; 2016. p. 1011–6.
dc.rights © ACL, Creative Commons Attribution 4.0 License
dc.rights.uri http://creativecommons.org/licenses/by/4.0/
dc.subject.other Natural language processing (Computer science)
dc.title TALN at SemEval-2016 Task 11: modelling complex words by contextual, lexical and semantic features
dc.type info:eu-repo/semantics/conferenceObject
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.type.version info:eu-repo/semantics/publishedVersion

