TALN at SemEval-2016 Task 11: modelling complex words by contextual, lexical and semantic features
- dc.contributor.author Ronzano, Francesco
- dc.contributor.author AbuRa'ed, Ahmed Ghassan Tawfiq
- dc.contributor.author Espinosa-Anke, Luis
- dc.contributor.author Saggion, Horacio
- dc.date.accessioned 2018-07-24T07:54:27Z
- dc.date.available 2018-07-24T07:54:27Z
- dc.date.issued 2016
- dc.description Paper presented at the 10th International Workshop on Semantic Evaluation (SemEval 2016), held on 16 and 17 June 2016 in San Diego, USA.
- dc.description.abstract This paper presents the participation of the TALN team in the Complex Word Identification Task of SemEval-2016 (Task 11). The purpose of the task was to determine if a word in a given sentence can be judged as complex or not by a certain target audience. To experiment with word complexity identification approaches, Task organizers provided a training set of 2,237 words judged as complex or not by 20 human evaluators, together with the sentence in which each word occurs. In our contribution we modelled each word to evaluate as a numeric vector populated with a set of lexical, semantic and contextual features that may help assess the complexity of a word. We trained a Random Forest classifier to automatically decide if each word is complex or not. We submitted two runs in which we respectively considered unweighted and weighted instances of complex words to train our classifier, where the weight of each instance is proportional to the number of evaluators that judged the word as complex. Our system scored as the third best performing one.
- dc.description.sponsorship This work is partly supported by the Spanish Ministry of Economy and Competitiveness under the Maria de Maeztu Units of Excellence Programme (MDM-2015-0502) and the ABLE-TO-INCLUDE Project (Competitivity and Innovation Programme of the European Commission, CIP-ICT-PSP-2013-7/621055).
- dc.format.mimetype application/pdf
- dc.identifier.citation Ronzano F, Abura'ed A, Espinosa-Anke L, Saggion H. TALN at SemEval-2016 Task 11: modelling complex words by contextual, lexical and semantic features. In: SemEval-2016. The 10th International Workshop on Semantic Evaluation; 2016 Jun 16-17; San Diego, CA. Stroudsburg (PA): ACL; 2016. p. 1011–6.
- dc.identifier.uri http://hdl.handle.net/10230/35242
- dc.language.iso eng
- dc.publisher ACL (Association for Computational Linguistics)
- dc.relation.ispartof SemEval-2016. The 10th International Workshop on Semantic Evaluation; 2016 Jun 16-17; San Diego, CA. Stroudsburg (PA): ACL; 2016. p. 1011–6.
- dc.rights © ACL, Creative Commons Attribution 4.0 License
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.rights.uri http://creativecommons.org/licenses/by/4.0/
- dc.subject.other Natural language processing (Computer science)
- dc.title TALN at SemEval-2016 Task 11: modelling complex words by contextual, lexical and semantic features
- dc.type info:eu-repo/semantics/conferenceObject
- dc.type.version info:eu-repo/semantics/publishedVersion
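
Illustrative sketch of the instance weighting described in the abstract. The abstract says the two submitted runs differ in whether complex-word training instances are weighted in proportion to the number of evaluators who judged the word complex. The Python below shows one way such labels and weights could be derived. It is not the authors' code: the `Annotation` type, the function names, the toy words, the rule that a word counts as complex when at least one evaluator marked it so, and the weight of 1.0 for non-complex words are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

N_EVALUATORS = 20  # from the abstract: each word was judged by 20 human evaluators


@dataclass
class Annotation:
    """Hypothetical container for one annotated training word."""
    word: str
    complex_votes: int  # number of evaluators who judged the word complex


def label_and_weight(ann: Annotation, weighted: bool) -> Tuple[int, float]:
    """Return (label, weight) for one training instance.

    Assumptions, not taken from the abstract: a word is labelled complex (1)
    if at least one evaluator judged it complex, and non-complex words
    always keep a weight of 1.0.
    """
    if not 0 <= ann.complex_votes <= N_EVALUATORS:
        raise ValueError(f"vote count out of range for {ann.word!r}")
    label = 1 if ann.complex_votes > 0 else 0
    if weighted and label == 1:
        # Weighted run: weight proportional to the number of "complex" votes.
        return label, float(ann.complex_votes)
    # Unweighted run, or a non-complex word: uniform weight.
    return label, 1.0


def build_training_targets(
    anns: List[Annotation], weighted: bool
) -> Tuple[List[int], List[float]]:
    """Build parallel lists of labels and instance weights."""
    pairs = [label_and_weight(a, weighted) for a in anns]
    return [p[0] for p in pairs], [p[1] for p in pairs]


# Toy data invented for illustration; not drawn from the task's training set.
toy = [Annotation("cat", 0), Annotation("ubiquitous", 7), Annotation("paradigm", 2)]
unweighted_labels, unweighted_weights = build_training_targets(toy, weighted=False)
weighted_labels, weighted_weights = build_training_targets(toy, weighted=True)
```

With a library such as scikit-learn, weights of this kind could be passed as the `sample_weight` argument of `RandomForestClassifier.fit` alongside the feature vectors. This record does not state which toolkit the authors used.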