TALN at SemEval-2016 Task 11: modelling complex words by contextual, lexical and semantic features

Ronzano, Francesco; AbuRa'ed, Ahmed Ghassan Tawfiq; Espinosa-Anke, Luis; Saggion, Horacio

dc.contributor.author Ronzano, Francesco
dc.contributor.author AbuRa'ed, Ahmed Ghassan Tawfiq
dc.contributor.author Espinosa-Anke, Luis
dc.contributor.author Saggion, Horacio
dc.date.accessioned 2018-07-24T07:54:27Z
dc.date.available 2018-07-24T07:54:27Z
dc.date.issued 2016
dc.description Paper presented at the 10th International Workshop on Semantic Evaluation (SemEval 2016), held on 16 and 17 June 2016 in San Diego, USA.
dc.description.abstract This paper presents the participation of the TALN team in the Complex Word Identification Task of SemEval-2016 (Task 11). The purpose of the task was to determine whether a word in a given sentence would be judged as complex by a certain target audience. To support experiments with word complexity identification approaches, the task organizers provided a training set of 2,237 words, each judged as complex or not by 20 human evaluators, together with the sentence in which each word occurs. In our contribution we modelled each word to be evaluated as a numeric vector populated with a set of lexical, semantic and contextual features that may help assess the complexity of a word. We trained a Random Forest classifier to automatically decide whether each word is complex or not. We submitted two runs, in which we trained our classifier on unweighted and weighted instances of complex words respectively; in the weighted run, the weight of each instance is proportional to the number of evaluators that judged the word as complex. Our system was the third best performing one.
dc.description.sponsorship This work is partly supported by the Spanish Ministry of Economy and Competitiveness under the Maria de Maeztu Units of Excellence Programme (MDM-2015-0502) and the ABLE-TO-INCLUDE Project (Competitivity and Innovation Programme of the European Commission, CIP-ICT-PSP-2013-7/621055).
dc.format.mimetype application/pdf
dc.identifier.citation Ronzano F, Abura'ed A, Espinosa-Anke L, Saggion H. TALN at SemEval-2016 Task 11: modelling complex words by contextual, lexical and semantic features. In: SemEval-2016. The 10th International Workshop on Semantic Evaluation; 2016 Jun 16-17; San Diego, CA. Stroudsburg (PA): ACL; 2016. p. 1011–6.
dc.identifier.uri http://hdl.handle.net/10230/35242
dc.language.iso eng
dc.publisher ACL (Association for Computational Linguistics)
dc.relation.ispartof SemEval-2016. The 10th International Workshop on Semantic Evaluation; 2016 Jun 16-17; San Diego, CA. Stroudsburg (PA): ACL; 2016. p. 1011–6.
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.rights.uri http://creativecommons.org/licenses/by/4.0/
dc.subject.other Natural language processing (Computer science)
dc.title TALN at SemEval-2016 Task 11: modelling complex words by contextual, lexical and semantic features
dc.type info:eu-repo/semantics/conferenceObject
dc.type.version info:eu-repo/semantics/publishedVersion
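
The abstract describes two training runs, one with unweighted and one with weighted instances of complex words. The following is a minimal sketch of how such per-instance weights might be derived, not the authors' implementation. It assumes that a word counts as complex when at least one evaluator judged it so, and that non-complex words keep a weight of 1.0; neither assumption is stated in this record. The name `instance_weights` is hypothetical.

```python
# Each training word was judged by 20 human evaluators (figure from the abstract).
NUM_EVALUATORS = 20


def instance_weights(complex_votes, weighted=True):
    """Return one training weight per word.

    complex_votes: list of ints, the number of evaluators who judged
                   each word as complex.
    weighted:      False reproduces the unweighted run (all weights 1.0);
                   True makes the weight of a complex word proportional
                   to its number of 'complex' judgements.

    Assumptions (not from the source): a word is complex when it has at
    least one 'complex' vote, and non-complex words keep weight 1.0.
    """
    weights = []
    for votes in complex_votes:
        if not 0 <= votes <= NUM_EVALUATORS:
            raise ValueError(f"vote count out of range: {votes}")
        if weighted and votes > 0:
            weights.append(float(votes))
        else:
            weights.append(1.0)
    return weights
```

Weights of this kind could then be supplied as per-sample weights when fitting a Random Forest, for example through the `sample_weight` argument of scikit-learn's `RandomForestClassifier.fit`. The record does not say which toolkit the authors used.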

Collections

Conferences (Department of Information and Communication Technologies)