TALN at SemEval-2016 Task 11: modelling complex words by contextual, lexical and semantic features

Citation

  • Ronzano F, Abura'ed A, Espinosa-Anke L. TALN at SemEval-2016 Task 11: modelling complex words by contextual, lexical and semantic features. In: SemEval-2016. The 10th International Workshop on Semantic Evaluation; 2016 Jun 16-17; San Diego, CA. Stroudsburg (PA): ACL; 2016. p. 1011–6.

Permanent link

Description

  • Abstract

    This paper presents the participation of the TALN team in the Complex Word Identification task of SemEval-2016 (Task 11). The purpose of the task was to determine whether a word in a given sentence would be judged as complex by a certain target audience. To support experiments with word complexity identification approaches, the task organizers provided a training set of 2,237 words judged as complex or not by 20 human evaluators, together with the sentence in which each word occurs. In our contribution, we modelled each target word as a numeric vector populated with a set of lexical, semantic and contextual features that may help assess the complexity of a word. We trained a Random Forest classifier to automatically decide whether each word is complex. We submitted two runs that used, respectively, unweighted and weighted instances of complex words to train our classifier, where the weight of each instance is proportional to the number of evaluators who judged the word as complex. Our system was the third best performing one.
  • Description

    Paper presented at the 10th International Workshop on Semantic Evaluation (SemEval 2016), held on 16 and 17 June 2016 in San Diego, USA.
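The weighting scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The `Instance` type, the three toy features, the rule that a word is labelled complex when at least one evaluator judged it so, the weight of 1.0 for non-complex words, and the two example sentences are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    word: str
    sentence: str
    complex_votes: int  # number of evaluators who judged the word as complex


def label(inst):
    # Assumption: a word is labelled complex if at least one evaluator said so.
    return 1 if inst.complex_votes > 0 else 0


def weight(inst, weighted=True):
    # Unweighted run: every instance counts the same.
    # Assumption: non-complex instances keep weight 1.0 in the weighted run too.
    if not weighted or inst.complex_votes == 0:
        return 1.0
    # Weighted run: weight proportional to the number of "complex" votes
    # (proportionality constant 1 is an assumption).
    return float(inst.complex_votes)


def features(inst):
    # Hypothetical stand-ins for lexical and contextual features:
    # word length, sentence length in tokens, and word position in the sentence.
    tokens = inst.sentence.lower().split()
    target = inst.word.lower()
    position = tokens.index(target) if target in tokens else -1
    return [float(len(inst.word)), float(len(tokens)), float(position)]


# Made-up training examples, not taken from the task data.
train = [
    Instance("cat", "The cat sat on the mat", 0),
    Instance("ubiquitous", "Smartphones are ubiquitous today", 7),
]
X = [features(i) for i in train]
y = [label(i) for i in train]
w = [weight(i) for i in train]
```

The vectors `X`, labels `y` and weights `w` could then be handed to any classifier that accepts per-instance weights, for example through the `sample_weight` argument of `fit` in scikit-learn's `RandomForestClassifier`.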