TALN at SemEval-2016 Task 14: semantic taxonomy enrichment via sense-based embeddings
Citation
- Espinosa-Anke L, Ronzano F, Saggion H. TALN at SemEval-2016 Task 14: semantic taxonomy enrichment via sense-based embeddings. In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016); 2016 Jun 16-17; San Diego, California. [San Diego]: Association for Computational Linguistics; 2016. p. 1332-36. DOI: 10.18653/v1/s16-1208
Description
Abstract
This paper describes the participation of the TALN team in SemEval-2016 Task 14: Semantic Taxonomy Enrichment. The purpose of the task is to find the best point of attachment in WordNet for a set of out-of-vocabulary (OOV) terms. These terms may come from, among other sources, domain-specific glossaries, slang, or jargon typical of Internet forums and chatrooms. Our contribution takes as input an OOV term, its part of speech and its associated definition, and generates a set of WordNet synset candidates derived from modelling the term’s definition as a sense embedding representation. We leverage a BabelNet-based vector space representation, which allows us to map the algorithm’s prediction to WordNet. Our approach is designed to be generic and applicable to any domain, without exploiting, for instance, HTML markup in source web pages. Our system performs above the median of all submitted systems, and rivals in performance a powerful baseline based on extracting the first word of the definition with the same part-of-speech as the OOV term.
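The general idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny hand-written vectors stand in for the BabelNet-based vector space, and the names `rank_candidates`, `first_word_baseline`, `TOY_SENSE_VECTORS` and `TOY_WORD_VECTORS` are hypothetical. The sketch assumes that a definition is represented as the centroid of its word vectors and that candidate synsets are ranked by cosine similarity to that centroid; the abstract does not state how the definition is actually modelled. The second function sketches the baseline mentioned above, assuming the definition has already been part-of-speech tagged.

```python
import math

# Hypothetical toy vectors standing in for a real sense-embedding space.
TOY_SENSE_VECTORS = {
    "dog.n.01": [0.9, 0.1, 0.0],
    "cat.n.01": [0.5, 0.8, 0.1],
    "car.n.01": [0.0, 0.1, 0.9],
}
TOY_WORD_VECTORS = {
    "small": [0.1, 0.1, 0.1],
    "dog": [0.9, 0.1, 0.0],
    "breed": [0.7, 0.3, 0.0],
}


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(definition, sense_vectors, word_vectors, top_k=3):
    """Rank synset ids by similarity to the centroid of the definition's words.

    Assumption: whitespace tokenisation and centroid averaging; words without
    a vector are skipped.
    """
    tokens = definition.lower().split()
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return []
    target = centroid(known)
    scored = sorted(
        ((cosine(target, vec), synset) for synset, vec in sense_vectors.items()),
        reverse=True,
    )
    return [synset for _, synset in scored[:top_k]]


def first_word_baseline(tagged_definition, pos):
    """Return the first word in the definition whose tag equals `pos`, else None."""
    for word, tag in tagged_definition:
        if tag == pos:
            return word
    return None


candidates = rank_candidates(
    "a small dog breed", TOY_SENSE_VECTORS, TOY_WORD_VECTORS
)
baseline = first_word_baseline(
    [("a", "DET"), ("small", "ADJ"), ("dog", "NOUN"), ("breed", "NOUN")], "NOUN"
)
```

In a real setting the toy dictionaries would be replaced by pretrained sense embeddings keyed by identifiers that can be mapped to WordNet synsets, and the tagged definition would come from a part-of-speech tagger.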