MSC+: Language pattern learning for word sense induction and disambiguation
Citation
- Bif Goularte F, Sorato D, Modesto Nassar S, Fileto R, Saggion H. MSC+: Language pattern learning for word sense induction and disambiguation. Knowledge-Based Systems. 2020;188:105017. DOI: 10.1016/j.knosys.2019.105017
Description
Abstract
Identifying the correct meaning of words in context and discovering new word senses are useful for several tasks, such as question answering, information extraction, information retrieval, and text summarization. However, especially in user-generated content and online communication (e.g., Twitter), speakers continuously craft new meanings by using existing words in novel contexts. Consequently, lexical-semantic inventories and systems have difficulty coping with semantic drift. In this work, we propose an approach to induce and disambiguate the senses of selected target words in collections of short texts, such as tweets, through the use of fuzzy lexico-semantic patterns that we define as sequences of Morpho-semantic Components (MSC+). We learn these patterns, which we call MSC+ patterns, automatically from text data. Experimental results show that instances of some patterns recur across a number of tweets, although different tweets sometimes use different words to convey the sense of the corresponding MSC+. Exploiting MSC+ patterns that induce semantics on target words enables effective word sense disambiguation mechanisms, leading to improvements over the state of the art.