Multilingual lexical simplification
Description
Abstract
This report describes, implements, and evaluates one strategy for text simplification, namely lexical simplification, which aims to reduce the complexity of some words in a sentence. The process has two main steps: a first module identifies the complex elements, and a second module replaces them with simpler variants. For the first module, the system uses three datasets with human annotations in different languages (English, Spanish, and German) to train a classifier that detects complex words. For the second module, a pre-trained word-prediction model (BERT) generates the substitution candidates, which are then sorted by Zipf frequency so that the candidate with the highest value is selected. Finally, the complete system is evaluated using a test dataset and a survey designed to collect human annotations and perceptions of fluency, meaning, and simplicity.
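The ranking step of the second module can be illustrated with a minimal sketch. It is not the thesis implementation: the candidate list stands in for BERT output, and the counts in `TOY_COUNTS`, the corpus size `TOY_TOTAL_TOKENS`, and the helper names `zipf_frequency` and `pick_simplest` are all hypothetical. The Zipf value is computed as the base-10 logarithm of a word's occurrences per billion tokens.

```python
import math

# Hypothetical toy counts and corpus size, for illustration only; a real
# system would read these from a large word-frequency list.
TOY_COUNTS = {"use": 50_000, "employ": 2_000, "utilize": 800, "apply": 9_000}
TOY_TOTAL_TOKENS = 10_000_000


def zipf_frequency(word, counts=TOY_COUNTS, total=TOY_TOTAL_TOKENS):
    """Return log10 of occurrences per billion tokens, or 0.0 if unseen."""
    count = counts.get(word.lower(), 0)
    if count == 0:
        return 0.0
    return math.log10(count / total * 1e9)


def pick_simplest(complex_word, candidates):
    """Return the candidate with the highest Zipf value.

    The original word is excluded from the pool; if nothing else remains,
    the original word is returned unchanged.
    """
    pool = [c for c in candidates if c.lower() != complex_word.lower()]
    if not pool:
        return complex_word
    return max(pool, key=zipf_frequency)


# Stand-in for the candidates a masked language model might propose.
candidates = ["employ", "use", "apply", "utilize"]
ranked = sorted(candidates, key=zipf_frequency, reverse=True)
best = pick_simplest("utilize", candidates)
```

With these toy counts, `ranked` orders the candidates from most to least frequent and `best` is the replacement chosen for "utilize". Higher Zipf values are used here as a proxy for simpler, more familiar words.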
Master's thesis for: Master in Intelligent Interactive Systems
Supervisor: Horacio Saggion