Pimienta Castillo, Jorge S. 2021-12-15T12:31:56Z 2021-12-15T12:31:56Z 2021-09
dc.description Master's thesis for: Master in Intelligent Interactive Systems
dc.description Tutor: Horacio Saggion
dc.description.abstract This report describes, implements, and evaluates a strategy for text simplification, namely Lexical Simplification, which aims to reduce the complexity of some words in a sentence. The process has two main steps: a first module identifies the complex elements, and a second module replaces those elements with simpler variants. For the first module, the system uses three datasets with human annotations in different languages (English, Spanish, and German) to train a classifier that detects complex words. For the second module, a pre-trained model for word prediction (BERT) generates the candidates, which are sorted by Zipf frequency so that the one with the highest value is selected. Finally, the complete system is evaluated using a test dataset and a survey designed to collect human annotations and perceptions of Fluency, Meaning, and Simplicity.
dc.format.mimetype application/pdf
dc.language.iso eng
dc.rights Attribution-NonCommercial-NoDerivs 4.0 International
dc.title Multilingual lexical simplification
dc.type info:eu-repo/semantics/masterThesis
dc.subject.keyword Complex word identification
dc.subject.keyword Masked language model
dc.subject.keyword Lexical simplification
dc.subject.keyword Word frequency
dc.subject.keyword Model evaluation
dc.rights.accessRights info:eu-repo/semantics/openAccess
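
The candidate-ranking step described in the abstract (sort substitution candidates by Zipf frequency, then take the highest) can be illustrated with a minimal sketch. This is not the thesis implementation: the frequency table `TOY_FREQ_PER_MILLION`, the function names `zipf` and `pick_simplest`, and the choice to skip the original word are all assumptions made for illustration. In the described system the candidates would come from a masked language model such as BERT; here they are passed in as a plain list. The Zipf value is computed as the base-10 logarithm of a word's occurrences per billion words.

```python
import math

# Hypothetical per-million-word frequencies (illustrative values only);
# a real system would load these from a large corpus for each language.
TOY_FREQ_PER_MILLION = {
    "help": 500.0,
    "assist": 20.0,
    "aid": 40.0,
    "facilitate": 5.0,
}


def zipf(word, table=TOY_FREQ_PER_MILLION):
    """Return the Zipf value, log10(occurrences per billion words).

    Words missing from the table get 0.0 so they rank last.
    """
    per_million = table.get(word.lower(), 0.0)
    if per_million <= 0:
        return 0.0
    return math.log10(per_million * 1000.0)


def pick_simplest(candidates, original):
    """Sort candidates by Zipf value, highest first, and return the top one.

    Assumption: the original complex word is excluded from the pool.
    If no candidate remains, the original word is returned unchanged.
    """
    pool = [c for c in candidates if c.lower() != original.lower()]
    if not pool:
        return original
    ranked = sorted(pool, key=zipf, reverse=True)
    return ranked[0]


if __name__ == "__main__":
    # Candidates as a masked language model might propose them (made up here).
    print(pick_simplest(["assist", "help", "aid"], "facilitate"))
```

Ranking by frequency alone treats the most common candidate as the simplest one; it does not check whether that candidate preserves the meaning of the sentence, which is why the abstract pairs the system with a human survey on Fluency, Meaning, and Simplicity.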

