A novel Spanish dataset for financial education text simplification targeting visually impaired individuals
- dc.contributor.author Pérez-Rojas, Nelson
- dc.contributor.author Calderón Ramírez, Saúl
- dc.contributor.author Solís, Martín
- dc.contributor.author Romero-Sandoval, Mario Alberto
- dc.contributor.author Arias-Monge, Monica
- dc.contributor.author Saggion, Horacio
- dc.date.accessioned 2025-07-16T07:21:39Z
- dc.date.available 2025-07-16T07:21:39Z
- dc.date.issued 2025
- dc.description.abstract Automatic Text Simplification (ATS) is a crucial task in natural language processing, aimed at making texts more comprehensible, particularly for specific groups such as individuals with visual impairments. One of the primary challenges in developing models for ATS is the scarcity of data, especially in Spanish. This manuscript introduces a novel dataset tailored for Spanish speakers with visual impairments, consisting of 5,314 pairs of original and simplified sentences created using established simplification rules. Additionally, we evaluate the feasibility of augmenting this dataset using large language models such as Generative Pre-trained Transformer 3 (GPT-3), TUNER, and Multilingual T5 (mT5). We compare the simplifications generated by these models with our dataset to assess their effectiveness in data augmentation. The characteristics of our dataset and the findings from these comparisons are discussed in detail. The dataset is publicly available on Hugging Face at https://huggingface.co/datasets/saul1917/FEINA.
- dc.description.sponsorship The work of Horacio Saggion was supported in part by the Maria de Maeztu Units of Excellence Program, funded by MCIN/AEI/10.13039/501100011033 under Grant CEX2021-001195-M; and in part by European Union’s Horizon Europe Research and Innovation Program through the iDEM Project under Grant 101132431.
- dc.format.mimetype application/pdf
- dc.identifier.citation Pérez-Rojas N, Calderon-Ramirez S, Solís M, Romero-Sandoval MA, Arias-Monge M, Saggion H. A novel Spanish dataset for financial education text simplification targeting visually impaired individuals. IEEE Access. 2025;13:87472-84. DOI: 10.1109/access.2025.3568693
- dc.identifier.doi http://dx.doi.org/10.1109/access.2025.3568693
- dc.identifier.issn 2169-3536
- dc.identifier.uri http://hdl.handle.net/10230/70913
- dc.language.iso eng
- dc.publisher Institute of Electrical and Electronics Engineers (IEEE)
- dc.relation.ispartof IEEE Access. 2025;13:87472-84
- dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/101132431
- dc.relation.projectID info:eu-repo/grantAgreement/ES/2PE/CEX2021-001195-M
- dc.rights © 2025 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License.
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.rights.uri http://creativecommons.org/licenses/by/4.0/
- dc.subject.keyword Complexity theory
- dc.subject.keyword Measurement
- dc.subject.keyword Standards
- dc.subject.keyword Multilingual
- dc.subject.keyword Manuals
- dc.subject.keyword Guidelines
- dc.subject.keyword Benchmark testing
- dc.subject.keyword Annotations
- dc.subject.keyword Visualization
- dc.subject.keyword Tuners
- dc.title A novel Spanish dataset for financial education text simplification targeting visually impaired individuals
- dc.type info:eu-repo/semantics/article
- dc.type.version info:eu-repo/semantics/publishedVersion