Lexical complexity prediction and lexical simplification for Catalan and Spanish: resource creation, quality assessment, and ethical considerations

dc.contributor.author: Saggion, Horacio
dc.contributor.author: Bott, Stefan Markus
dc.contributor.author: Szasz, Sandra
dc.contributor.author: Pérez, Nelson
dc.contributor.author: Calderón Ramírez, Saúl
dc.contributor.author: Solís, Martín
dc.date.accessioned: 2024-11-28T08:31:44Z
dc.date.available: 2024-11-28T08:31:44Z
dc.date.issued: 2024
dc.description: Paper presented at the 3rd Workshop on Text Simplification, Accessibility and Readability (TSAR 2024), held in Miami (USA) on 15 November 2024.
dc.description.abstract: Automatic lexical simplification is the task of substituting lexical items that may be unfamiliar and difficult to understand with easier and more common words. This paper presents the description and analysis of two novel datasets for lexical simplification in Spanish and Catalan. These datasets represent the first of their kind in Catalan and a substantial addition to the sparse data available for automatic lexical simplification in Spanish. Specifically, the Spanish data is the first for that language to include scalar ratings of the understanding difficulty of lexical items. In addition, we present a detailed analysis assessing the appropriateness and ethical dimensions of the data for the lexical simplification task.
dc.format.mimetype: application/pdf
dc.identifier.citation: Saggion H, Bott S, Szasz S, Pérez N, Calderón S, Solís M. Lexical complexity prediction and lexical simplification for Catalan and Spanish: resource creation, quality assessment, and ethical considerations. In: Shardlow M, Saggion H, Alva-Manchego F, Zampieri M, North K, Štajner S, Stodden R, editors. Proceedings of the third Workshop on Text Simplification, Accessibility and Readability (TSAR 2024); 2024 Nov 15; Miami, USA. Kerrville: Association for Computational Linguistics; 2024. p. 82-94.
dc.identifier.uri: http://hdl.handle.net/10230/68853
dc.language.iso: eng
dc.publisher: ACL (Association for Computational Linguistics)
dc.relation.ispartof: In: Shardlow M, Saggion H, Alva-Manchego F, Zampieri M, North K, Štajner S, Stodden R, editors. Proceedings of the third Workshop on Text Simplification, Accessibility and Readability (TSAR 2024); 2024 Nov 15; Miami, USA. Kerrville: Association for Computational Linguistics; 2024. p. 82-94.
dc.relation.projectID: info:eu-repo/grantAgreement/EC/HE/101132431
dc.rights: © ACL, Creative Commons Attribution 4.0 License
dc.rights.accessRights: info:eu-repo/semantics/openAccess
dc.subject.keyword: Automatic lexical simplification
dc.subject.keyword: Spanish lexicon
dc.subject.keyword: Catalan lexicon
dc.title: Lexical complexity prediction and lexical simplification for Catalan and Spanish: resource creation, quality assessment, and ethical considerations
dc.type: info:eu-repo/semantics/conferenceObject
dc.type.version: info:eu-repo/semantics/publishedVersion

Files

Original bundle

Name: Saggion_TSAR_lexi.pdf
Size: 203.02 KB
Format: Adobe Portable Document Format