A deeper exploration of the standard PB-SMT approach to text simplification and its evaluation

  • dc.contributor.author Štajner, Sanja
  • dc.contributor.author Béchara, Hannah
  • dc.contributor.author Saggion, Horacio
  • dc.date.accessioned 2018-11-06T18:04:16Z
  • dc.date.available 2018-11-06T18:04:16Z
  • dc.date.issued 2015
  • dc.description Paper presented at: the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, 26-31 July 2015, Beijing, China.
  • dc.description.abstract In the last few years, there has been a growing number of studies addressing the Text Simplification (TS) task as a monolingual machine translation (MT) problem which translates from ‘original’ to ‘simple’ language. Motivated by those results, we investigate the influence of quality vs. quantity of the training data on the effectiveness of such an MT approach to text simplification. We conduct 40 experiments on the aligned sentences from English Wikipedia and Simple English Wikipedia, controlling for: (1) the similarity between the original and simplified sentences in the training and development datasets, and (2) the sizes of those datasets. The results suggest that in the standard PB-SMT approach to text simplification the quality of the datasets has a greater impact on the system performance. Additionally, we point out several important differences between cross-lingual MT and monolingual MT used in text simplification, and show that BLEU is not a good measure of system performance in the text simplification task.
  • dc.description.sponsorship The research described in this paper was partially funded by the project SKATER-UPFTALN (TIN2012-38584-C06-03), Ministerio de Economía y Competitividad, Secretaría de Estado de Investigación, Desarrollo e Innovación, Spain, and the project ABLE-TO-INCLUDE (CIP-ICTPSP-2013-7/621055). Hannah Béchara is supported by the People Programme (Marie Curie Actions) of the European Union’s Seventh Framework Programme FP7/2007-2013/ under REA grant agreement no. 31747.
  • dc.format.mimetype application/pdf
  • dc.identifier.citation Štajner S, Béchara H, Saggion H. A deeper exploration of the standard PB-SMT approach to text simplification and its evaluation. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Short Papers); 2015 Jul 26-31; Beijing, China. Stroudsburg: ACL; 2015. p. 823-8.
  • dc.identifier.isbn 978-194164373-0
  • dc.identifier.uri http://hdl.handle.net/10230/35710
  • dc.language.iso eng
  • dc.publisher ACL (Association for Computational Linguistics)
  • dc.relation.ispartof Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Short Papers); 2015 Jul 26-31; Beijing, China. Stroudsburg: ACL; 2015. p. 823-8.
  • dc.relation.projectID info:eu-repo/grantAgreement/ES/3PN/TIN2012-38584-C06-03
  • dc.rights © ACL, Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.title A deeper exploration of the standard PB-SMT approach to text simplification and its evaluation
  • dc.type info:eu-repo/semantics/conferenceObject
  • dc.type.version info:eu-repo/semantics/publishedVersion