dc.contributor.author |
Štajner, Sanja |
dc.contributor.author |
Béchara, Hannah |
dc.contributor.author |
Saggion, Horacio |
dc.date.accessioned |
2018-11-06T18:04:16Z |
dc.date.available |
2018-11-06T18:04:16Z |
dc.date.issued |
2015 |
dc.identifier.citation |
Štajner S, Béchara H, Saggion H. A deeper exploration of the standard PB-SMT approach to text simplification and its evaluation. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Short Papers); 2015 Jul 26-31; Beijing, China. Stroudsburg: ACL; 2015. p. 823-8. |
dc.identifier.isbn |
978-194164373-0 |
dc.identifier.uri |
http://hdl.handle.net/10230/35710 |
dc.description |
Paper presented at: the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, 26-31 July 2015, Beijing, China. |
dc.description.abstract |
In the last few years, a growing number of studies have addressed the Text Simplification (TS) task as a monolingual machine translation (MT) problem that translates from ‘original’ to ‘simple’ language. Motivated by those results, we investigate the influence of quality vs. quantity of the training data on the effectiveness of such an MT approach to text simplification. We conduct 40 experiments on the aligned sentences from English Wikipedia and Simple English Wikipedia, controlling for: (1) the similarity between the original and simplified sentences in the training and development datasets, and (2) the sizes of those datasets. The results suggest that in the standard PB-SMT approach to text simplification, the quality of the datasets has a greater impact on system performance than their size. Additionally, we point out several important differences between cross-lingual MT and the monolingual MT used in text simplification, and show that BLEU is not a good measure of system performance in the text simplification task. |
dc.description.sponsorship |
The research described in this paper was partially funded by the project SKATER-UPFTALN (TIN2012-38584-C06-03), Ministerio de Economía y Competitividad, Secretaría de Estado de Investigación, Desarrollo e Innovación, Spain, and the project ABLE-TO-INCLUDE (CIP-ICT-PSP-2013-7/621055). Hannah Béchara is supported by the People Programme (Marie Curie Actions) of the European Union’s Seventh Framework Programme FP7/2007-2013/ under REA grant agreement no. 31747. |
dc.format.mimetype |
application/pdf |
dc.language.iso |
eng |
dc.publisher |
ACL (Association for Computational Linguistics) |
dc.relation.ispartof |
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Short Papers); 2015 Jul 26-31; Beijing, China. Stroudsburg: ACL; 2015. p. 823-8. |
dc.rights |
© ACL, Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License |
dc.title |
A deeper exploration of the standard PB-SMT approach to text simplification and its evaluation |
dc.type |
info:eu-repo/semantics/conferenceObject |
dc.relation.projectID |
info:eu-repo/grantAgreement/ES/3PN/TIN2012-38584-C06-03 |
dc.rights.accessRights |
info:eu-repo/semantics/openAccess |
dc.type.version |
info:eu-repo/semantics/publishedVersion |