Statistical & neural MT systems in the motorcycling domain for less frequent language pairs – how do professional post-editors perform?

  • dc.contributor.author Ginovart Cid, Clara
  • dc.date.accessioned 2018-12-14T08:23:44Z
  • dc.date.available 2018-12-14T08:23:44Z
  • dc.date.issued 2018
  • dc.description Paper presented at the 40th edition of the Translating and the Computer Conference (TC40), held in London (United Kingdom), 15-16 November 2018.
  • dc.description.abstract As more language service providers (LSP) are including post-editing (PE) of machine translation (MT) in their workflow, we see how studies on quality evaluation of MT output become more and more important. We report findings from a user study that evaluates three MT engines (two phrase-based and one neural) from French into Spanish and Italian. We describe results from two text types: product description and blog post, both from a motorcycling website that was actually translated by Datawords Datasia. We use task-based evaluation (PE is the task), automatic evaluation metrics (BLEU, edit distance, and HTER) and human evaluation through ranking to establish which system requires less PE effort and we set the basis for a method to decide when an LSP could use MT and how to evaluate the output. Unfortunately, large parallel corpora are unavailable for some language pairs and domains. Motorcycling and the French language are low-resourced, and this represents the main limitation to this user study. It especially affects the performance of the neural model.
  • dc.description.sponsorship Ideas and results presented in this paper are part of Clara Ginovart Cid’s PhD research, conducted at Pompeu Fabra University, under the supervision of Pr. Carme Colominas and Pr. Antoni Oliver (Universitat Oberta de Catalunya), in collaboration with Datawords Datasia, under the supervision of Marina Frattino, supported through the Industrial Doctorate Programme.
  • dc.format.mimetype application/pdf
  • dc.identifier.citation Ginovart Cid C. Statistical & neural MT systems in the motorcycling domain for less frequent language pairs – how do professional post-editors perform?. In: Translating and the Computer 40: proceedings; 2018 Nov 15-16; London, United Kingdom. Geneva: Editions Tradulex; 2018. p. 66-78.
  • dc.identifier.uri http://hdl.handle.net/10230/36089
  • dc.language.iso eng
  • dc.publisher AsLing, The International Association for Advancement in Language Technology
  • dc.relation.ispartof Translating and the Computer 40: proceedings; 2018 Nov 15-16; London, United Kingdom. Geneva: Editions Tradulex; 2018. p. 66-78.
  • dc.rights © AsLing, The International Association for Advancement in Language Technology
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.subject.keyword NMT
  • dc.subject.keyword Post-editing
  • dc.subject.keyword Quality evaluation
  • dc.subject.keyword Machine translation
  • dc.title Statistical & neural MT systems in the motorcycling domain for less frequent language pairs – how do professional post-editors perform?
  • dc.type info:eu-repo/semantics/conferenceObject
  • dc.type.version info:eu-repo/semantics/publishedVersion