Data augmentation with translation memories for desktop machine translation fine-tuning in 3 language pairs
- dc.contributor.author Dogru, Gokhan
- dc.contributor.author Moorkens, Joss
- dc.date.accessioned 2025-01-28T15:20:28Z
- dc.date.available 2025-01-28T15:20:28Z
- dc.date.issued 2024
- dc.description.abstract This study investigates the effect of data augmentation through translation memories on desktop machine translation (MT) fine-tuning in OPUS-CAT. It also assesses the usefulness of desktop MT for professional translators. Engines in three language pairs (English → Turkish, English → Spanish, and English → Catalan) are fine-tuned with corpora of two different sizes. The translation quality of each engine is measured through automatic evaluation metrics (BLEU, chrF2, TER and COMET) and human evaluation metrics (ranking, adequacy and fluency). Overall, the evaluation results indicate promising quality improvements in all three language pairs. They suggest that desktop MT applications such as OPUS-CAT, with engines fine-tuned on custom data on a translator’s desktop, can potentially provide high-quality translations in addition to their advantages of privacy, confidentiality and low use of computational power.
- dc.format.mimetype application/pdf
- dc.identifier.citation Dogru G, Moorkens J. Data Augmentation with Translation Memories for Desktop Machine Translation Fine-tuning in 3 Language Pairs. The Journal of Specialised Translation. 2024;41:149–78. DOI: 10.26034/cm.jostrans.2024.4716
- dc.identifier.doi http://dx.doi.org/10.26034/cm.jostrans.2024.4716
- dc.identifier.issn 1740-357X
- dc.identifier.uri http://hdl.handle.net/10230/69349
- dc.language.iso eng
- dc.publisher Jostrans (Journal of Specialised Translation)
- dc.relation.ispartof The Journal of Specialised Translation. 2024;41:149–78
- dc.rights This work is licensed under a Creative Commons Attribution 4.0 International License.
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.rights.uri http://creativecommons.org/licenses/by/4.0/
- dc.subject.keyword Machine translation fine-tuning
- dc.subject.keyword Domain adaptation
- dc.subject.keyword Desktop machine translation
- dc.subject.keyword Localization
- dc.subject.keyword Parallel corpora
- dc.subject.keyword Professional translators
- dc.subject.keyword Machine translation evaluation
- dc.title Data augmentation with translation memories for desktop machine translation fine-tuning in 3 language pairs
- dc.type info:eu-repo/semantics/article
- dc.type.version info:eu-repo/semantics/publishedVersion