Costa-jussà, Marta R.; Fonollosa, José A Rodriguez; Mariño Acebal, José B.; Poch, Marc; Farrús, Mireia
2016-05-11
2016-05-11
2014
Costa-Jussa MR, Fonollosa JAR, Marino JB, Poch M, Farrus M. A large Spanish-Catalan parallel corpus release for machine translation. Computing and Informatics. 2014;33(4):907-20.
1335-9150
http://hdl.handle.net/10230/26266
We present a large Spanish-Catalan parallel corpus extracted from ten years of the paper edition of a bilingual Catalan newspaper. The produced corpus of 7.5M parallel sentences (around 180M words per language) is useful for many natural language applications. We report excellent results when building a statistical machine translation system trained on this parallel corpus. The Spanish-Catalan corpus is partially available via ELDA (Evaluations and Language Resources Distribution Agency) in catalog number ELRA-W0053.
application/pdf
eng
© Institute of Informatics Slovak Academy of Sciences
A large Spanish-Catalan parallel corpus release for machine translation
info:eu-repo/semantics/article
Catalan-Spanish parallel corpus
Machine translation
info:eu-repo/semantics/openAccess