Boosting the creation of a treebank

  • dc.contributor.author Arias Badia, Blanca
  • dc.contributor.author Bel Rafecas, Núria
  • dc.contributor.author Fomicheva, Marina
  • dc.contributor.author Larrea Mendizabal, Imanol
  • dc.contributor.author Lorente, Mercè
  • dc.contributor.author Marimon, Montserrat
  • dc.contributor.author Milà-Garcia, Alba
  • dc.contributor.author Vivaldi, J. (Jorge), 1952-
  • dc.contributor.author Padró, Muntsa
  • dc.date.accessioned 2021-01-21T09:10:12Z
  • dc.date.available 2021-01-21T09:10:12Z
  • dc.date.issued 2014
  • dc.description Paper presented at the 9th International Conference on Language Resources and Evaluation (LREC'14), held 26-31 May 2014 in Reykjavík, Iceland.
  • dc.description.abstract We present the results of the experiment of bootstrapping a Treebank for Catalan by using a Dependency Parser trained with Spanish sentences. In order to save time and cost, our approach was to profit from the typological similarities between Catalan and Spanish to create a first Catalan data set quickly by (i) automatically annotating with a delexicalized Spanish parser, (ii) manually correcting the parses, and (iii) using the Catalan corrected sentences to train a Catalan parser. The results showed that the number of parsed sentences required to train a Catalan parser is about 1000, which were achieved in 4 months with 2 annotators.
  • dc.description.sponsorship This work was partially supported by the SKATER project (Ministerio de Economía y Competitividad, TIN2012-38584-C06-05).
  • dc.format.mimetype application/pdf
  • dc.identifier.citation Arias B, Bel N, Lorente M, Marimón M, Milà A, Vivaldi J, Padró M, Fomicheva M, Larrea I. Boosting the creation of a treebank. In: Calzolari N, Choukri K, Declerck T, Loftsson H, Maegaard B, Mariani J, Moreno A, Odijk J, Piperidis S, editors. Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC'14); 2014 May 26-31; Reykjavik, Iceland. Paris: European Language Resources Association (ELRA); 2014. p. 775-81.
  • dc.identifier.uri http://hdl.handle.net/10230/46232
  • dc.language.iso eng
  • dc.publisher ELRA (European Language Resources Association)
  • dc.relation.ispartof Calzolari N, Choukri K, Declerck T, Loftsson H, Maegaard B, Mariani J, Moreno A, Odijk J, Piperidis S, editors. Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC'14); 2014 May 26-31; Reykjavik, Iceland. Paris: European Language Resources Association (ELRA); 2014. p. 775-81
  • dc.relation.projectID info:eu-repo/grantAgreement/ES/3PN/TIN2012-38584-C06-05
  • dc.rights Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International License (https://creativecommons.org/licenses/by-nc-sa/3.0/)
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.rights.uri https://creativecommons.org/licenses/by-nc-sa/3.0/
  • dc.subject.keyword Dependency treebank
  • dc.subject.keyword Treebank bootstrapping
  • dc.subject.keyword Less resourced languages
  • dc.title Boosting the creation of a treebank
  • dc.type info:eu-repo/semantics/conferenceObject
  • dc.type.version info:eu-repo/semantics/publishedVersion