dc.contributor.author |
Arias Badia, Blanca |
dc.contributor.author |
Bel Rafecas, Núria |
dc.contributor.author |
Fomicheva, Marina |
dc.contributor.author |
Larrea Mendizabal, Imanol |
dc.contributor.author |
Lorente, Mercè |
dc.contributor.author |
Marimon, Montserrat |
dc.contributor.author |
Milà-Garcia, Alba |
dc.contributor.author |
Vivaldi, J. (Jorge), 1952- |
dc.contributor.author |
Padró, Muntsa |
dc.date.accessioned |
2021-01-21T09:10:12Z |
dc.date.available |
2021-01-21T09:10:12Z |
dc.date.issued |
2014 |
dc.identifier.citation |
Arias B, Bel N, Lorente M, Marimón M, Milà A, Vivaldi J, Padró M, Fomicheva M, Larrea I. Boosting the creation of a treebank. In: Calzolari N, Choukri K, Declerck T, Loftsson H, Maegaard B, Mariani J, Moreno A, Odijk J, Piperidis S, editors. Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC'14); 2014 May 26-31; Reykjavik, Iceland. Paris: European Language Resources Association (ELRA); 2014. p. 775-81. |
dc.identifier.uri |
http://hdl.handle.net/10230/46232 |
dc.description |
Paper presented at the 9th International Conference on Language Resources and Evaluation (LREC'14), held 26-31 May 2014 in Reykjavík, Iceland. |
dc.description.abstract |
We present the results of an experiment in bootstrapping a treebank for Catalan using a dependency parser trained on Spanish sentences. To save time and cost, our approach exploited the typological similarities between Catalan and Spanish to create a first Catalan data set quickly by (i) automatically annotating sentences with a delexicalized Spanish parser, (ii) manually correcting the parses, and (iii) using the corrected Catalan sentences to train a Catalan parser. The results showed that about 1000 parsed sentences are required to train a Catalan parser, and that this number was reached in 4 months with 2 annotators. |
dc.description.sponsorship |
This work was partially supported by the SKATER project (Ministerio de Economía y Competitividad, TIN2012-38584-C06-05). |
dc.format.mimetype |
application/pdf |
dc.language.iso |
eng |
dc.publisher |
ELRA (European Language Resources Association) |
dc.relation.ispartof |
Calzolari N, Choukri K, Declerck T, Loftsson H, Maegaard B, Mariani J, Moreno A, Odijk J, Piperidis S, editors. Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC'14); 2014 May 26-31; Reykjavik, Iceland. Paris: European Language Resources Association (ELRA); 2014. p. 775-81 |
dc.rights |
Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License (https://creativecommons.org/licenses/by-nc-sa/3.0/) |
dc.rights.uri |
https://creativecommons.org/licenses/by-nc-sa/3.0/ |
dc.title |
Boosting the creation of a treebank |
dc.type |
info:eu-repo/semantics/conferenceObject |
dc.subject.keyword |
Dependency treebank |
dc.subject.keyword |
Treebank bootstrapping |
dc.subject.keyword |
Less resourced languages |
dc.relation.projectID |
info:eu-repo/grantAgreement/ES/3PN/TIN2012-38584-C06-05 |
dc.rights.accessRights |
info:eu-repo/semantics/openAccess |
dc.type.version |
info:eu-repo/semantics/publishedVersion |