A multi-layered annotated corpus of scientific papers


  • dc.contributor.author Fisas Elizalde, Beatriz [ca]
  • dc.contributor.author Ronzano, Francesco [ca]
  • dc.contributor.author Saggion, Horacio [ca]
  • dc.date.accessioned 2018-01-30T08:40:45Z
  • dc.date.available 2018-01-30T08:40:45Z
  • dc.date.issued 2016
  • dc.description Paper presented at the Tenth International Conference on Language Resources and Evaluation (LREC 2016), held 23-28 May 2016 in Portorož, Slovenia.
  • dc.description.abstract Scientific literature records the research process with a standardized structure and provides the clues to track the progress in a scientific field. Understanding its internal structure and content is of paramount importance for natural language processing (NLP) technologies. To meet this requirement, we have developed a multi-layered annotated corpus of scientific papers in the domain of Computer Graphics. Sentences are annotated with respect to their role in the argumentative structure of the discourse. The purpose of each citation is specified. Special features of the scientific discourse such as advantages and disadvantages are identified. In addition, a grade is allocated to each sentence according to its relevance for being included in a summary. To the best of our knowledge, this complex, multi-layered collection of annotations and metadata characterizing a set of research papers had never been grouped together before in one corpus and therefore constitutes a newer, richer resource with respect to those currently available in the field. [en]
  • dc.description.sponsorship The research leading to these results has received funding from the European Project Dr. Inventor (FP7-ICT-2013.8.1 - grant agreement no 611383) and is partly supported by the Spanish Ministry of Economy and Competitiveness under the Maria de Maeztu Units of Excellence Programme (MDM-2015-0502).
  • dc.format.mimetype application/pdf
  • dc.identifier.citation Fisas B, Ronzano F, Saggion H. A multi-layered annotated corpus of scientific papers. In: Calzolari N, Choukri K, Declerck T, Goggi S, Grobelnik M, Maegaard B, Mariani J, Mazo H, Moreno A, Odijk J, Piperidis S, editors. LREC 2016. Tenth International Conference on Language Resources and Evaluation; 2016 May 23-28; Portorož, Slovenia. [Paris]: ELRA; 2016. p. 3081-8.
  • dc.identifier.uri http://hdl.handle.net/10230/33778
  • dc.language.iso eng
  • dc.publisher ELRA (European Language Resources Association) [ca]
  • dc.relation.ispartof Calzolari N, Choukri K, Declerck T, Goggi S, Grobelnik M, Maegaard B, Mariani J, Mazo H, Moreno A, Odijk J, Piperidis S, editors. LREC 2016. Tenth International Conference on Language Resources and Evaluation; 2016 May 23-28; Portorož, Slovenia. [Paris]: ELRA; 2016. p. 3081-8.
  • dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/611383
  • dc.rights © any ELRA - European Language Resources Association. All rights reserved. The LREC 2016 Proceedings are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.rights.uri http://creativecommons.org/licenses/by-nc/4.0/
  • dc.subject.keyword Multi-layered annotated corpus [en]
  • dc.subject.keyword Scientific discourse [en]
  • dc.subject.keyword Citations [en]
  • dc.subject.keyword Summarization gold standard [en]
  • dc.title A multi-layered annotated corpus of scientific papers [ca]
  • dc.type info:eu-repo/semantics/conferenceObject
  • dc.type.version info:eu-repo/semantics/publishedVersion