SICK through the SemEval glasses: lesson learned from the evaluation of compositional distributional semantic models on full sentences through semantic relatedness and textual entailment

Bentivogli, Luisa; Bernardi, Raffaella; Marelli, Marco; Menini, Stefano; Baroni, Marco; Zamparelli, Roberto

dc.contributor.author Bentivogli, Luisa
dc.contributor.author Bernardi, Raffaella
dc.contributor.author Marelli, Marco
dc.contributor.author Menini, Stefano
dc.contributor.author Baroni, Marco
dc.contributor.author Zamparelli, Roberto
dc.date.accessioned 2020-12-02T09:07:38Z
dc.date.available 2020-12-02T09:07:38Z
dc.date.issued 2016
dc.description.abstract This paper is an extended description of SemEval-2014 Task 1, the task on the evaluation of Compositional Distributional Semantics Models on full sentences. Systems participating in the task were presented with pairs of sentences and were evaluated on their ability to predict human judgments on (1) semantic relatedness and (2) entailment. Training and testing data were subsets of the SICK (Sentences Involving Compositional Knowledge) data set. SICK was developed with the aim of providing a proper benchmark to evaluate compositional semantic systems, though task participation was open to systems based on any approach. Taking advantage of the SemEval experience, in this paper we analyze the SICK data set, in order to evaluate the extent to which it meets its design goal and to shed light on the linguistic phenomena that are still challenging for state-of-the-art computational semantic systems. Qualitative and quantitative error analyses show that many systems are quite sensitive to changes in the proportion of sentence pair types, and degrade in the presence of additional lexico-syntactic complexities which do not affect human judgements. More compositional systems seem to perform better when the task proportions are changed, but the effect needs further confirmation.
dc.description.sponsorship We thank the creators of the ImageFlickr, MSR-Video, and SemEval-2012 STS data sets for granting us permission to use their data for the task. The University of Trento authors were supported by ERC 2011 Starting Independent Research Grant No. 283554 (COMPOSES).
dc.format.mimetype application/pdf
dc.identifier.citation Bentivogli L, Bernardi R, Marelli M, Menini S, Baroni M, Zamparelli R. SICK through the SemEval glasses: lessons learned from the evaluation of compositional distributional semantic models on full sentences through semantic relatedness and textual entailment. Lang Resour Eval. 2016 Jan 11;50:95-124. DOI: 10.1007/s10579-015-9332-5
dc.identifier.doi http://dx.doi.org/10.1007/s10579-015-9332-5
dc.identifier.issn 1574-020X
dc.identifier.uri http://hdl.handle.net/10230/45932
dc.language.iso eng
dc.publisher Springer
dc.relation.ispartof Language Resources and Evaluation. 2016 Jan 11;50:95-124
dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/283554
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.subject.keyword Compositionality
dc.subject.keyword Computational semantics
dc.subject.keyword Distributional semantics models
dc.title SICK through the SemEval glasses: lesson learned from the evaluation of compositional distributional semantic models on full sentences through semantic relatedness and textual entailment
dc.type info:eu-repo/semantics/article
dc.type.version info:eu-repo/semantics/acceptedVersion

Collections

Articles (Departament de Traducció i Ciències del Llenguatge)
Documents OpenAIRE (Open Access Infrastructure for Research in Europe)