Welcome to the UPF Digital Repository

Multilabel prototype generation for data reduction in k-nearest neighbour classification

dc.contributor.author Valero-Mas, Jose J.
dc.contributor.author Gallego, Antonio Javier
dc.contributor.author Alonso-Jiménez, Pablo
dc.contributor.author Serra, Xavier
dc.date.accessioned 2023-03-14T07:11:35Z
dc.date.available 2023-03-14T07:11:35Z
dc.date.issued 2023
dc.identifier.citation Valero-Mas JJ, Gallego AJ, Alonso-Jiménez P, Serra X. Multilabel prototype generation for data reduction in k-nearest neighbour classification. Pattern Recognition. 2023 Mar;135:109190. DOI: 10.1016/j.patcog.2022.109190
dc.identifier.issn 0031-3203
dc.identifier.uri http://hdl.handle.net/10230/56207
dc.description.abstract Prototype Generation (PG) methods are typically considered for improving the efficiency of the k-Nearest Neighbour (kNN) classifier when tackling large corpora. Such approaches aim at generating a reduced version of the corpus without decreasing the classification performance when compared to the initial set. Despite their wide application in multiclass scenarios, very few works have proposed PG methods for the multilabel space. In this regard, this work presents the novel adaptation of four multiclass PG strategies to the multilabel case. These proposals are evaluated with three multilabel kNN-based classifiers, 12 corpora comprising a varied range of domains and corpus sizes, and different noise scenarios artificially induced in the data. The results obtained show that the proposed adaptations significantly improve, in terms of both efficiency and classification performance, on the only reference multilabel PG work in the literature as well as on the case in which no PG method is applied, while also presenting statistically superior robustness in noisy scenarios. Moreover, these novel PG strategies allow prioritising either the efficiency or the efficacy criterion through their configuration depending on the target scenario, hence covering a wide area in the solution space not previously filled by other works.
dc.description.sponsorship This research was partially funded by the Spanish Ministerio de Ciencia e Innovación through the MultiScore (PID2020-118447RA-I00) and DOREMI (TED2021-132103A-I00) projects. The first author is supported by grant APOSTD/2020/256 from “Programa I+D+i de la Generalitat Valenciana”.
dc.format.mimetype application/pdf
dc.language.iso eng
dc.publisher Elsevier
dc.relation.ispartof Pattern Recognition. 2023 Mar;135:109190
dc.rights © 2022 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)
dc.rights.uri http://creativecommons.org/licenses/by/4.0/
dc.title Multilabel prototype generation for data reduction in k-nearest neighbour classification
dc.type info:eu-repo/semantics/article
dc.identifier.doi http://dx.doi.org/10.1016/j.patcog.2022.109190
dc.subject.keyword Multilabel classification
dc.subject.keyword Prototype generation
dc.subject.keyword Efficient
dc.subject.keyword kNN
dc.relation.projectID info:eu-repo/grantAgreement/ES/2PE/PID2020-118447RA-I00
dc.relation.projectID info:eu-repo/grantAgreement/ES/2PE/TED2021-132103A-I00
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.type.version info:eu-repo/semantics/publishedVersion
