The relevance of non-human errors in machine learning

  • dc.contributor.author Baeza-Yates, Ricardo
  • dc.contributor.author Estévez-Almenzar, Marina
  • dc.date.accessioned 2023-02-23T07:10:54Z
  • dc.date.available 2023-02-23T07:10:54Z
  • dc.date.issued 2022
  • dc.description Paper presented at the Workshop on AI Evaluation Beyond Metrics (EBeM 2022), held on 25 July 2022 in Vienna, Austria.
  • dc.description.abstract The current practice of focusing the evaluation of a machine learning model on validation accuracy has lately been questioned and described as a systematic habit that ignores important aspects of developing a solution to a problem. This lack of diversity in evaluation procedures reinforces the difference between human and machine perception of which data features are relevant, and deepens the misalignment between the fidelity of current benchmarks and human-centered tasks. Hence, we argue that there is an urgent need to pay more attention to the search for metrics that, given a task, take the most humanly relevant aspects into account. We propose to base this search on the errors made by the machine and the consequent risks of the machine's logic drifting away from human logic. If we identify these errors and organize them hierarchically according to this logic, we can use this information to provide a reliable evaluation of machine learning models and to improve the alignment between training processes and the different considerations humans make when solving a problem and analyzing outcomes. In this context we define the concept of non-human errors, exemplifying it with an image classification task and discussing its implications. (An illustrative sketch of the hierarchical-error idea follows this record.)
  • dc.format.mimetype application/pdf
  • dc.identifier.citation Baeza-Yates R, Estévez-Almenzar M. The relevance of non-human errors in machine learning. In: Hernández-Orallo J, Cheke L, Tenenbaum J, Ullman T, Martínez-Plumed F, Rutar D, Burden J, Burnell R, Schellaert W, editors. Proceedings of the Workshop on AI Evaluation Beyond Metrics (EBeM 2022); 2022 Jul 25; Vienna, Austria. [Aachen]: CEUR-WS; 2022. [6 p.].
  • dc.identifier.issn 1613-0073
  • dc.identifier.uri http://hdl.handle.net/10230/55884
  • dc.language.iso eng
  • dc.publisher CEUR Workshop Proceedings
  • dc.relation.ispartof Hernández-Orallo J, Cheke L, Tenenbaum J, Ullman T, Martínez-Plumed F, Rutar D, Burden J, Burnell R, Schellaert W, editors. Proceedings of the Workshop on AI Evaluation Beyond Metrics (EBeM 2022); 2022 Jul 25; Vienna, Austria. [Aachen]: CEUR-WS; 2022. [6 p.].
  • dc.relation.isreferencedby https://github.com/ealmenzar/non-human-errors
  • dc.rights © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.rights.uri https://creativecommons.org/licenses/by/4.0
  • dc.subject.keyword Machine Learning
  • dc.subject.keyword Responsible AI
  • dc.subject.keyword Evaluation
  • dc.subject.keyword Error Analysis
  • dc.subject.keyword Non-Human Errors
  • dc.title The relevance of non-human errors in machine learning
  • dc.type info:eu-repo/semantics/conferenceObject
  • dc.type.version info:eu-repo/semantics/publishedVersion
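
The abstract proposes organizing a model's errors hierarchically so that evaluation can weight mistakes by their human relevance rather than by plain accuracy alone. Below is a minimal, hypothetical Python sketch of that idea for an image classification task: each misclassification is scored by its distance in a toy class hierarchy, so a near-miss ("husky" vs. "beagle") costs little while a non-human error ("tabby" vs. "sedan") costs a lot. The PARENT map, function names, and weighting are illustrative assumptions of this sketch, not the authors' implementation; for their actual code, see the repository linked in dc.relation.isreferencedby above.

```python
# Illustrative sketch (not the paper's implementation): weighting image-
# classification errors by hierarchical distance between true and predicted
# labels. Large distances approximate "non-human errors", i.e. mistakes a
# human would be unlikely to make.

# Hypothetical toy hierarchy: each class maps to its parent category.
PARENT = {
    "husky": "dog", "beagle": "dog", "dog": "animal",
    "tabby": "cat", "cat": "animal", "animal": "entity",
    "sedan": "car", "car": "vehicle", "vehicle": "entity",
}

def ancestors(label):
    """Return the chain [label, parent, ..., root]."""
    chain = [label]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def tree_distance(a, b):
    """Edges between two labels via their lowest common ancestor."""
    pa, pb = ancestors(a), ancestors(b)
    common = next(x for x in pa if x in pb)  # lowest shared ancestor
    return pa.index(common) + pb.index(common)

def severity_weighted_error(y_true, y_pred):
    """Mean hierarchical distance of predictions; 0 means all correct.
    Unlike plain accuracy, confusing 'husky' with 'beagle' costs far
    less than confusing 'tabby' with 'sedan'."""
    dists = [tree_distance(t, p) for t, p in zip(y_true, y_pred)]
    return sum(dists) / len(dists)

if __name__ == "__main__":
    y_true = ["husky", "tabby", "sedan"]
    y_pred = ["beagle", "sedan", "sedan"]  # a near-miss and a non-human error
    print(severity_weighted_error(y_true, y_pred))  # (2 + 6 + 0) / 3 = 2.67
```

A real evaluation would replace the toy PARENT map with a semantic hierarchy such as WordNet and might normalize distances, but the point of the sketch stands: two models with identical accuracy can differ sharply on a severity-weighted score that penalizes non-human errors.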