A new AI evaluation cosmos: ready to play the game?

Hernández-Orallo, José; Baroni, Marco; Bieger, Jordi; Chmait, Nader; Dowe, David L.; Hofmann, Katja; Martínez Plumed, Fernando; Strannegård, Claes; Thórisson, Kristinn R.

dc.contributor.author Hernández-Orallo, José
dc.contributor.author Baroni, Marco
dc.contributor.author Bieger, Jordi
dc.contributor.author Chmait, Nader
dc.contributor.author Dowe, David L.
dc.contributor.author Hofmann, Katja
dc.contributor.author Martínez Plumed, Fernando
dc.contributor.author Strannegård, Claes
dc.contributor.author Thórisson, Kristinn R.
dc.date.accessioned 2021-05-19T09:50:48Z
dc.date.available 2021-05-19T09:50:48Z
dc.date.issued 2017
dc.description.abstract We report on a series of new platforms and events dealing with AI evaluation that may change the way in which AI systems are compared and their progress is measured. The introduction of a more diverse and challenging set of tasks in these platforms can feed AI research in the years to come, shaping the notion of success and the directions of the field. However, the playground of tasks and challenges presented there may misdirect the field without some meaningful structure and systematic guidelines for its organization and use. Anticipating this issue, we also report on several initiatives and workshops that are putting the focus on analyzing the similarity and dependencies between tasks, their difficulty, what capabilities they really measure and – ultimately – on elaborating new concepts and tools that can arrange tasks and benchmarks into a meaningful taxonomy.
dc.format.mimetype application/pdf
dc.identifier.citation Hernandez-Orallo J, Baroni M, Bieger J, Chmait N, Dowe DL, Hofmann K, Martinez-Plumed F, Strannegård C, Thórisson KR. A new AI evaluation cosmos: ready to play the game?. AI Mag. 2017 Oct 2;38(3):66-9. DOI: 10.1609/aimag.v38i3.2748
dc.identifier.doi http://dx.doi.org/10.1609/aimag.v38i3.2748
dc.identifier.issn 0738-4602
dc.identifier.uri http://hdl.handle.net/10230/47604
dc.language.iso eng
dc.publisher Association for the Advancement of Artificial Intelligence (AAAI)
dc.relation.ispartof AI Magazine. 2017 Oct 2;38(3):66-9
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.title A new AI evaluation cosmos: ready to play the game?
dc.type info:eu-repo/semantics/article
dc.type.version info:eu-repo/semantics/acceptedVersion

Collections

Articles (Departament de Traducció i Ciències del Llenguatge)