Learning general policies from small examples without supervision

Francès, Guillem; Bonet, Blai; Geffner, Héctor

dc.contributor.author Francès, Guillem
dc.contributor.author Bonet, Blai
dc.contributor.author Geffner, Héctor
dc.date.accessioned 2023-02-07T13:21:49Z
dc.date.available 2023-02-07T13:21:49Z
dc.date.issued 2021
dc.description.abstract Generalized planning is concerned with the computation of general policies that solve multiple instances of a planning domain all at once. It has been recently shown that these policies can be computed in two steps: first, a suitable abstraction in the form of a qualitative numerical planning problem (QNP) is learned from sample plans, then the general policies are obtained from the learned QNP using a planner. In this work, we introduce an alternative approach for computing more expressive general policies which does not require sample plans or a QNP planner. The new formulation is very simple and can be cast in terms that are more standard in machine learning: a large but finite pool of features is defined from the predicates in the planning examples using a general grammar, and a small subset of features is sought for separating “good” from “bad” state transitions, and goals from non-goals. The problems of finding such a “separating surface” while labeling the transitions as “good” or “bad” are jointly addressed as a single combinatorial optimization problem expressed as a Weighted Max-SAT problem. The advantage of looking for the simplest policy in the given feature space that solves the given examples, possibly non-optimally, is that many domains have no general, compact policies that are optimal. The approach yields general policies for a number of benchmark domains.
dc.description.sponsorship This research is partially funded by an ERC Advanced Grant (No 885107), by grant TIN-2015-67959-P from MINECO, Spain, and by the Knut and Alice Wallenberg (KAW) Foundation through the WASP program. H. Geffner is also a Wallenberg Guest Professor at Linköping University, Sweden. G. Francès is partially supported by grant IJC2019-039276-I from MICINN, Spain.
dc.format.mimetype application/pdf
dc.identifier.citation Francès G, Bonet B, Geffner H. Learning general policies from small examples without supervision. Proc Conf AAAI Artif Intell. 2021;35(13):11801-8. DOI: 10.1609/aaai.v35i13.17402
dc.identifier.doi http://dx.doi.org/10.1609/aaai.v35i13.17402
dc.identifier.issn 2159-5399
dc.identifier.uri http://hdl.handle.net/10230/55664
dc.language.iso eng
dc.publisher Association for the Advancement of Artificial Intelligence (AAAI)
dc.relation.ispartof Proceedings of the AAAI Conference on Artificial Intelligence. 2021;35(13):11801-8.
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/885107
dc.relation.projectID info:eu-repo/grantAgreement/ES/1PE/TIN-2015-67959-P
dc.relation.projectID info:eu-repo/grantAgreement/ES/2PE/IJC2019-039276-I
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.subject.other Planning
dc.title Learning general policies from small examples without supervision
dc.type info:eu-repo/semantics/article
dc.type.version info:eu-repo/semantics/acceptedVersion
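
The feature-selection idea in the abstract can be illustrated with a toy sketch. It is not the authors' method: it assumes the "good"/"bad" labels are already given (the paper determines them jointly with the features), and it uses brute-force enumeration as a stand-in for a Weighted Max-SAT solver. The feature names, costs, and transitions are hypothetical. Each transition records how a feature changes ("inc", "dec", "same"), and the search returns the cheapest feature subset under which every good transition differs from every bad one.

```python
from itertools import combinations


def cheapest_separating_features(good, bad, costs):
    """Return (total_cost, features) for the cheapest feature subset that
    distinguishes every good transition from every bad one, or None.

    Brute-force stand-in for a Weighted Max-SAT solver (assumption).
    """
    names = sorted(costs)
    best = None
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            # A subset separates if each good/bad pair differs on some feature.
            separates = all(
                any(g[f] != b[f] for f in subset)
                for g in good
                for b in bad
            )
            if separates:
                cost = sum(costs[f] for f in subset)
                if best is None or cost < best[0]:
                    best = (cost, subset)
    return best


# Hypothetical feature pool with complexity costs (lower means simpler).
costs = {"n_clear": 2, "holding": 1, "n_ontable": 3}

# Hypothetical transitions, described by the qualitative change of each feature.
good = [
    {"n_clear": "inc", "holding": "same", "n_ontable": "inc"},
    {"n_clear": "same", "holding": "dec", "n_ontable": "same"},
]
bad = [
    {"n_clear": "dec", "holding": "same", "n_ontable": "same"},
    {"n_clear": "same", "holding": "inc", "n_ontable": "same"},
]

result = cheapest_separating_features(good, bad, costs)
print(result)
```

In this toy input no single feature separates the two sets, so the search settles on the cheapest pair. Enumeration is exponential in the pool size, which is why a large feature pool calls for a dedicated combinatorial solver instead.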

Collections

Articles (Departament de Tecnologies de la Informació i les Comunicacions)
Documents OpenAIRE (Open Access Infrastructure for Research in Europe)