Adaptive smoothing for path integral control
- dc.contributor.author Thalmeier, Dominik
- dc.contributor.author Kappen, Hilbert J.
- dc.contributor.author Totaro, Simone
- dc.contributor.author Gómez, Vicenç
- dc.date.accessioned 2021-03-09T09:36:51Z
- dc.date.available 2021-03-09T09:36:51Z
- dc.date.issued 2020
- dc.description.abstract In Path Integral control problems, a representation of an optimally controlled dynamical system can be formally computed and can serve as a guidepost to learn a parametrized policy. The Path Integral Cross-Entropy (PICE) method tries to exploit this, but is hampered by poor sample efficiency. We propose a model-free algorithm called ASPIC (Adaptive Smoothing of Path Integral Control) that applies an inf-convolution to the cost function to speed up convergence of policy optimization. We identify PICE as the infinite smoothing limit of this technique and show that the sample efficiency problems that PICE suffers from disappear for finite levels of smoothing. For zero smoothing, ASPIC becomes a greedy optimization of the cost, which is the standard approach in current reinforcement learning. ASPIC adapts the smoothness parameter to keep the variance of the gradient estimator at a predefined level, independently of the number of samples. We show analytically and empirically that intermediate levels of smoothing are optimal, which renders the new method superior to both PICE and direct cost optimization.
- dc.format.mimetype application/pdf
- dc.identifier.citation Thalmeier D, Kappen HJ, Totaro S, Gómez V. Adaptive smoothing for path integral control. Journal of Machine Learning Research. 2020 Nov; 21(191): 1-37.
- dc.identifier.issn 1532-4435
- dc.identifier.uri http://hdl.handle.net/10230/46698
- dc.language.iso eng
- dc.publisher Microtome Publishing
- dc.relation.ispartof Journal of Machine Learning Research. 2020 Nov; 21(191): 1-37
- dc.rights © 2020 Dominik Thalmeier, Hilbert J. Kappen, Simone Totaro, Vicenç Gómez. License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v21/18-624.html.
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.rights.uri https://creativecommons.org/licenses/by/4.0/
- dc.subject.keyword Path Integral Control
- dc.subject.keyword Entropy-Regularization
- dc.subject.keyword Cost Smoothing
- dc.title Adaptive smoothing for path integral control
- dc.type info:eu-repo/semantics/article
- dc.type.version info:eu-repo/semantics/publishedVersion