Welcome to the UPF Digital Repository

Optimal control as a graphical model inference problem

dc.contributor.author Kappen, Hilbert J.
dc.contributor.author Gómez, Vicenç
dc.contributor.author Opper, Manfred
dc.date.accessioned 2019-01-22T09:11:02Z
dc.date.available 2019-01-22T09:11:02Z
dc.date.issued 2012
dc.identifier.citation Kappen HJ, Gómez V, Opper M. Optimal control as a graphical model inference problem. Mach Learn. 2012 May;87(2):159-82. DOI: 10.1007/s10994-012-5278-7
dc.identifier.issn 0885-6125
dc.identifier.uri http://hdl.handle.net/10230/36355
dc.description.abstract We reformulate a class of non-linear stochastic optimal control problems introduced by Todorov (in Advances in Neural Information Processing Systems, vol. 19, pp. 1369-1376, 2007) as a Kullback-Leibler (KL) minimization problem. As a result, the optimal control computation reduces to an inference computation and approximate inference methods can be applied to efficiently compute approximate optimal controls. We show how this KL control theory contains the path integral control method as a special case. We provide an example of a block stacking task and a multi-agent cooperative game where we demonstrate how approximate inference can be successfully applied to instances that are too complex for exact computation. We discuss the relation of the KL control approach to other inference approaches to control.
dc.format.mimetype application/pdf
dc.language.iso eng
dc.publisher Springer
dc.relation.ispartof Machine Learning. 2012 May;87(2):159-82.
dc.rights © Springer. The final publication is available at Springer via http://dx.doi.org/10.1007/s10994-012-5278-7
dc.title Optimal control as a graphical model inference problem
dc.type info:eu-repo/semantics/article
dc.identifier.doi http://dx.doi.org/10.1007/s10994-012-5278-7
dc.subject.keyword Optimal control
dc.subject.keyword Uncontrolled dynamics
dc.subject.keyword Kullback-Leibler divergence
dc.subject.keyword Graphical model
dc.subject.keyword Approximate inference
dc.subject.keyword Cluster variation method
dc.subject.keyword Belief propagation
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.type.version info:eu-repo/semantics/acceptedVersion
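
The abstract's central step, recasting optimal control as a Kullback-Leibler minimization so that computing the control becomes an inference computation, can be illustrated on a toy discrete problem. The sketch below is not taken from the paper: the function names (`kl_optimal`, `objective`), the three-outcome distribution `q` standing in for the uncontrolled dynamics, and the cost values are all assumptions chosen for illustration. It relies on the standard result that the minimizer of KL(p||q) + E_p[cost] is p*(x) = q(x) exp(-cost(x)) / Z, with minimum value -log Z.

```python
import math


def kl_optimal(q, cost):
    """Minimize KL(p||q) + E_p[cost] over discrete distributions p.

    Hypothetical helper for illustration only. The minimizer is the
    reweighted distribution p*(x) = q(x) * exp(-cost(x)) / Z, and the
    minimum of the objective equals -log Z.
    """
    weights = [qi * math.exp(-ci) for qi, ci in zip(q, cost)]
    z = sum(weights)  # normalizing constant (partition function)
    p_star = [w / z for w in weights]
    return p_star, -math.log(z)


def objective(p, q, cost):
    """Evaluate KL(p||q) + E_p[cost], using the convention 0 * log 0 = 0."""
    return sum(
        pi * (math.log(pi / qi) + ci)
        for pi, qi, ci in zip(p, q, cost)
        if pi > 0
    )


# Assumed toy data: q plays the role of the uncontrolled dynamics,
# cost is an arbitrary state cost over three outcomes.
q = [0.5, 0.3, 0.2]
cost = [2.0, 0.5, 1.0]

p_star, value = kl_optimal(q, cost)
```

In this toy setting, `objective(p_star, q, cost)` agrees with `value`, and leaving the dynamics uncontrolled (`p = q`) gives a higher objective. For problems where the sum defining `Z` is too large to evaluate exactly, the abstract's point is that approximate inference methods can be used to approximate it.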
