Cross-Entropy method for Kullback-Leibler control in multi-agent systems

Cabrero Daniel, Beatriz

dc.contributor.author Cabrero Daniel, Beatriz
dc.date.accessioned 2017-10-27T10:32:43Z
dc.date.available 2017-10-27T10:32:43Z
dc.date.issued 2017-07
dc.description Supervisor: Dr. Vicenç Gómez Cerdà; Co-Supervisor: Dr. Mario Ceresa
dc.description Master's thesis for: Master in Intelligent Interactive Systems
dc.description.abstract We consider the problem of computing optimal control policies in large-scale multi-agent systems, for which the standard approach via the Bellman equation is intractable. Our formulation is based on the Kullback-Leibler control framework, also known as Linearly-Solvable Markov Decision Problems. In this setting, adaptive importance sampling methods have been derived that, when combined with function approximation, can be effective for high-dimensional systems. Our approach iteratively learns an importance sampler from which the optimal control can be extracted, and requires simulating and reweighting agents’ trajectories in the world multiple times. We illustrate our approach through a modified version of the popular stag-hunt game; in this scenario, there is a multiplicity of optimal policies depending on the “temperature” parameter of the environment. The system is built inside Pandora, a multi-agent-based modeling framework and toolbox for parallelization, freeing us from dealing with memory management when running multiple simulations. By using function approximation and assuming a particular factorization of the system dynamics, we are able to scale up our method to problems with M = 12 agents moving in two-dimensional grids of size N = 21×21, improving on existing methods that perform approximate inference on a temporal probabilistic graphical model.
dc.format.mimetype application/pdf
dc.identifier.uri http://hdl.handle.net/10230/33109
dc.language.iso eng
dc.rights Attribution-NonCommercial-NoDerivs 3.0 Spain
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.subject.keyword Agent-based system
dc.subject.keyword Function approximation
dc.subject.keyword Kullback-Leibler divergence
dc.subject.keyword Optimal control
dc.subject.keyword Parallel programming
dc.subject.other Multi-agent systems
dc.subject.other Markov processes
dc.title Cross-Entropy method for Kullback-Leibler control in multi-agent systems
dc.type info:eu-repo/semantics/masterThesis

Collections

Master in Intelligent Interactive Systems. Master Thesis projects