Inverse reinforcement learning with linearly-solvable MDPs for multiple reward functions
Abstract
Linearly-solvable Markov Decision Processes (LMDPs), a subclass of Markov Decision Processes (MDPs) with discrete state space and continuous control space, allow a significant simplification of the inverse reinforcement learning problem: they eliminate the need to solve the forward problem, requiring only the unconstrained optimization of a convex and easily computable log-likelihood. So far, however, this has only been explored in the single-agent, single-reward scenario, where one agent is assumed to exert optimal control under a single fixed reward function. In this work, we aim to exploit the advantages of LMDPs in problem formulation and ease of computation in a multiple-agent, multiple-reward scenario, using non-parametric Bayesian inverse reinforcement learning.
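For context, the simplification referred to above can be made concrete with the standard LMDP likelihood of Dvijotham and Todorov; the notation below ($\bar p$, $v$, $x_t$) is a minimal illustrative sketch, not taken from the thesis itself. Given passive dynamics $\bar p(x' \mid x)$ and a value function $v$, the optimally controlled transition law of an LMDP is

$$u^*(x' \mid x) = \frac{\bar p(x' \mid x)\, e^{-v(x')}}{\sum_{y} \bar p(y \mid x)\, e^{-v(y)}},$$

so the log-likelihood of observed transitions $\{(x_t, x'_t)\}_{t=1}^{T}$,

$$\mathcal{L}(v) = \sum_{t=1}^{T} \Big[ -v(x'_t) - \log \sum_{y} \bar p(y \mid x_t)\, e^{-v(y)} \Big] + \text{const},$$

is concave in $v$ (a sum of linear terms and negated log-sum-exps), so the value function, and from it the reward, can be recovered by unconstrained convex optimization without ever solving the forward control problem.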
Description
Master's thesis for the Master in Intelligent Interactive Systems. Supervisors: Anders Jonsson, Vicenç Gómez, Mario Ceresa