Inverse reinforcement learning with linearly-solvable MDPs for multiple reward functions

Description

  • Abstract

    Linearly-solvable Markov Decision Processes (LMDPs), a subclass of Markov Decision Processes (MDPs) with discrete state space and continuous control space, allow a significant simplification of the inverse reinforcement learning problem: they eliminate the need to solve the forward problem, requiring only the unconstrained optimization of a convex and easily computable log-likelihood (see the sketch below). This, however, has only been explored in the single-reward, single-agent scenario, where one agent is assumed to exert optimal control under a single fixed reward function. In this work, we exploit the formulation and computational advantages of LMDPs in a multiple-agent, multiple-reward scenario, using non-parametric Bayesian inverse reinforcement learning.
  • Description

    Master's thesis for the Master in Intelligent Interactive Systems. Supervisors: Anders Jonsson, Vicenç Gómez, Mario Ceresa
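  • Sketch of the convex-likelihood simplification

    A minimal illustrative sketch, not taken from the thesis itself: under the standard LMDP formulation (Todorov), the optimally controlled dynamics satisfy p*(x'|x) ∝ pbar(x'|x) exp(v(x')), so the log-likelihood of demonstrated transitions is concave in the value function v and can be maximized directly, with no forward problem to solve. The code below illustrates only the single-reward, single-agent case that the abstract says prior work addressed; the passive dynamics P, the toy trajectory, and all function names are hypothetical.

    ```python
    import numpy as np
    from scipy.optimize import minimize
    from scipy.special import logsumexp

    def neg_log_likelihood(v, P, transitions):
        """Negative log-likelihood of observed transitions (i, j) under
        the controlled LMDP dynamics p*(j|i) = pbar(j|i) e^{v(j)} / Z(i)."""
        logP = np.log(P + 1e-300)  # guard against log(0) for impossible moves
        nll, grad = 0.0, np.zeros_like(v)
        for i, j in transitions:
            log_norm = logsumexp(logP[i] + v)       # log sum_l pbar(l|i) e^{v(l)}
            nll -= v[j] - log_norm                  # negate: we minimize
            soft = np.exp(logP[i] + v - log_norm)   # controlled transition probs
            grad[j] -= 1.0                          # d/dv of the data term
            grad += soft                            # d/dv of the normalizer
        return nll, grad

    # Toy example: 4-state passive dynamics and one demonstrated trajectory
    # (entirely made-up data, for illustration only).
    rng = np.random.default_rng(0)
    P = rng.random((4, 4))
    P /= P.sum(axis=1, keepdims=True)
    transitions = [(0, 1), (1, 2), (2, 3), (3, 3)]

    res = minimize(neg_log_likelihood, np.zeros(4), args=(P, transitions),
                   jac=True, method="L-BFGS-B")
    v_hat = res.x - res.x.max()  # v is identified only up to an additive constant
    print("recovered value function:", v_hat)
    ```

    Because adding a constant to v leaves every likelihood term unchanged, the recovered value function is normalized before use. The multiple-reward extension the thesis proposes would, presumably, place a non-parametric Bayesian prior over a set of such value (or reward) functions and assign demonstrations to them, rather than fitting a single v as above.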