Inverse reinforcement learning with linearly-solvable MDPs for multiple reward functions


  • dc.contributor.author Deb, Ahana
  • dc.date.accessioned 2023-10-09T13:40:26Z
  • dc.date.available 2023-10-09T13:40:26Z
  • dc.date.issued 2023-10-09
  • dc.description Master's thesis for the Master in Intelligent Interactive Systems. Supervisors: Anders Jonsson, Vicenç Gómez, Mario Ceresa
  • dc.description.abstract Linearly solvable Markov Decision Processes (LMDPs), a subclass of Markov Decision Processes (MDPs) with a discrete state space and a continuous control space, significantly simplify the inverse reinforcement learning problem: they eliminate the need to solve the forward problem and require only the unconstrained optimization of a convex, easily computable log-likelihood. This, however, has only been explored for the single-reward, single-agent scenario, in which a single agent is assumed to apply optimal control under a single fixed reward function. In this work, we aim to exploit the advantages of LMDPs in problem formulation and ease of computation for a multiple-agent, multiple-reward scenario, using non-parametric Bayesian inverse reinforcement learning.
  • dc.format.mimetype application/pdf
  • dc.identifier.uri http://hdl.handle.net/10230/58063
  • dc.language.iso eng
  • dc.rights Attribution-NonCommercial-NoDerivs 3.0 Spain
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.rights.uri https://creativecommons.org/licenses/by-nc-nd/3.0/es/
  • dc.subject.keyword Linearly solvable Markov Decision Process
  • dc.subject.keyword Inverse Reinforcement Learning
  • dc.subject.keyword Multiple Rewards
  • dc.subject.keyword Non-parametric Bayesian Learning
  • dc.subject.other Linearly solvable Markov Decision Process; Inverse Reinforcement Learning; Multiple Rewards; Non-parametric Bayesian Learning
  • dc.title Inverse reinforcement learning with linearly-solvable MDPs for multiple reward functions
  • dc.type info:eu-repo/semantics/masterThesis