Heuristic search for generalized stochastic shortest path MDPs
Mostra el registre complet Registre parcial de l'ítem
- dc.contributor.author Kolobov, Andrey
- dc.contributor.author Mausam
- dc.contributor.author Weld, Daniel S.
- dc.contributor.author Geffner, Héctor
- dc.date.accessioned 2019-05-30T08:41:19Z
- dc.date.available 2019-05-30T08:41:19Z
- dc.date.issued 2011
- dc.description Comunicació presentada a: ICAPS 2011 celebrat de l'11 al 16 de juny de 2011 a Freiburg, Alemanya.
- dc.description.abstract Research in efficient methods for solving infinite-horizon MDPs has so far concentrated primarily on discounted MDPs and the more general stochastic shortest path problems (SSPs). These are MDPs with 1) an optimal value function V ∗ that is the unique solution of Bellman equation and 2) optimal policies that are the greedy policies w.r.t. V ∗ . This paper’s main contribution is the description of a new class of MDPs, that have well-defined optimal solutions that do not comply with either 1 or 2 above. We call our new class Generalized Stochastic Shortest Path (GSSP) problems. GSSP allows more general reward structure than SSP and subsumes several established MDP types including SSP, positive-bounded, negative, and discounted-reward models. While existing efficient heuristic search algorithms like LAO∗ and LRTDP are not guaranteed to converge to the optimal value function for GSSPs, we present a new heuristic-search-based family of algorithms, FRET (Find, Revise, Eliminate Traps). A preliminary empirical evaluation shows that FRET solves GSSPs much more efficiently than Value Iteration.
- dc.description.sponsorship This work was supported by WRF/TJ Cable Professorship and the following grants: ONR N000140910051, NSF IIS-1016465, TIN2009-10232 (MICINN, Spain).
- dc.format.mimetype application/pdf
- dc.identifier.citation Kolobov A, Mausam, Weld DS, Geffner H. Heuristic search for generalized stochastic shortest path MDPs. In: Proceedings of the Twenty-First International Conference on Automated Planning and Scheduling (ICAPS 2011): 2011 Jun 11-16; Freiburg, Germany. Palo Alto: AAAI Press; 2011. p. 130-7.
- dc.identifier.uri http://hdl.handle.net/10230/41663
- dc.language.iso eng
- dc.publisher Association for the Advancement of Artificial Intelligence (AAAI)
- dc.relation.ispartof Proceedings of the Twenty-First International Conference on Automated Planning and Scheduling (ICAPS 2011): 2011 Jun 11-16; Freiburg, Germany. Palo Alto: AAAI Press; 2011.
- dc.relation.projectID info:eu-repo/grantAgreement/ES/3PN/TIN2009-10232
- dc.rights © 2011, Association for the Advancement of Artificial Intelligence (www.aaai.org)
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.title Heuristic search for generalized stochastic shortest path MDPs
- dc.type info:eu-repo/semantics/conferenceObject
- dc.type.version info:eu-repo/semantics/acceptedVersion