Solving Montezuma's Revenge with Planning and Reinforcement Learning

Enllaç permanent

Descripció

  • Resum

    Traditionally, methods for solving Sequential Decision Processes (SDPs) have not worked well with those that feature sparse feedback. Both planning and reinforcement learning, methods for solving SDPs, have trouble with it. With the rise to prominence of the Arcade Learning Environment (ALE) in the broader research community of sequential decision processes, one SDP featuring sparse feedback has become familiar: the Atari game Montezuma’s Revenge. In this particular game, the great amount of knowledge the human player already possesses, and uses to find rewards, cannot be bridged by blindly exploring in a realistic time. We apply planning and reinforcement learning approaches, combined with domain knowledge, to enable an agent to obtain better scores in this game. We hope that these domain-specific algorithms can inspire better approaches to solve SDPs with sparse feedback in general.
  • Descripció

    Treball de fi de grau en informàtica
    Tutor: Anders Jonsson
  • Mostra el registre complet