Solving Montezuma's Revenge with Planning and Reinforcement Learning
Abstract
Traditionally, methods for solving Sequential Decision Processes (SDPs) have not worked well on problems with sparse feedback. Both planning and reinforcement learning, the two main approaches to solving SDPs, struggle in this setting. With the rise to prominence of the Arcade Learning Environment (ALE) in the broader sequential decision process research community, one SDP featuring sparse feedback has become well known: the Atari game Montezuma's Revenge. In this particular game, the gap between the extensive prior knowledge a human player brings to bear when searching for rewards and what an agent can discover by blind exploration cannot be closed in a realistic amount of time. We apply planning and reinforcement learning approaches, combined with domain knowledge, to enable an agent to obtain better scores in this game. We hope that these domain-specific algorithms can inspire better approaches to solving SDPs with sparse feedback in general.
Bachelor's thesis in Computer Science
Tutor: Anders Jonsson