Traditionally, methods for solving Sequential Decision Processes (SDPs) have not
worked well with those that feature sparse feedback. Both planning and reinforcement
learning, methods for solving SDPs, have trouble with it.
With the rise to prominence of the Arcade Learning Environment (ALE) in the
broader research community of sequential decision processes, one SDP featuring
sparse feedback has become familiar: the Atari game Montezuma’s Revenge. In this
particular game, the great amount of ...
Traditionally, methods for solving Sequential Decision Processes (SDPs) have not
worked well with those that feature sparse feedback. Both planning and reinforcement
learning, methods for solving SDPs, have trouble with it.
With the rise to prominence of the Arcade Learning Environment (ALE) in the
broader research community of sequential decision processes, one SDP featuring
sparse feedback has become familiar: the Atari game Montezuma’s Revenge. In this
particular game, the great amount of knowledge the human player already possesses,
and uses to find rewards, cannot be bridged by blindly exploring in a realistic time.
We apply planning and reinforcement learning approaches, combined with domain
knowledge, to enable an agent to obtain better scores in this game.
We hope that these domain-specific algorithms can inspire better approaches to solve
SDPs with sparse feedback in general.
+