State representation learning for goal-conditioned reinforcement learning

Citation

  • Steccanella L, Jonsson A. State representation learning for goal-conditioned reinforcement learning. In: Amini MR, Canu S, Fischer A, Guns T, Kralj Novak P, Tsoumakas G, editors. Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2022, Grenoble, France, September 19-23, 2022, Proceedings, Part IV; 2022 Sep 19-23; Grenoble, France. Cham: Springer; 2023. p. 84-99. DOI: 10.1007/978-3-031-26412-2_6

Permanent link

Description

  • Abstract

    This paper presents a novel state representation for reward-free Markov decision processes. The idea is to learn, in a self-supervised manner, an embedding space where distances between pairs of embedded states correspond to the minimum number of actions needed to transition between them. Compared to previous methods, our approach does not require any domain knowledge, learning from offline and unlabeled data. We show how this representation can be leveraged to learn goal-conditioned policies, providing a notion of similarity between states and goals and a useful heuristic distance to guide planning and reinforcement learning algorithms. Finally, we empirically validate our method in classic control domains and multi-goal environments, demonstrating that our method can successfully learn representations in large and/or continuous domains.
  • Description

    Paper presented at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2022), held September 19-23, 2022 in Grenoble, France.
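
The quantity the abstract describes, the minimum number of actions needed to move between two states, can be illustrated on a toy problem. The following is a minimal sketch and not the paper's method: it computes these distances exactly with breadth-first search on a small hypothetical grid world (the `GRID` layout, `ACTIONS`, and all function names are assumptions made for illustration), whereas the paper learns an embedding whose distances approximate them from offline, unlabeled data. The sketch also shows how such a distance could serve as a heuristic for reaching a goal.

```python
from collections import deque

# Hypothetical 4x4 grid world; '#' cells are walls. Not taken from the paper.
GRID = [
    "....",
    ".##.",
    "....",
    "....",
]
# Four deterministic, reversible moves: up, down, left, right.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def neighbors(state):
    """Yield the states reachable from `state` with a single action."""
    r, c = state
    for dr, dc in ACTIONS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0]) and GRID[nr][nc] != "#":
            yield (nr, nc)


def min_action_distance(start):
    """Breadth-first search: minimum number of actions from `start`
    to every reachable state."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        for n in neighbors(s):
            if n not in dist:
                dist[n] = dist[s] + 1
                queue.append(n)
    return dist


def greedy_path(start, goal):
    """Reach `goal` by always stepping to the neighbor closest to it.
    Distances from the goal equal distances to the goal here only
    because every move in this toy grid is reversible."""
    to_goal = min_action_distance(goal)
    path = [start]
    while path[-1] != goal:
        path.append(min(neighbors(path[-1]), key=lambda n: to_goal[n]))
    return path


if __name__ == "__main__":
    print(min_action_distance((0, 0))[(3, 3)])
    print(greedy_path((0, 0), (3, 3)))
```

In this sketch the distance table plays the role that the learned embedding distance plays in the abstract: it gives a notion of how close a state is to a goal, and a greedy policy that descends it reaches the goal. In large or continuous domains exhaustive search of this kind is not feasible, which is the setting the abstract targets with a learned representation.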