This thesis explores the integration of imitation learning and policy representation within constrained reinforcement learning (CRL) to enhance decision-making in environments with stringent limitations. Reinforcement learning (RL) is a machine learning paradigm in which agents learn to make decisions by maximizing cumulative reward. Real-world scenarios, however, often impose additional requirements, such as safety and regulatory constraints, motivating CRL methods that respect these constraints while optimizing performance.
The research addresses the challenge of solving CRL problems using Generative Adversarial Imitation Learning (GAIL) within the framework of Constrained Markov Decision Processes (CMDPs), which provide the mathematical structure for incorporating constraints into the RL process. The methodology proceeds in two phases: the first uses a state-augmented CRL algorithm to obtain policies that satisfy the constraints in an augmented space incorporating the dual variables; the second refines these policies with GAIL, mapping them back to the original state space and leveraging imitation learning to ensure robust performance.
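The two-phase idea above can be sketched on a toy problem. The following is a minimal illustration, not the thesis's actual algorithm: the constrained problem is a one-state CMDP, phase 1 runs dual gradient ascent on the Lagrange multiplier (the "state augmentation" being the dependence of the policy on the dual variable), and phase 2 stands in for GAIL with a simple behavioral-cloning step that distills the collected demonstrations into a policy over the original state space. All names and numbers here are illustrative assumptions.

```python
import math
import random

# Toy CMDP with a single state and two actions (illustrative numbers):
# action 0: reward 1.0, cost 1.0; action 1: reward 0.5, cost 0.0.
REWARD = [1.0, 0.5]
COST = [1.0, 0.0]
BUDGET = 0.3  # constraint: expected cost <= BUDGET


def augmented_policy(lam, temp=0.1):
    """Softmax over the Lagrangian r(a) - lam * c(a).

    This plays the role of the state-augmented policy: its input is
    effectively the pair (state, lam), i.e. the dual variable is part
    of the augmented state.
    """
    logits = [(REWARD[a] - lam * COST[a]) / temp for a in range(2)]
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def phase1_dual_ascent(steps=2000, lr=0.05, seed=0):
    """Phase 1: dual gradient ascent on lam, recording the actions
    taken by the augmented policy as 'expert' demonstrations."""
    lam, demos = 0.0, []
    rng = random.Random(seed)
    for _ in range(steps):
        probs = augmented_policy(lam)
        demos.append(0 if rng.random() < probs[0] else 1)
        # Dual update: raise lam while the constraint is violated,
        # lower it (down to 0) when there is slack.
        exp_cost = sum(p * c for p, c in zip(probs, COST))
        lam = max(0.0, lam + lr * (exp_cost - BUDGET))
    return demos


def phase2_imitate(demos):
    """Phase 2 (behavioral-cloning stand-in for GAIL): distill the
    demonstrations into a policy over the original state space; with
    one state this is just the empirical action distribution."""
    p0 = demos.count(0) / len(demos)
    return [p0, 1.0 - p0]


demos = phase1_dual_ascent()
policy = phase2_imitate(demos)
exp_cost = sum(p * c for p, c in zip(policy, COST))
print(policy, exp_cost)
```

Because the dual dynamics oscillate around the multiplier at which the constraint binds, the time-averaged demonstrations have expected cost near the budget, and the distilled state-space policy inherits this approximate constraint satisfaction; this mirrors, in miniature, why mapping the augmented policies back to the original state space is the natural second step.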
Numerical results from simulations demonstrate the effectiveness of this approach in achieving constraint satisfaction while maintaining high performance. The findings indicate that integrating CRL with imitation learning can yield significant improvements in policy robustness and compliance with constraints. This research contributes to the broader field of machine learning by providing new insights and methods for developing constrained intelligent systems.