Papini, Matteo; Pirotta, Matteo; Restelli, Marcello
(Springer, 2022)
Policy gradient (PG) algorithms are among the best candidates for the much-anticipated applications of reinforcement learning to real-world control tasks, such as robotics. However, the trial-and-error nature of these ...