Deep reinforcement and imitation learning on a GPU

Speaker: Iuri Frosio - NVIDIA

Date: 17 Jul 2018 - Time: 11:30, Sala verde

Publication date: 13 Jun 2018

Department: Computer Science

In this talk, I describe some of our efforts towards the development of computationally efficient learning procedures on a GPU, with particular attention to the Reinforcement Learning (RL) and robotics domains. I first describe a hybrid CPU/GPU version of the recently introduced Asynchronous Advantage Actor-Critic (A3C) algorithm. I analyze its computational footprint and highlight the aspects that are critical to running it effectively on a GPU: a system of queues and a dynamic scheduling strategy, both potentially helpful for other asynchronous algorithms. Our hybrid CPU/GPU version of A3C achieves a significant speed-up over its CPU implementation and is publicly available. In the second part of the talk, I introduce CuLE (CUDA Learning Environment), an experimental deep RL companion library developed to overcome the limitations highlighted in the first part: RL training is in fact dominated by data generation on the CPU. CuLE provides a GPU implementation of ALE (the Arcade Learning Environment), a challenging RL benchmark for discrete episodic tasks. CuLE can easily simulate thousands of environments in parallel, whereas traditional deep RL implementations use a limited number of agents with replay memory to achieve training efficiency. CuLE supports new training scenarios with an extremely large number of agents, while at the same time minimizing expensive data movement operations. I conclude the talk with an example of deep learning on a GPU in the robotics context. I illustrate the advantages of training in a simulator and compare reinforcement learning and imitation learning algorithms. I then show how separating the control and vision modules simplifies and speeds up the learning procedure in simulation, although the learned controller hardly generalizes to the real-world environment. Finally, I demonstrate how to use domain transfer to deploy the DNN controller trained in simulation in real-world applications.

Contact person: Umberto Castellani