Despite the tremendous successes of deep reinforcement learning (DRL), a critical issue for existing DRL work is generalization. A DRL agent is typically evaluated in the same environment in which it was trained. As a result, the learned policy can be extremely specialized to the training scenarios and can easily fail when the agent is tested in a new environment. In contrast, humans can adapt to new environments easily without further training. This generalization issue poses a fundamental challenge to bringing learning agents from the lab to the real world.
This talk presents recent progress on this challenge, achieved by equipping DRL agents with two crucial capabilities: (1) the ability to perform long-term planning instead of merely memorizing training experiences; and (2) the ability to utilize prior knowledge of the real world to derive better plans. We will show that agents with these planning capabilities generalize significantly better than classical DRL agents on a variety of challenging tasks.
Yi Wu is a fifth-year Ph.D. candidate at UC Berkeley, advised by Prof. Stuart Russell. He received his B.E. from the special pilot class (Yao Class) of the Institute for Interdisciplinary Information Sciences at Tsinghua University. Yi's research focuses on how to effectively incorporate human knowledge into AI models to produce solutions that are both interpretable and generalizable. He is currently working on a variety of projects spanning deep reinforcement learning, natural language processing, and probabilistic programming.