Sample efficiency is increasingly vital in ML and AI, as we aim for higher accuracy with larger and deeper models. The first part of this talk concerns a somewhat surprising phenomenon in deep learning: over-parametrized models can often generalize better than smaller ones. We provide two complementary perspectives: both the algorithms and the losses can implicitly encourage the solution to have the lowest possible complexity, given that it fits the training data. We will also show some experimental improvements that are partially inspired by the theory.
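As a small illustration of an algorithm implicitly preferring a low-complexity solution, consider the standard textbook case of over-parametrized linear regression. This is a minimal sketch and not necessarily the setting analyzed in the talk: the data sizes, step size, and iteration count below are assumptions chosen for illustration. With more parameters than samples, infinitely many weight vectors fit the data exactly, yet plain gradient descent started from zero converges to the one with the smallest Euclidean norm.

```python
import numpy as np

# Assumed toy setup: 20 samples, 100 parameters (over-parametrized).
rng = np.random.default_rng(0)
n, d = 20, 100
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Gradient descent on 0.5 * ||Xw - y||^2, started from zero.
w = np.zeros(d)
lr = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / (largest singular value)^2
for _ in range(5000):
    w -= lr * X.T @ (X @ w - y)

# Minimum-norm interpolating solution, computed via the pseudoinverse.
w_min_norm = np.linalg.pinv(X) @ y

gap = np.linalg.norm(w - w_min_norm)        # distance to min-norm solution
train_residual = np.linalg.norm(X @ w - y)  # how well the data are fit
print(f"gap to min-norm solution: {gap:.2e}")
print(f"training residual:        {train_residual:.2e}")
```

Starting from zero keeps every iterate in the row space of X, which is why the limit is the minimum-norm interpolant; a nonzero initialization would generally converge to a different interpolating solution.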
Then, we will discuss improving sample efficiency in reinforcement learning (RL) with model-based approaches. We propose an algorithmic framework for designing model-based deep RL algorithms with theoretical guarantees. Instantiating our meta-algorithm with some simplifications gives a model-based RL algorithm, Stochastic Lower Bounds Optimization (SLBO), which achieves state-of-the-art performance on a range of continuous control benchmark tasks when only 1M or fewer samples are permitted.
Tengyu Ma is an assistant professor at Stanford University. He received his Ph.D. from Princeton University and his B.E. from Tsinghua University. His research interests include topics in machine learning and algorithms, such as non-convex optimization, deep learning, representation learning, reinforcement learning, and high-dimensional statistics. He is a recipient of the NIPS'16 best student paper award and the COLT'18 best paper award.