
Understanding Modern Deep Learning by Analyzing the Training Dynamics

Speaker: Kaifeng Lyu, Princeton University
Time: 2023-10-24 14:00–15:00
Venue: FIT 1-222

Abstract:

In the last decade, deep learning has made exciting progress, but it remains largely a black box, mainly because the training dynamics inside are notoriously hard to understand. Different optimizers, or the same optimizer with different hyperparameters, can induce different “implicit biases” toward neural nets that generalize very differently. In this talk, I will present several of my works that quantify the implicit biases induced by the interaction between the optimizer and several key components of modern deep neural nets, including feedforward architectures, normalization layers, weight decay, as well as local updates in a distributed environment (“Local SGD”).
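For readers unfamiliar with Local SGD, the sketch below illustrates the basic pattern referenced in the abstract: each worker runs several SGD steps on its own data before all worker models are averaged. The worker setup, step counts, and hyperparameters here are illustrative assumptions, not details from the talk.

```python
import copy
import torch
import torch.nn.functional as F

def local_sgd(model, loaders, lr=0.1, local_steps=8, rounds=10):
    """Illustrative Local SGD sketch (not the speaker's implementation):
    each worker performs `local_steps` local SGD updates on its own data
    loader, then parameters are averaged across workers."""
    workers = [copy.deepcopy(model) for _ in loaders]
    for _ in range(rounds):
        # Local phase: every worker updates its own copy of the model.
        for worker, loader in zip(workers, loaders):
            opt = torch.optim.SGD(worker.parameters(), lr=lr)
            data_iter = iter(loader)
            for _ in range(local_steps):
                x, y = next(data_iter)
                opt.zero_grad()
                loss = F.cross_entropy(worker(x), y)
                loss.backward()
                opt.step()
        # Synchronization phase: average parameters across workers.
        with torch.no_grad():
            for params in zip(*(w.parameters() for w in workers)):
                mean = torch.stack([p.data for p in params]).mean(dim=0)
                for p in params:
                    p.data.copy_(mean)
    return workers[0]
```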

Short Bio:

Kaifeng Lyu is a Ph.D. student in the Computer Science Department at Princeton University, advised by Prof. Sanjeev Arora. His research interest lies in developing mathematical foundations of deep learning. Before starting his Ph.D., he completed his undergraduate studies in the Yao Class at Tsinghua University, receiving a B.Eng. in Computer Science and Technology in 2019.