Institute for Interdisciplinary Information Sciences, Tsinghua University

Understanding Modern Deep Learning by Analyzing the Training Dynamics

Speaker: Kaifeng Lyu, Princeton University
Time: 2023-10-24 14:00-15:00
Venue: FIT 1-222
Abstract:

In the last decade, deep learning has made much exciting progress, but it is still largely a black box, mainly because the training dynamics inside are notoriously hard to understand. Different optimizers, or the same optimizer with different hyperparameters, can induce different “implicit biases” toward neural nets that generalize very differently. In this talk, I will present several of my works that quantify implicit bias induced by the interaction between the optimizer and several key components of modern deep neural nets, including feedforward architectures, normalization layers, weight decay, as well as local updates in a distributed environment (“Local SGD”).

Biography:

Kaifeng Lyu is a Ph.D. student in the Computer Science Department at Princeton University, advised by Prof. Sanjeev Arora. His research focuses on developing the mathematical foundations of deep learning. Before starting his Ph.D., he completed his undergraduate studies in the Yao Class at Tsinghua University, receiving a B.Eng. in Computer Science and Technology in 2019.