In this talk, we study the generalization behavior of deep networks in the interpolation region, where the training error and the gradient at each training data point are small. We use a teacher-student setting and prove that, in the interpolation region, weight alignment can occur between the teacher and student networks at the lowest layer. Furthermore, the proof shows that over-parameterization makes such alignment more likely. Further analysis of the training dynamics shows that the student network learns the strong teacher nodes first, leaving weak teacher nodes unexplained until the late stages of training, and that over-parameterization helps cover more of the weaker nodes within the same number of iterations. This sheds light on the puzzling phenomenon that low training error and over-parameterization can lead to good generalization.
Yuandong Tian is a Research Scientist and Manager at Facebook AI Research, working on deep reinforcement learning and its applications, and on the theoretical analysis of deep models. He is the lead scientist and engineer for the ELF OpenGo and DarkForest Go projects. Prior to that, he was a researcher and engineer on the Google Self-driving Car team from 2013 to 2014. He received his Ph.D. from the Robotics Institute at Carnegie Mellon University in 2013, and his Bachelor's and Master's degrees in Computer Science from Shanghai Jiao Tong University. He is a recipient of the 2013 ICCV Marr Prize Honorable Mention.