Stochastic gradient descent (SGD) is an immensely popular approach for online learning in settings where data arrive in a stream or data sizes are very large. However, despite an ever-increasing volume of work on SGD, much less is known about the statistical inferential properties of predictions based on SGD solutions. In this talk, we introduce a novel procedure, termed HiGrad, for conducting statistical inference in online learning without incurring additional computational cost compared with vanilla SGD. The HiGrad procedure begins by performing SGD iterations for a while and then splits the single thread into several; it continues to operate hierarchically in this fashion along each thread. With the predictions provided by the multiple threads in place, a t-based confidence interval is constructed by decorrelating the predictions.
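To make the idea concrete, below is a minimal, hypothetical sketch of the split-then-average pattern described above, using a single split of one burn-in thread into k parallel threads on a toy problem (estimating the mean of a stream via SGD on the squared loss). It is a simplified illustration, not the actual HiGrad estimator: the real procedure splits hierarchically multiple times and constructs the confidence interval by decorrelating threads that share history, whereas this sketch simply applies a t-interval to the k thread averages.

```python
import math
import random
import statistics

def sgd_segment(theta, samples, lr0=0.5, step_offset=0):
    """Run SGD on the loss 0.5*(theta - x)^2, whose stochastic gradient
    at a sample x is (theta - x). Returns the final iterate and the
    running average of the iterates over this segment."""
    avg = 0.0
    for i, x in enumerate(samples, start=1):
        lr = lr0 / (step_offset + i) ** 0.6  # decaying step size (an assumed schedule)
        theta -= lr * (theta - x)
        avg += (theta - avg) / i
    return theta, avg

def higrad_one_split(stream, n_burn=200, k=4, n_thread=300):
    """One-level sketch of the HiGrad pattern: a shared burn-in segment,
    then k threads continuing SGD on disjoint fresh data. Returns a
    t-based confidence interval from the k thread averages."""
    burn = [next(stream) for _ in range(n_burn)]
    theta0, _ = sgd_segment(0.0, burn)
    estimates = []
    for _ in range(k):
        fresh = [next(stream) for _ in range(n_thread)]
        _, avg = sgd_segment(theta0, fresh, step_offset=n_burn)
        estimates.append(avg)
    center = statistics.fmean(estimates)
    se = statistics.stdev(estimates) / math.sqrt(k)
    t_crit = 3.182  # 97.5% quantile of the t distribution with k-1 = 3 df
    return center - t_crit * se, center, center + t_crit * se

# Usage: a 95% confidence interval for the mean of a Gaussian stream.
random.seed(0)
stream = (random.gauss(1.0, 0.5) for _ in range(10**6))
lo, est, hi = higrad_one_split(stream)
```

Note that the total number of SGD samples consumed is the same as a single vanilla SGD run of length n_burn + k * n_thread, which is the sense in which the splitting adds no computational cost.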
Weijie Su is an Assistant Professor in the Department of Statistics at the Wharton School, University of Pennsylvania. Prior to joining Penn in Summer 2016, Su obtained his Ph.D. in Statistics from Stanford University in 2016, under the supervision of Emmanuel Candès. He received his bachelor's degree in Mathematics from Peking University in 2011. He is the recipient of the inaugural Theodore Anderson Dissertation Award from Stanford and won a gold medal at the inaugural S.-T. Yau College Student Mathematics Contest. Su's research interests are in statistical machine learning, high-dimensional inference, multiple testing, and privacy-preserving data analysis.