Efficient deep learning computing requires algorithm and hardware co-design to enable specialization. However, the extra degree of freedom creates a much larger design space. Human engineers can hardly exhaust the design space with heuristics, and there is a shortage of machine learning engineers. We propose techniques to architect efficient neural networks efficiently and automatically. We first introduce Deep Compression (ICLR’16) techniques to reduce the size of neural networks, followed by the EIE accelerator (ISCA’16), which directly accelerates a sparse and compressed model. We then investigate automatically designing small and fast models (ProxylessNAS, ICLR’19), automated channel pruning (AMC, ECCV’18), and automated mixed-precision quantization (HAQ, CVPR’19). We demonstrate that such learning-based, automated design achieves better performance and efficiency than rule-based human design. Finally, we accelerate computation-intensive AI applications, including TSM (ICCV’19) for efficient video recognition and PVCNN (NeurIPS’19) for efficient 3D point cloud recognition.
Song Han is an assistant professor at MIT EECS. Dr. Han received his Ph.D. in Electrical Engineering from Stanford University and his B.S. in Electrical Engineering from Tsinghua University. Dr. Han’s research focuses on efficient deep learning computing. He proposed “Deep Compression” and the “EIE Accelerator”, both of which have impacted the industry. His work received best paper awards at ICLR’16 and FPGA’17. He was the co-founder and chief scientist of DeePhi Tech, which was acquired by Xilinx. Dr. Han was named to MIT Technology Review’s 35 Innovators Under 35 list.