Deep learning, as the name implies, has traditionally focused on increasing model complexity and representational power through depth, adding multiple layers to a network to increase its expressiveness. In this work, we argue for a different way of viewing "deep" models, through the lens of equilibria and dynamical systems. Specifically, we introduce the Deep Equilibrium (DEQ) model, which works by directly computing an equilibrium point of a non-linear dynamical system, an example of a so-called implicit layer. We show how this model can be trained via backpropagation using implicit differentiation, and discuss the representational power of such networks. Finally, we illustrate that on large-scale sequence modeling tasks, the method achieves state-of-the-art performance, despite having only a single "layer" and while vastly improving memory efficiency over previous approaches.
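To make the equilibrium idea concrete, here is a minimal sketch in plain Python. It is not taken from the talk: the scalar layer f(z, x, w) = tanh(w*z + x), the naive fixed-point iteration, and all function names are assumptions chosen for illustration. The sketch solves for z* = f(z*, x, w) and then obtains dz*/dw from the implicit function theorem, dz*/dw = (∂f/∂w) / (1 - ∂f/∂z), without differentiating through the solver's iterations.

```python
import math

def f(z, x, w):
    """One application of the hypothetical scalar layer: z -> tanh(w*z + x)."""
    return math.tanh(w * z + x)

def solve_equilibrium(x, w, tol=1e-12, max_iter=1000):
    """Naive fixed-point iteration for z* = f(z*, x, w), starting from z = 0.

    Assumes |w| < 1 so the map is a contraction and the iteration converges.
    """
    z = 0.0
    for _ in range(max_iter):
        z_next = f(z, x, w)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z

def grad_w_implicit(x, w):
    """dz*/dw via implicit differentiation at the equilibrium point."""
    z = solve_equilibrium(x, w)
    s = 1.0 - z * z      # tanh'(w*z + x) at equilibrium, since z = tanh(w*z + x)
    df_dz = s * w        # partial derivative of f with respect to z
    df_dw = s * z        # partial derivative of f with respect to w
    return df_dw / (1.0 - df_dz)

def grad_w_finite_difference(x, w, h=1e-6):
    """Central finite-difference check of dz*/dw, re-solving for each perturbed w."""
    return (solve_equilibrium(x, w + h) - solve_equilibrium(x, w - h)) / (2.0 * h)
```

Calling `grad_w_implicit(0.3, 0.5)` and `grad_w_finite_difference(0.3, 0.5)` should give closely matching values. The gradient needs only the equilibrium point itself, not the intermediate iterates, which is the source of the memory savings mentioned above. In a real network z is a vector, so the scalar division would become a linear solve against the Jacobian of f.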
Zico Kolter is an Associate Professor in the Computer Science Department at Carnegie Mellon University, and also serves as chief scientist of AI research for the Bosch Center for Artificial Intelligence. His work focuses on the intersection of machine learning and optimization, with a particular emphasis on developing more robust, interpretable, and rigorous methods in deep learning. In addition, he has worked in a number of application areas, most notably sustainability and smart energy systems. He is a recipient of the DARPA Young Faculty Award, and best paper awards at ICML (honorable mention), KDD, PESGM, and IJCAI.