The unprecedented growth of mega datacenters, in which hundreds of thousands of machines are assembled to process massive amounts of data for Internet-scale services, has been driving the evolution of computing. Designing algorithms to optimize datacenter operations is thus imperative. At the same time, the scale of the infrastructure calls for novel approaches that reduce the complexity of these solutions enough to make them practical.
In this talk, I present two stories that, in different ways, resolve the tussle between optimality and practicality in designing algorithms for datacenters. First, for a single datacenter, I present Anchor, a resource management system that allocates server resources to virtual machines. Rather than aiming for optimality, Anchor is designed to be flexible and practical, and uses a unified mechanism to support diverse allocation policies expressed by operators and tenants. It abstracts performance goals as preferences, and uses a novel stable matching algorithm to solve the resulting matching problem efficiently. In the second part of the talk, I cover my study of workload management for multiple datacenters distributed over the wide area, where it is possible to achieve both optimality and practicality. I propose a temperature-aware approach, in which requests are directed to cooler locations with better cooling efficiency in order to reduce cooling energy consumption. I develop a novel distributed algorithm that solves the resulting large-scale optimization problem and converges faster than traditional methods.
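To make the idea of abstracting performance goals as preferences more concrete, the following is a minimal sketch of classic deferred acceptance (Gale-Shapley style) for a many-to-one matching of virtual machines to servers. It is not Anchor's algorithm, which the talk describes as novel; the function name `deferred_acceptance`, the sample preference lists, and the assumption that each server has a fixed number of identical VM slots are all hypothetical simplifications chosen for illustration.

```python
from collections import deque


def deferred_acceptance(vm_prefs, server_prefs, capacity):
    """VM-proposing deferred acceptance for many-to-one matching.

    vm_prefs:     dict vm -> list of servers, most preferred first
    server_prefs: dict server -> list of vms, most preferred first
    capacity:     dict server -> number of VM slots (assumed identical slots)
    Returns a dict server -> list of hosted vms.
    """
    # Lower rank means the server likes that VM more.
    rank = {s: {vm: i for i, vm in enumerate(prefs)}
            for s, prefs in server_prefs.items()}
    next_choice = {vm: 0 for vm in vm_prefs}   # next server each VM will try
    hosted = {s: [] for s in server_prefs}
    free = deque(vm_prefs)

    while free:
        vm = free.popleft()
        if next_choice[vm] >= len(vm_prefs[vm]):
            continue  # VM has exhausted its list and stays unmatched
        server = vm_prefs[vm][next_choice[vm]]
        next_choice[vm] += 1
        if vm not in rank[server]:
            free.append(vm)  # server does not list this VM; try the next one
            continue
        hosted[server].append(vm)
        if len(hosted[server]) > capacity[server]:
            # Over capacity: evict the least preferred VM currently held.
            worst = max(hosted[server], key=rank[server].__getitem__)
            hosted[server].remove(worst)
            free.append(worst)
    return hosted


# Hypothetical example: three VMs all prefer server A, which has one slot.
vm_prefs = {"vm1": ["A", "B"], "vm2": ["A", "B"], "vm3": ["A", "B"]}
server_prefs = {"A": ["vm3", "vm1", "vm2"], "B": ["vm1", "vm2", "vm3"]}
capacity = {"A": 1, "B": 2}
result = deferred_acceptance(vm_prefs, server_prefs, capacity)
```

In this sketch, operator and tenant policies would enter only through the preference lists, so the matching routine itself does not change when the policy does. The resulting matching is stable in the sense that no VM and server would both prefer each other over their current assignment.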