
Fill but do not spill: achieving efficiency and robustness simultaneously in clo

Speaker: Dr. Hongqiang Liu, Microsoft Research, Redmond Lab
Time: 2016-03-10 14:00-15:00
Venue: FIT 1-222


Cloud computing services are growing tremendously, and their infrastructures are under heavy pressure to onboard more workloads quickly while meeting service-level agreements (SLAs). Because expanding infrastructure is complex, slow, and costly, the key to relieving the tension between rapidly increasing customer demand and slowly ramping-up infrastructure is to utilize existing resources highly while strongly protecting application performance.

In this talk, I will present two projects that help cloud providers find a good balance between efficiency and robustness in their infrastructures. First, in network traffic management, we propose the concept of “Forward Fault Correction (FFC)”, which proactively protects a network from congestion caused by faults such as configuration, link, and device failures. FFC requires a traffic engineering (TE) solution to guarantee no congestion, without reconfiguring the network, as long as the number of faults is under k. The challenges in realizing FFC lie in the network throughput overhead and the computational complexity of preparing for a huge number of fault cases. We develop an efficient and uniform method to obtain a TE solution with FFC under diverse kinds of faults.
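To make the FFC guarantee concrete, the following is a minimal Python sketch that checks, by brute force, whether a given traffic allocation stays congestion-free under every combination of link failures up to a bound. It is not the method presented in the talk: the talk points out that enumerating fault cases is exactly what becomes computationally expensive, so this only illustrates the property on a toy network. The names `link_loads` and `is_ffc_robust`, the tunnel data layout, the proportional rescaling of traffic onto surviving tunnels, and the reading of the fault bound as inclusive are all assumptions made for illustration.

```python
from itertools import combinations

def link_loads(tunnels, failed_links):
    """Per-link load after each flow rescales its traffic onto surviving tunnels.

    Assumption: a flow keeps its total demand and splits it across the
    surviving tunnels in proportion to their original rates.
    """
    flows = {}
    for tunnel in tunnels:
        flows.setdefault(tunnel["flow"], []).append(tunnel)

    loads = {}
    for flow_tunnels in flows.values():
        demand = sum(t["rate"] for t in flow_tunnels)
        alive = [t for t in flow_tunnels if not (set(t["links"]) & failed_links)]
        alive_rate = sum(t["rate"] for t in alive)
        if alive_rate == 0:
            continue  # flow is disconnected; it adds no load in this sketch
        scale = demand / alive_rate
        for t in alive:
            for link in t["links"]:
                loads[link] = loads.get(link, 0.0) + t["rate"] * scale
    return loads

def is_ffc_robust(tunnels, capacity, max_faults):
    """True if no surviving link is overloaded under any set of at most
    max_faults failed links (brute-force enumeration, toy sizes only)."""
    links = sorted(capacity)
    for n in range(max_faults + 1):
        for failed in combinations(links, n):
            loads = link_loads(tunnels, set(failed))
            if any(load > capacity[link] + 1e-9 for link, load in loads.items()):
                return False
    return True

# Hypothetical example: one flow with demand 10 split over two disjoint tunnels.
tunnels = [
    {"flow": "f1", "links": ["a"], "rate": 5.0},
    {"flow": "f1", "links": ["b"], "rate": 5.0},
]
roomy = {"a": 10.0, "b": 10.0}  # each link can absorb the whole flow
tight = {"a": 8.0, "b": 8.0}    # fine without faults, overloaded after one
```

With the `roomy` capacities the allocation survives any single link failure, while with the `tight` capacities it is safe only when nothing fails. The headroom that `roomy` reserves is the throughput overhead the talk refers to.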

Next, I will present the design of the next generation of Microsoft’s online service delivery infrastructure, called Footprint. Footprint jointly coordinates all key routing and resource allocation decisions to achieve high efficiency and low risk. It decides how to map users to frontend proxies, proxies to backend datacenters, and traffic to network paths, and it configures all infrastructure components involved in service delivery, including network switches, proxies, load balancers, and DNS servers, to achieve this mapping. We show that fully realizing the potential of joint control over the infrastructure requires faithful modeling of system dynamics. A major issue is that after we change the system configuration, the impact of the change is not immediate but manifests only gradually. To capture temporal variations, we model system load and performance as functions of time. Solving time-based models can be intractable (e.g., because time is continuous), but we show how all load and performance constraints can be met by considering a small number of time points.
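The idea that a few time points can stand in for continuous time can be illustrated with a small Python sketch. It does not reproduce Footprint's actual model. It assumes, purely for illustration, that after a configuration change the old load on a component drains linearly over `drain_time` while the new load ramps up linearly over `ramp_time`. Under that assumption the total load is piecewise linear, so its peak must occur at one of the breakpoints, and checking those points is enough. The function names and all parameters are hypothetical.

```python
def load_at(t, old_load, new_load, drain_time, ramp_time):
    """Total load on one component at time t after a configuration change.

    Assumption: old traffic drains linearly to zero over drain_time and
    new traffic ramps linearly to new_load over ramp_time.
    """
    old_part = old_load * max(0.0, 1.0 - t / drain_time)
    new_part = new_load * min(1.0, t / ramp_time)
    return old_part + new_part

def peak_load(old_load, new_load, drain_time, ramp_time):
    """Peak load over all t >= 0, found by checking only the breakpoints
    of the piecewise-linear curve instead of sampling continuous time."""
    breakpoints = (0.0, drain_time, ramp_time)
    return max(
        load_at(t, old_load, new_load, drain_time, ramp_time)
        for t in breakpoints
    )

def transition_is_safe(old_load, new_load, drain_time, ramp_time, capacity):
    """True if the load never exceeds capacity during the transition."""
    return peak_load(old_load, new_load, drain_time, ramp_time) <= capacity
```

For example, with an old load of 80, a new load of 90, a drain time of 60, and a ramp time of 20, both the starting load and the final load fit under a capacity of 100, yet the load overshoots that capacity partway through the transition. This is the kind of gradual effect that a model looking only at the final state would miss.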

Short Bio:

Hongqiang (Harry) Liu is a Postdoctoral Researcher at Microsoft Research, Redmond Lab. He received his Ph.D. degree from the Department of Computer Science at Yale University in 2014, where his advisor was Prof. David Gelernter. Before joining Yale, he received his Master's and Bachelor's degrees from the Department of Electronic Engineering, Tsinghua University, Beijing. His research interests span many areas of networking and cloud computing, including software-defined networking (SDN), network function virtualization (NFV), content delivery networks (CDN), edge computing, and network technologies for big data, Linux containers, and virtual reality. Dr. Liu has published several papers in top-tier academic conferences, such as ACM SIGCOMM, USENIX NSDI, and USENIX ATC. He is a recipient of the prestigious ACM SIGCOMM Doctoral Dissertation Award - Honorable Mention (2015) and the Cascadia Innovation Fellowship (2011).