Optimizing Datacenter Operations with Practical Complexity


May 20, 2013


Baochun Li


University of Toronto


The unprecedented growth of mega datacenters, in which hundreds of thousands of machines are assembled to process a massive amount of data for Internet-scale services, has been driving the evolution of computing. Designing algorithms to optimize datacenter operations is thus imperative. At the same time, the scale of the infrastructure calls for novel approaches to reduce the complexity of the solutions in order to make them practical.

In this talk, I present two stories that, in different ways, resolve the tussle between optimality and practicality in designing algorithms for datacenters. First, for a single datacenter, I present Anchor, a resource management system that effectively allocates server resources to virtual machines. Rather than aiming for optimality, Anchor is designed to be flexible and practical, and uses a unified mechanism to support diverse allocation policies expressed by operators and tenants. It abstracts performance goals as preferences, and uses a novel stable matching algorithm to solve the matching problem efficiently. In the second part of the talk, I will briefly present our study of workload management for multiple datacenters geo-distributed over the wide area, where it is possible to achieve both optimality and practicality. We propose to exploit geographical diversity to reflect the differences in energy and bandwidth prices across locations and ISPs, and develop a distributed algorithm that solves the large-scale optimization problem with faster convergence than traditional methods.

This is joint work with my former PhD student, Hong Xu, who will be joining the Department of Computer Science at the City University of Hong Kong this fall.


Baochun Li

Baochun Li received the B.Engr. degree from the Department of Computer Science and Technology, Tsinghua University, China, in 1995 and the M.S. and Ph.D. degrees from the Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, in 1997 and 2000, respectively. Since 2000, he has been with the Department of Electrical and Computer Engineering at the University of Toronto, where he is currently a Professor. He has held the Bell Canada Endowed Chair in Computer Engineering since August 2005. His research interests include large-scale distributed systems, cloud computing, peer-to-peer networks, applications of network coding, and wireless networks.

Baochun Li has co-authored more than 250 research papers, with a total of about 8000 citations and an H-index of 49 according to Google Scholar Citations. Dr. Li was the recipient of the IEEE Communications Society Leonard G. Abraham Award in the Field of Communications Systems in 2000. In 2009, he was a recipient of the Multimedia Communications Best Paper Award from the IEEE Communications Society, and a recipient of the University of Toronto McLean Award. He is a member of ACM and a senior member of IEEE.