Paolo Costa, Microsoft Research
Title: Towards Rack-scale Computing: Challenges and Opportunities (slides)
Abstract: New hardware technologies such as systems- and networks-on-chip (SoCs and NoCs), switchless network fabrics, silicon photonics, and RDMA are redefining the landscape of data center computing, making it possible to interconnect thousands of cores at high speed at the scale of today’s racks. We refer to this new class of hardware as rack-scale computers because the rack is increasingly replacing the individual server as the basic building block of modern data centers. Most of the benefits promised by these new architectures, however, can only be achieved with adequate support from the software stack. In this talk, I will describe some of the research challenges and opportunities introduced by rack-scale computing. As a concrete example, I will give an overview of some of the research projects related to this topic that we are pursuing in our group.
Gustavo Alonso, ETH Zurich
Title: Rackscale – the things that matter (slides)
Abstract: Rackscale computing has become the standard for many applications running in a data center. For a variety of reasons, today it is possible to develop fully customized solutions that achieve impressive performance numbers. In this talk I will argue that customization is important but needs to be sustained by general-purpose techniques and components. The research agenda in the coming years should focus on the latter, rather than on producing an infinite variety of high-performance systems tailored to narrow use cases. Otherwise, the inevitable problems with total cost of ownership over the life cycle of a real system (maintenance, further development, software evolution, additional functionality) will soon catch up with many existing proposals.
Tim Harris, Oracle Labs
Title: What We Talk About When We Talk About Scheduling (slides)
Abstract: Distributed workloads involve scheduling and resource allocation decisions at multiple levels of the stack: deciding which machines a job will run on; deciding which instances of replicated services it will use; arbitrating within those services between multiple clients; deciding which VMs’ virtual CPUs get which physical cores, and which threads run on those CPUs; and deciding how the instructions in those threads are scheduled in multi-threaded processors, and how they compete for resources in the cores and in the interconnect. In this talk I will illustrate the kinds of interference that can occur at these different levels, and discuss some possible approaches for handling these problems.
Sanjeev Kumar, Facebook
Title: Efficiency at Scale (slides)