Modern datacenters hosting large-scale Internet services such as web search, mail, and social networking have gained significant momentum in today’s computing environments. However, these datacenters, recently termed warehouse-scale computers (WSCs), are extremely expensive to construct and operate. Improving software performance and server utilization is key to improving the efficiency and reducing the enormous cost of WSCs.
Modern WSCs are constructed using commodity multicore processors, on which part of the memory subsystem is shared. When multiple applications are co-located on a multicore machine, contention for the shared memory resources, such as caches and memory bandwidth, may occur. This contention can cause severe cross-core performance interference, and significantly degrade application performance. Mitigating resource contention is critical for improving application performance. However, despite the wealth of research effort on contention management, little is known about how emerging large-scale web-service applications interact with the shared memory resources on commodity processors, and how this contention can be mitigated to improve the performance of these applications.
In addition to performance, mitigating contention is also critical for improving the server utilization in WSCs. As multicore processors with expanding core counts continue to dominate the server market, the overall utilization of WSCs depends heavily on the consolidation of workloads to take advantage of the total computing potential provided by modern processors. However, many of the applications running in WSCs are user-facing, latency-sensitive applications with quality of service (QoS) requirements. These QoS requirements can be violated by the performance interference that can occur when multiple applications are consolidated on a single machine. As a result, the current common practice in WSCs is to disallow the co-location of latency-sensitive applications with other applications. This approach is undesirable as it results in low machine utilization in WSCs and millions of dollars wasted.
In this talk I present novel compilation and runtime approaches that significantly mitigate contention and improve performance, QoS, and machine utilization in datacenters. Specifically, this talk presents: 1) a comprehensive investigation and characterization of the impact of memory resource sharing on industry-strength large-scale datacenter workloads, which exposes new characteristics and insights contrary to recent literature; 2) the design of a heuristic-based system and a runtime system that intelligently map application threads to cores to promote positive resource sharing and mitigate resource contention, improving application performance; and 3) the design of novel compilation techniques and runtime systems that statically and dynamically manipulate applications’ contentious nature to enable the co-location of applications with varying QoS requirements and, as a result, greatly improve server utilization in WSCs.