Mitigating Memory Resource Contention in Warehouse Scale Computers

Tang, Lingjia, Computer Science - School of Engineering and Applied Science, University of Virginia
Soffa, Mary, Department of Computer Science, University of Virginia

The class of modern datacenters hosting large-scale Internet services such as web-search, mail, and social networking has gained significant momentum in today’s computing environments. However, these datacenters, recently termed warehouse scale computers (WSCs), are extremely expensive to construct and operate. Improving software performance and server utilization is key to improving the efficiency and reducing the enormous cost of WSCs.
Modern WSCs are constructed using commodity multicore processors, on which part of the memory subsystem is shared. When multiple applications are co-located on a multicore machine, contention for the shared memory resources, such as caches and memory bandwidth, may occur. This contention can cause severe cross-core performance interference and significantly degrade application performance. Mitigating resource contention is critical for improving application performance. However, despite the wealth of research effort on contention management, little is known about how emerging large-scale web-service applications interact with the shared memory resources on commodity processors and how this contention can be mitigated to improve the performance of these applications.
In addition to performance, mitigating contention is also critical for improving the server utilization in WSCs. As multicore processors with expanding core counts continue to dominate the server market, the overall utilization of WSCs depends heavily on the consolidation of workloads to take advantage of the total computing potential provided by modern processors. However, many of the applications running in WSCs are user-facing, latency-sensitive applications with quality of service (QoS) requirements. These QoS requirements can be violated by the performance interference that can occur when multiple applications are consolidated on a single machine. As a result, the current common practice in WSCs is to disallow the co-location of latency-sensitive applications with other applications. This approach is undesirable as it results in low machine utilization in WSCs and millions of dollars wasted.
This dissertation presents novel compilation and runtime approaches to significantly mitigate contention and thus improve performance, QoS, and machine utilization in datacenters. Specifically, the main contributions of this dissertation include: 1) a comprehensive investigation and characterization of the impact of memory resource sharing on industry-strength large-scale datacenter workloads, which expose new characteristics and insights contrary to recent literature; 2) the design of a heuristic-based system and a runtime system to intelligently map application threads to cores to promote positive resource sharing and mitigate resource contention to improve application performance; and 3) the design of novel compilation techniques and runtime systems that statically and dynamically manipulate applications’ contentious nature to enable the co-location of applications with varying QoS requirements and, as a result, greatly improve server utilization in WSCs.

PHD (Doctor of Philosophy)
memory system, datacenter, compiler, QoS
All rights reserved (no additional license for public reuse)
Issued Date: