ReSense: A Unified Framework for Improving Performance and Reliability in Multicore Architectures
Dey, Tanima, Computer Science - School of Engineering and Applied Science, University of Virginia
Davidson, Jack, Department of Computer Science, University of Virginia
Soffa, Mary, Department of Computer Science, University of Virginia
Chip-multiprocessors (CMPs) have become ubiquitous in modern computing and are the mainstream architecture for platforms ranging from laptops and desktops to large server machines. As technology scaling continues and more transistors fit on a chip, the number of cores on CMPs is growing, and multi-core machines are scaling up to many-core machines. This scaling raises two major problems: shared-resource contention and soft errors (transient faults). Shared-resource contention can degrade an application's performance significantly, and soft errors increase the probability that an application executes incorrectly and produces visible errors. To realize the full potential of multi- and many-core platforms, it is critical to ensure that the applications in a workload execute not only efficiently but also correctly.
In this dissertation, we develop a novel, general, and unified framework, ReSense, to address several challenges on multicore architectures, including performance optimization, reliability improvement, and power and thermal management. The framework includes five components: a general characterization methodology, a characterization metric, a sensitivity score, a thread-mapping algorithm, and a run-time system. An instance of the framework is applied in two phases: characterization and mapping. The characterization phase uses the general characterization methodology and the characterization metric to identify each application's characteristics in isolation, without considering any co-runners. It generates a resource-sensitivity score for each application in a workload. In the mapping phase, the run-time system uses a thread-mapping algorithm and the sensitivity scores of the applications in a workload to determine the thread mappings that optimize the objective function of the targeted problem.
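The two-phase structure described above can be illustrated with a minimal sketch. It assumes that characterization has already produced one scalar sensitivity score per application, and it uses a simple greedy balancing heuristic as a stand-in for the mapping step. The names `App`, `map_apps`, and `num_domains` (the number of groups of cores sharing a resource) are hypothetical, and the heuristic is not the dissertation's thread-mapping algorithm, which the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    name: str
    sensitivity: float  # hypothetical score from a solo characterization run


def map_apps(apps, num_domains):
    """Assign applications to resource-sharing domains.

    Illustrative greedy heuristic (an assumption, not ReSense itself):
    place the most sensitive applications first, each on the domain
    whose accumulated sensitivity is currently lowest.
    """
    domains = [[] for _ in range(num_domains)]
    load = [0.0] * num_domains
    for app in sorted(apps, key=lambda a: a.sensitivity, reverse=True):
        target = min(range(num_domains), key=lambda d: load[d])
        domains[target].append(app.name)
        load[target] += app.sensitivity
    return domains


workload = [App("A", 0.9), App("B", 0.8), App("C", 0.2), App("D", 0.1)]
mapping = map_apps(workload, num_domains=2)
```

With these made-up scores, the two most sensitive applications end up in different domains, each paired with a less sensitive one. The point of the sketch is only that the mapping step consumes per-application scores and never needs pairwise co-run measurements.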
To demonstrate the utility and effectiveness of ReSense, we use it to address the problems of shared-resource contention and soft errors for multi-threaded applications. For the resource contention problem, the characterization methodology determines how a multi-threaded application's performance is affected as it shares a resource in the memory hierarchy. A sensitivity score based on resource contention is produced for each application in a workload. The run-time system uses the resource-contention sensitivity scores and a thread-mapping algorithm to allocate threads from a workload to cores so as to mitigate shared-resource contention, thus improving response time and throughput.
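One plausible way to turn "how performance is affected as it shares a resource" into a number is a relative slowdown between two solo runs of the same application: one in which its threads do not share the resource and one in which they do. The sketch below assumes that formulation; the function name `contention_sensitivity` and both parameters are hypothetical, and the abstract does not state the actual characterization metric.

```python
def contention_sensitivity(time_isolated, time_sharing):
    """Relative slowdown caused by sharing a resource (hypothetical metric).

    time_isolated: run time when the application's threads do not share
                   the resource (assumed measurement).
    time_sharing:  run time when the same threads share it.
    Returns 0.0 if sharing does not slow the application down.
    """
    if time_isolated <= 0:
        raise ValueError("time_isolated must be positive")
    return max(0.0, (time_sharing - time_isolated) / time_isolated)


# Made-up timings in seconds, for illustration only.
score = contention_sensitivity(time_isolated=10.0, time_sharing=13.0)
```

Under this assumed definition, a score near zero marks an application that tolerates sharing, while a larger score marks one that a mapping step would try to keep away from other sensitive applications.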
For the soft error problem, the characterization methodology determines how a multi-threaded application's vulnerability to soft errors in shared caches is affected by its resource occupancy duration. A sensitivity score based on cache occupancy is produced for each application in a workload. The run-time system uses the cache-occupancy sensitivity scores and a thread-mapping algorithm to allocate workload threads to cores to reduce the occupancy in the shared caches, thus reducing cache vulnerability.
Both minimizing an application's vulnerability to soft errors and maintaining application performance are critical. The thread-mapping algorithm that ensures better reliability may not ensure better performance. To address this problem, we develop an integrated instance of the framework that combines application characterizations for both contention and vulnerability to determine a trade-off between the performance and reliability improvements.
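A trade-off of this kind is often expressed as a weighted combination of the two scores. The sketch below assumes that form purely for illustration: `combined_score` and `weight` are hypothetical names, both input scores are assumed to be normalized to a comparable range, and the abstract does not say how the integrated instance actually combines the two characterizations.

```python
def combined_score(contention, vulnerability, weight):
    """Blend two sensitivity scores into one (assumed linear form).

    weight = 1.0 considers only contention (performance);
    weight = 0.0 considers only cache vulnerability (reliability).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * contention + (1.0 - weight) * vulnerability


# Made-up scores: sweeping the weight moves between the two objectives.
performance_only = combined_score(0.4, 0.8, weight=1.0)
reliability_only = combined_score(0.4, 0.8, weight=0.0)
balanced = combined_score(0.4, 0.8, weight=0.5)
```

Varying the weight yields a family of mappings rather than a single one, which matches the idea of choosing among different trade-offs between performance and reliability.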
The dissertation includes a comprehensive evaluation of all three instances, which indicates that mapping each application in a dynamic workload according to its solo characterization is highly effective. For the resource contention instance, response time and throughput were improved by up to 30% and 47%, respectively, over the native operating system. For the soft error instance, cache vulnerability was reduced by up to 70% over the native operating system. The integrated instance was able to achieve various trade-offs between response time and vulnerability reductions.
PHD (Doctor of Philosophy)
contention, resource, memory hierarchy, soft errors, cache, vulnerability, multicore
All rights reserved (no additional license for public reuse)