Speculative Super-Optimization: Dynamic Binary Vector Widening

Author: ORCID icon orcid.org/0000-0001-6227-1058
Ward, Conner, Computer Science - School of Engineering and Applied Science, University of Virginia
Venkat, Ashish, EN-Comp Science Dept, University of Virginia

General purpose computer architects are tasked with designing processors that are both fast and useful for a wide variety of program workloads, but it’s often difficult to achieve both. This thesis identifies the vector processing unit of the x86 architecture as a valuable hardware resource that produces substantial performance improvements over scalar code, but also a resource that is often difficult or even impossible to leverage to its fullest. Modern high-performance x86 processors support vector widths up to 512-bits, but this capability is underutilized by legacy vector codes as well as target-flexible vector codes that have been compiled to a smaller vector width. Additionally in heterogeneous architectures, the largest vector widths that are implemented in the high-performance cores are often disabled to allow all cores on the processor to share the same instruction set.

This thesis seeks to address this under-utilization of the vector processing unit by speculatively widening vector instructions. The motivating observation is that vector loops whose memory accesses are strided to the vector width can be dynamically unrolled, whereby any vector instructions can then be fused together into a single instruction that uses a multiple of the original vector width. To explore this idea to its fullest, all the necessary algorithms and structures to perform the analysis, transformation, and speculative issue of dynamically widened vector instructions are implemented in the gem5 microarchitectural simulator for the x86 instruction set architecture. This implementation is realizable in hardware, adds minimal overhead to the processing pipeline, and achieves near best-case performance gains for a specific type of vector loop. While the vector codes that can be safely widened with this method are limited, this work displays the potential for similar highly aggressive speculative binary transformations.

MS (Master of Science)
Microarchitecture, Speculation, Optimization
Issued Date: