Enabling Low-Overhead and Scalable Near-Data Pattern Matching Acceleration with Memory-Centric Architectures
Rahimi, Gholamreza, Computer Engineering - School of Engineering and Applied Science, University of Virginia
Skadron, Kevin, EN-Comp Science Dept, University of Virginia
The growing need for accelerated pattern recognition and inexact pattern matching has motivated many efforts to design finite-automata-based, in-memory pattern-processing accelerators. However, the lack of a standard, scalable, open-source, and easy-to-modify framework has made it difficult to develop new applications and explore new architectural innovations. Moreover, none of the existing in-memory accelerators is designed to process multiple symbols at once. In addition, none has a reliable, efficient, and scalable reporting architecture to gather and analyze the reporting data.
This proposal outlines four novel software and hardware contributions to improve the effectiveness of pattern processing on big-data applications with real-time processing needs. (1) We propose a robust framework for automata processing simulation, transformation, optimization, and performance modeling that is easy to use and modify, to facilitate automata application development and architectural exploration and innovation. (2) Motivated by the application analysis that our framework enables, we observe significant resource underutilization in existing memory-based accelerators. We leverage this observation and propose an area-efficient, high-throughput, and energy-efficient in-SRAM architecture for multi-symbol pattern processing. (3) Inspired by our study of multi-symbol pattern matching in in-memory architectures, we explore temporal multi-symbol matching on FPGA platforms. Our framework enables us to process different bitwidths, which has been shown to be beneficial for mapping different applications to platforms with different architectural parameters. (4) Finally, we analyze the reporting architectures of existing state-of-the-art memory-centric solutions and find that they are either the major source of area and performance inefficiency or are not scalable, general solutions. To address these issues, we propose a compact, reconfigurable, and easy-to-handle in-situ reporting architecture that repurposes SRAM subarrays with negligible hardware overhead.
PHD (Doctor of Philosophy)
In-memory processing, Pattern matching, Near-data processing, Automata processing, Reconfigurable computing