AI-RISC: Scalable RISC-V Processor for IoT Edge AI Applications
Verma, Vaibhav, Electrical Engineering - School of Engineering and Applied Science, University of Virginia
Stan, Mircea, EN-Elec/Computer Engr Dept, University of Virginia
Artificial intelligence (AI) and machine learning (ML) have emerged as the fastest-growing workloads, ranging from object detection, natural language processing, and facial recognition to self-driving cars. The proliferation of these compute-intensive workloads has resulted in numerous hardware accelerators that fill the gap between the performance and energy-efficiency requirements of AI applications and the capabilities of conventional architectures such as CPUs and GPUs. In most cases, these accelerators are specialized for a particular task, costly to produce, require special programming tools, and can become obsolete as new ML algorithms are introduced. Hence, there is a need for a system-level solution that streamlines the integration of different AI accelerators into standard computing and programming stacks. Furthermore, the majority of AI and ML workloads currently run in cloud data centers due to the lack of efficient hardware devices that can process these workloads at the edge of the Internet of Things (IoT). This dissertation presents AI-RISC as a solution that bridges these research gaps.
AI-RISC is a scalable processor developed using a hardware/ISA (Instruction Set Architecture)/software co-design methodology that extends the open-source RISC-V architecture to accelerate edge AI applications. AI-RISC tightly integrates hardware accelerators as AI functional units (AFUs) inside the RISC-V processor pipeline, allowing AI and non-AI tasks to be processed seamlessly on the same hardware. AFUs are integrated into the pipeline at a fine granularity and treated as regular functional units during instruction execution. This tight integration is especially beneficial for edge devices, which perform inference on small batch sizes (usually batch size 1) and on smaller neural network models than cloud AI workloads. AI-RISC also extends the RISC-V ISA with custom instructions that directly target the added AFUs and expose the hardware functionality to software programmers. Additionally, AI-RISC generates a complete software stack, including compiler, assembler, linker, simulator, and profiler, while preserving the high-level programming abstractions offered by popular AI domain-specific languages and frameworks such as TensorFlow, PyTorch, MXNet, and Keras. AI-RISC allows designers to adapt quickly to hardware, ISA, or software changes and enables comprehensive design-space exploration of the available hardware, instruction, and software framework options. Thus, AI-RISC addresses current AI/ML workloads, offers the flexibility to hot-swap AFUs when better hardware becomes available, and scales with new instructions as AI algorithms evolve, while defining a standard hardware/software co-design methodology for specializing processor architectures for AI applications.
Detailed evaluation results presented in this dissertation show that AI-RISC accelerates the vector-matrix multiply (VMM) kernel by 17.63x and the ResNet-8 neural network model from the industry-standard MLPerf Tiny benchmark by 4.41x compared to a baseline RISC-V processor. AI-RISC also outperforms the state-of-the-art Arm Cortex-A72 IoT edge processor by 2.45x on average over the complete MLPerf Tiny inference benchmark. Additionally, AI-RISC provides a 3.93x improvement in energy efficiency over the baseline RISC-V processor and an 11.49x energy-efficiency improvement over the state-of-the-art Gemmini systolic array accelerator.
PHD (Doctor of Philosophy)
AI-RISC, RISC-V, AI hardware, Instruction extension, AI, ML, Hardware, AI extension, Hardware/Software Co-design, Codesign, ASIP, ASIP Designer, TVM, PIM, processing-in-memory, IoT edge, Edge AI, Edge AI hardware