Deep Neural Networks (DNNs) have emerged as the method of choice for solving a wide range of machine learning tasks. Meeting the enormous growth in computational demand posed by DNNs is a key challenge for computing system designers and has most commonly been addressed through the design of custom accelerators. However, these specialized accelerators, which utilize large numbers of multiply-accumulate units and substantial on-chip memory, are prohibitive in many design scenarios (e.g., wearable devices and IoT sensors) due to stringent area and cost constraints. Therefore, accelerating DNNs on these low-power systems, which comprise mainly the indispensable general-purpose processor (GPP) cores, requires new approaches. In this work, we focus on improving the performance of DNNs on GPPs by exploiting a key attribute of DNNs, i.e., sparsity. We propose Sparsity-aware Core Extensions (SPARCE), a set of micro-architectural and ISA extensions that leverage sparsity and are minimally intrusive and low-overhead. We address the key challenges associated with exploiting sparsity in GPPs, viz., dynamically detecting whether an operand (e.g., the result of a load instruction) is zero and subsequently skipping a set of future instructions that use it. To maximize performance benefits, our design ensures that the instructions to be skipped are prevented from even being fetched, since squashing instructions comes with a penalty (e.g., a pipeline stall). SPARCE consists of two key micro-architectural enhancements. First, a Sparsity Register File (SpRF) tracks which registers hold zero. Second, a Sparsity-aware Skip Address (SASA) table identifies instruction sequences that can be skipped and associates them with the specific SpRF registers that trigger the skipping. When an instruction is fetched, SPARCE dynamically pre-identifies whether the following instruction(s) can be skipped and, if so, modifies the program counter accordingly, thereby skipping the redundant instructions and improving performance. We model SPARCE using the gem5 architectural simulator and evaluate our approach on six state-of-the-art image-recognition DNNs in the context of both training and inference using the Caffe deep learning framework. On a scalar microprocessor, SPARCE achieves 1.11×-1.96× speedups across both convolution and fully-connected layers with 10%-90% sparsity, and a 19%-31% reduction in execution time at the overall application level. We also evaluate SPARCE on a 4-way SIMD ARMv8 processor using the OpenBLAS library and demonstrate that SPARCE achieves an 8%-15% reduction in application-level execution time.
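To make the fetch-time skipping mechanism concrete, the following is a minimal, purely illustrative sketch of the interplay between the SpRF and the SASA table described above. It is not the paper's RTL or gem5 model; the class names (SASAEntry, SparceFrontEnd), the table fields (src_reg, skip_count), and the fixed 4-byte instruction size are assumptions made for illustration only.

```python
# Hypothetical, simplified model of the SPARCE front-end behavior described in
# the abstract: an SpRF bit per register marking zero-valued operands, and a
# SASA table mapping a trigger PC to a skippable instruction sequence.
# Field names and sizes are illustrative assumptions, not the paper's design.

class SASAEntry:
    def __init__(self, src_reg, skip_count):
        self.src_reg = src_reg         # register whose zero/non-zero state is checked
        self.skip_count = skip_count   # number of subsequent instructions to skip

class SparceFrontEnd:
    def __init__(self, num_regs=32):
        self.sprf = [False] * num_regs  # True => register currently holds zero
        self.sasa = {}                  # trigger PC -> SASAEntry

    def record_writeback(self, reg, value):
        """Update the SpRF when a register is written (e.g., by a load)."""
        self.sprf[reg] = (value == 0)

    def next_fetch_pc(self, pc, instr_size=4):
        """At fetch time, consult the SASA table; if the associated register is
        zero, advance the PC past the redundant instructions so that they are
        never fetched (avoiding any squash penalty)."""
        entry = self.sasa.get(pc)
        if entry is not None and self.sprf[entry.src_reg]:
            return pc + (1 + entry.skip_count) * instr_size
        return pc + instr_size

# Example: register r5 was earlier loaded with zero, and the SASA entry at
# PC 0x100 records that the 3 instructions following 0x100 (e.g., a
# multiply-accumulate sequence) consume r5 and can therefore be skipped.
fe = SparceFrontEnd()
fe.sasa[0x100] = SASAEntry(src_reg=5, skip_count=3)

fe.record_writeback(5, 0)                 # loaded operand turned out to be zero
assert fe.next_fetch_pc(0x100) == 0x110   # skip 0x104, 0x108, 0x10C entirely

fe.record_writeback(5, 7)                 # non-zero operand: normal sequential fetch
assert fe.next_fetch_pc(0x100) == 0x104
```

The sketch only captures the control-flow effect (redirecting the fetch PC so skipped instructions are never brought into the pipeline); the actual SASA table format, how entries are populated, and how SIMD registers are handled follow the paper, not this toy model.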