PredictiveNet: An energy-efficient convolutional neural network via zero prediction

Lin, Yingyan; Sakr, Charbel; Kim, Yongjune; Shanbhag, Naresh R.

doi:10.1109/iscas.2017.8050797

Cited by 56 publications

(27 citation statements)

References 5 publications


“…Results are obtained via an energy estimation tool for Deep Neural Networks publicly available in deep neural network energy estimation tool. Lin et al. proposed PredictiveNet to skip a large fraction of convolutions in CNNs at runtime without modifying the CNN structure or requiring additional branch networks. An analysis supported by simulations is provided to justify how to preserve the mean square error (MSE) of the nonlinear layer outputs.…”

Section: Related Work (mentioning)

confidence: 99%

Energy‐based tuning of convolutional neural networks on multi‐GPUs

Castro, Guil, Marín-Jiménez, et al. 2018

Concurrency and Computation


Deep Learning (DL) applications are gaining momentum in the realm of Artificial Intelligence, particularly after GPUs have demonstrated a remarkable ability to accelerate their challenging computational requirements. Within this context, Convolutional Neural Network (CNN) models constitute a representative example of success on a wide set of complex applications, particularly on datasets where the target can be represented through a hierarchy of local features of increasing semantic complexity. In most real scenarios, the roadmap to improve results relies on CNN settings involving brute force computation, and researchers have lately proven Nvidia GPUs to be one of the best hardware counterparts for acceleration. Our work complements those findings with an energy study on critical parameters for the deployment of CNNs on flagship image and video applications, i.e., object recognition and people identification by gait, respectively. We evaluate energy consumption on four different networks based on the two most popular ones (ResNet/AlexNet), i.e., ResNet (167 layers), a 2D CNN (15 layers), a CaffeNet (25 layers), and a ResNetIm (94 layers), using batch sizes of 64, 128, and 256, and then correlate those with speed-up and accuracy to determine optimal settings. Experimental results on a multi-GPU server endowed with twin Maxwell and twin Pascal Titan X GPUs demonstrate that energy correlates with performance and that Pascal may have up to 40% gains versus Maxwell. Larger batch sizes extend performance gains and energy savings, but we have to keep an eye on accuracy, which sometimes shows a preference for small batches. We expect this work to provide preliminary guidance for a wide set of CNN and DL applications in modern HPC times, where the GFLOPS/W ratio constitutes the primary goal.


“…Moreover, DNN-based applications often require not only high accuracy, but also aggressive hardware performance, including high throughput, low latency, and high energy efficiency. As such, there has been intensive research on DNN accelerators in order to take advantage of different hardware platforms, such as FPGAs and ASICs, for improving DNN acceleration efficiency [9,10,11,12,13,14].…”

Section: Introduction (mentioning)

confidence: 99%

DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architectures

Zhao, Wang, et al. 2020

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Self Cite


The recent breakthroughs in deep neural networks (DNNs) have spurred a tremendously increased demand for DNN accelerators. However, designing DNN accelerators is non-trivial as it often takes months/years and requires cross-disciplinary knowledge. To enable fast and effective DNN accelerator development, we propose DNN-Chip Predictor, an analytical performance predictor which can accurately predict DNN accelerators' energy, throughput, and latency prior to their actual implementation. Our Predictor features two highlights: (1) its analytical performance formulation of DNN ASIC/FPGA accelerators facilitates fast design space exploration and optimization; and (2) it supports DNN accelerators with different algorithm-to-hardware mapping methods (i.e., dataflows) and hardware architectures. Experimental results based on 2 DNN models and 3 different ASIC/FPGA implementations show that our DNN-Chip Predictor's predicted performance differs from chip measurements of the FPGA/ASIC implementations by no more than 17.66% when using different DNN models, hardware architectures, and dataflows. We will release code upon acceptance.


“…This feature is effective for increasing computation speed and lowering power consumption. Some studies have driven the effective speed of convolution operations beyond the performance of the accelerators themselves [16]-[18].…”

Section: Introduction (mentioning)

confidence: 99%

Low-Power Implementation Techniques for Convolutional Neural Networks using Precise and Active Skipping Methods

Kitayama, Ono, Kishimoto, et al. 2020

IEICE Trans. Electron.


Reducing power consumption is crucial for edge devices that use convolutional neural networks (CNNs). The zero-skipping approach for CNNs is a processing technique widely known for its relatively low power consumption and high speed. This approach stops multiplication and accumulation (MAC) when the multiplication results of the input data and weight are zero. However, this technique requires large logic circuits with around 5% overhead, and the average rate of MAC stopping is approximately 30%. In this paper, we propose a precise zero-skipping method that uses input data and simple logic circuits to stop multipliers and accumulators precisely. We also propose an active data-skipping method to further reduce power consumption by slightly degrading recognition accuracy. In this method, each multiplier and accumulator is stopped by using small values (e.g., 1, 2) as input. We implemented the single shot multi-box detector 500 (SSD500) network model on a Xilinx ZU9 and applied our proposed techniques. We verified that operations were stopped at a rate of 49.1%, recognition accuracy was degraded by 0.29%, power consumption was reduced from 9.2 to 4.4 W (−52.3%), and circuit overhead was reduced from 5.1 to 2.7% (−45.9%). The proposed techniques were determined to be effective for lowering the power consumption of CNN-based edge devices such as FPGAs.
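The skipping idea described in this abstract can be illustrated with a small software sketch. It is not the authors' hardware design: the function name `mac_with_skipping`, the `threshold` parameter, and the integer test data are hypothetical, and it assumes that "active data-skipping" means treating inputs whose magnitude is at or below a small threshold as if they were zero.

```python
def mac_with_skipping(inputs, weights, threshold=0):
    """Dot product that skips multiply-accumulate (MAC) steps.

    A step is skipped when the product would be zero (input or weight is 0)
    or, as an assumed model of active data-skipping, when the input
    magnitude is <= threshold. Returns (accumulated sum, skipped step count).
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    acc = 0
    skipped = 0
    for x, w in zip(inputs, weights):
        if abs(x) <= threshold or w == 0:
            skipped += 1          # MAC unit would be idle for this step
            continue
        acc += x * w              # MAC is actually performed
    return acc, skipped


# Hypothetical integer activations and weights, for illustration only.
activations = [0, 3, 1, 0, 5, 2]
kernel = [4, 0, 2, 7, 1, 3]

exact = mac_with_skipping(activations, kernel, threshold=0)
approx = mac_with_skipping(activations, kernel, threshold=1)
```

With `threshold=0` the result equals the ordinary dot product, since only zero-valued products are dropped. Raising the threshold skips more steps at the cost of a small error in the sum, which mirrors the trade-off between power and recognition accuracy that the abstract reports.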
