2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA) 2016
DOI: 10.1109/isca.2016.42
Cambricon: An Instruction Set Architecture for Neural Networks

Abstract: Neural Networks (NN) are a family of models for a broad range of emerging machine learning and pattern recognition applications. NN techniques are conventionally executed on general-purpose processors (such as CPUs and GPGPUs), which are usually not energy-efficient, since they invest excessive hardware resources to flexibly support various workloads. Consequently, application-specific hardware accelerators for neural networks have recently been proposed to improve energy efficiency. However, such accelerator…

Cited by 146 publications (117 citation statements)
References 40 publications
“…Instruction set. Previous SIMD works usually devised load instructions for parameters and adopted medium-grained operands for features, such as vector/matrix [37], 2D tile [49], and compute tile [47], to provide flexibility. In contrast, we apply a parameter-inside approach and large-grained feature operands to optimize power consumption and computing capability for highly parallel convolution.…”
Section: Related Work
confidence: 99%
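The contrast above is between coarse-grained (vector/matrix) operands, as in Cambricon, and scalar instruction streams. A minimal Python sketch of how an MLP layer might map onto a few Cambricon-style coarse-grained instructions (the mnemonics in the comments are assumed for illustration, not verified against the published ISA encoding):

```python
import numpy as np

def mlp_layer(W, x, b):
    # MLOAD  W_reg, W_addr        ; load weight matrix into on-chip scratchpad
    # VLOAD  x_reg, x_addr        ; load input vector
    # MMV    y_reg, W_reg, x_reg  ; one matrix-multiply-vector instruction
    y = W @ x
    # VLOAD  b_reg, b_addr
    # VAV    y_reg, y_reg, b_reg  ; vector-add-vector for the bias
    y = y + b
    # Elementwise sigmoid, expressible with vector exp/divide instructions
    return 1.0 / (1.0 + np.exp(-y))

W = np.ones((4, 3))
x = np.ones(3)
b = np.zeros(4)
out = mlp_layer(W, x, b)  # a handful of coarse-grained instructions
                          # instead of O(rows * cols) scalar MACs
```

The point of the matrix-grained operand is instruction-fetch efficiency: one MMV stands in for the whole multiply-accumulate loop nest that a scalar ISA would issue instruction by instruction.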
“…Eyeriss [6] and ShiDianNao [9] improve the NFU dataflow to maximize operand reuse. A number of other digital designs [16], [20], [10] have also emerged in the past year. Analog Accelerators.…”
Section: B. The Landscape of CNN Accelerators
confidence: 99%
“…In addition, FT2000 and CEVA-XM6 are vector processors that include both a vector processing unit and a scalar processing unit; the main difference is that CEVA-XM6 is designed to accelerate only matrix convolution, whereas FT2000 is optimized through algorithmic improvements. The similarity between FT2000 and Cambricon [22] is that both are programmable via an instruction set, which makes it quick to realize different kinds of neural networks; the difference is that Cambricon was only simulated and never taped out. FT2000 and TPU are similar in their architecture, except that FT2000 is a general-purpose neural network accelerator, while TPU only supports CNN, LSTM, and MLP. (Table 1 compares the parameters of FT2000 and current mainstream neural network accelerators.)…”
Section: Comparison of FT2000 with Other Processor Architectures
confidence: 99%
“…The computing time of convolutional layers accounts for about 85% of the total model [22], so accelerating convolution calculation in CNNs has become a hotspot in current neural network acceleration. Convolution is mainly carried out between a large input feature map and a small convolutional kernel: kernels are small, such as 1 × 1, 3 × 3, or 5 × 5, whereas the input feature map is of a larger scale, such as 224 × 224 × 3 in GoogLeNet.…”
Section: Data Layout Analysis
confidence: 99%
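Those shape numbers translate directly into a compute estimate. A short Python sketch of why a small kernel sliding over a 224 × 224 × 3 input dominates a CNN's arithmetic (the 3 × 3 / 64-channel layer below is an illustrative choice, not a specific GoogLeNet layer):

```python
def conv2d_macs(h, w, cin, k, cout, stride=1):
    """Multiply-accumulates for one conv layer with no padding:
    every output pixel costs one k*k*cin dot product, repeated
    for each of the cout output channels."""
    oh = (h - k) // stride + 1  # output height
    ow = (w - k) // stride + 1  # output width
    return oh * ow * cout * k * k * cin

# 224 x 224 x 3 input (as in GoogLeNet), 3 x 3 kernel, 64 output channels
macs = conv2d_macs(224, 224, 3, 3, 64)  # ~85 million multiply-accumulates
```

Even with only 27 weights per output channel (3 × 3 × 3), the kernel is re-applied at every one of the 222 × 222 output positions, which is why data layout for the large feature map, rather than the small kernel, is the central concern of the cited analysis.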