2018
DOI: 10.1109/lssc.2019.2902738

A Scalable Multi-TeraOPS Core for AI Training and Inference

Cited by 29 publications (19 citation statements)
References 6 publications
“…Analysis of machine-learning workloads for both training and inference has shown that high-dimensionality MVMs dominate, accounting for 70%–95% of the computation [22]. This is especially true for the CNNs of interest in target edge applications.…”
Section: Architecture Overview and Rationale (citation type: mentioning)
confidence: 99%
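The quoted 70%–95% figure is easy to see in a single dense layer, where the matrix-vector multiply costs out_dim × in_dim multiply-accumulates while everything else is only O(out_dim). A minimal NumPy sketch (the layer sizes are illustrative, not taken from the paper):

```python
import numpy as np

# Illustrative dense-layer forward pass: the MVM (W @ x) dominates,
# costing out_dim * in_dim multiply-accumulates, while the bias add
# and activation are only O(out_dim) element-wise operations.
in_dim, out_dim = 1024, 1024          # hypothetical layer sizes
W = np.random.randn(out_dim, in_dim)  # weight matrix
b = np.random.randn(out_dim)          # bias vector
x = np.random.randn(in_dim)           # input activation vector

y = np.maximum(W @ x + b, 0.0)        # MVM + bias + ReLU

print(out_dim * in_dim, "MACs in the MVM vs.", 2 * out_dim, "element-wise ops")
```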
“…These include element-wise operations, such as activation functions, scaling, adding, and offset [22], as well as other signal-processing operations required in audio, video, and other pipelines [2], [23], [24]. Such computations are distinct from MVMs in that they benefit from significant data locality (i.e., a small number of operands is involved in the fundamental operations even though the operations might be parallelized).…”
Section: Architecture Overview and Rationale (citation type: mentioning)
confidence: 99%
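To make the data-locality contrast concrete: each output of an element-wise stage depends on only the corresponding input element plus a couple of scalars, so the stage parallelizes trivially. A hedged sketch of such a stage (the scale/offset/tanh pipeline is illustrative, not a specific pipeline from the cited works):

```python
import numpy as np

def elementwise_stage(x, scale, offset):
    """Illustrative element-wise pipeline: scale, offset, then activation.

    Each output element depends only on the corresponding input element
    (plus two scalars), so the stage has high data locality and
    parallelizes trivially across elements, unlike an MVM where every
    output touches the whole input vector.
    """
    return np.tanh(scale * x + offset)

x = np.linspace(-2.0, 2.0, 8)
print(elementwise_stage(x, scale=1.5, offset=0.1))
```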
“…[114] The 1T1R array is 64 kb (256 × 256) with RRAM devices in binary states. [115] The spike-based ISNA takes a different approach to enhance the energy efficiency by lowering the power consumption at the cost of ≈200 ns latency. In the memory mode, the read/write logics, drivers, and amplifiers realize data programming and sensing.…”
Section: Spike-based Design (citation type: mentioning)
confidence: 99%
“…Even BNNs that use binarized weights during inference require floating-point computations for their training [6]. Several studies on training hardware using the backpropagation algorithm have recently been reported [10]–[12]. Training using a hardware accelerator [10], [11] shows insufficient latency improvement compared to the improvement obtained for inference carried out on the same accelerator.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…Several studies on training hardware using the backpropagation algorithm have recently been reported [10]–[12]. Training using a hardware accelerator [10], [11] shows insufficient latency improvement compared to the improvement obtained for inference carried out on the same accelerator. Moreover, these works require floating-point arithmetic during training even though they represent low-precision data during inference.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
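The training-versus-inference precision gap these two statements describe is commonly bridged with a straight-through estimator: binarized weights in the forward pass, full-precision latent weights accumulating the gradient updates. A minimal NumPy sketch of that general idea (not the specific scheme of the cited hardware works; the sizes and squared-error loss are illustrative):

```python
import numpy as np

# Straight-through-estimator sketch: inference uses binarized weights,
# but training keeps and updates full-precision latent weights.
rng = np.random.default_rng(0)
W_fp = 0.1 * rng.standard_normal((4, 4))  # latent float weights (training state)
x = rng.standard_normal(4)
target = rng.standard_normal(4)
lr = 0.01

for _ in range(100):
    W_bin = np.sign(W_fp)         # binarized weights used in the forward pass
    y = W_bin @ x                 # binary-weight MVM, as in inference
    err = y - target              # gradient of a simple squared-error loss
    grad_W = np.outer(err, x)     # gradient w.r.t. the binary weights...
    W_fp -= lr * grad_W           # ...applied to the float weights (the STE step)

print("final loss:", 0.5 * np.sum((np.sign(W_fp) @ x - target) ** 2))
```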