2018 IEEE Symposium on VLSI Circuits
DOI: 10.1109/vlsic.2018.8502276
A Scalable Multi-TeraOPS Deep Learning Processor Core for AI Training and Inference

Cited by 97 publications (42 citation statements)
References 0 publications
“…At the same time, new opportunities for hardware security will also be provided by approximate computing (e.g., inexact hardware has been used for hiding security information and authentication for resource-constrained Internet-of-Things (IoT) devices). In the coming decade, it is anticipated that approximate computing techniques will likely be employed in energy-efficient systems for applications such as AI and signal processing; for example, IBM fabricated an AI accelerator chip to achieve multi-tera operations per second (TOPS) performance by applying multiple approximate techniques (including hardware, architecture, algorithms, and programs) [21]. Furthermore, new approximate computing applications will be exploited.…”
Section: Reliability and Security (mentioning)
confidence: 99%
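
The reduced-precision arithmetic alluded to above is the most visible of these approximate techniques. A minimal sketch, assuming plain NumPy with fp16 chosen only for illustration (it does not model the fabricated core's datapath or the other techniques in the statement), shows the basic trade: a small numerical error in exchange for much cheaper arithmetic.

import numpy as np

# Emulate one approximate-computing technique: run a layer's matrix product
# in reduced precision (fp16) and compare it against the exact fp32 reference.
rng = np.random.default_rng(0)
x = rng.standard_normal((64, 256)).astype(np.float32)   # activations
w = rng.standard_normal((256, 128)).astype(np.float32)  # weights

y_ref = x @ w                                                        # fp32 reference
y_fp16 = (x.astype(np.float16) @ w.astype(np.float16)).astype(np.float32)

rel_err = np.linalg.norm(y_ref - y_fp16) / np.linalg.norm(y_ref)
print(f"relative error of the fp16 approximation: {rel_err:.2e}")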
“…The first foundational progress in compute efficiency for AI model training and inference has been made by exploiting the statistical and approximate nature of deep learning algorithms, leading to reduced-precision digital approaches [7], [8]. Floating-point multiplication dominates deep learning training, and multiplier complexity is quadratic with operand size; e.g., 16-bit precision engines are more than 4x smaller than 32-bit precision engines.…”
Section: Neurons: Biology + Information (mentioning)
confidence: 99%
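
The "more than 4x" figure follows directly from that quadratic dependence. A back-of-the-envelope sketch, assuming an array-multiplier area model proportional to significand width squared and standard IEEE-754 significand widths (both are simplifying assumptions for illustration, not measurements from the paper):

# Significand widths in bits, including the implicit leading 1.
SIGNIFICAND_BITS = {"fp32": 24, "fp16": 11}

def relative_multiplier_area(fmt: str) -> float:
    # Array-multiplier area modeled as proportional to operand width squared.
    return SIGNIFICAND_BITS[fmt] ** 2

ratio = relative_multiplier_area("fp32") / relative_multiplier_area("fp16")
print(f"fp32 multiplier is ~{ratio:.1f}x the area of an fp16 multiplier")
# 24**2 / 11**2 ~= 4.8, consistent with 16-bit engines being more than 4x smaller.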
“…x_i is the input of node i. Figure 5 (A) [14] shows DL algorithms are comprised of a spectrum of operations. Although matrix multiplication (gemm) is dominant, optimizing performance efficiency while maintaining accuracy requires the core architecture to efficiently support all of the auxiliary functions.…”
Section: Deep Learning (mentioning)
confidence: 99%
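
To make the gemm-versus-auxiliary-function balance concrete, here is a small sketch with assumed layer dimensions (hypothetical, not taken from Figure 5 or the paper) that counts the operations in a single fully connected layer y = activation(x @ W + b):

def layer_op_counts(batch: int, n_in: int, n_out: int) -> dict:
    # Rough operation counts for y = activation(x @ W + b).
    return {
        "gemm (multiply-accumulate)": 2 * batch * n_in * n_out,
        "bias add": batch * n_out,
        "activation (e.g., ReLU)": batch * n_out,
    }

counts = layer_op_counts(batch=32, n_in=1024, n_out=1024)
total = sum(counts.values())
for name, ops in counts.items():
    print(f"{name:28s} {ops:12d}  ({100 * ops / total:5.1f}% of ops)")

The gemm term dwarfs the others, yet the auxiliary operations still run on every forward pass, which is why the core must execute them efficiently rather than off-loading them.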