2017
DOI: 10.2200/s00783ed1v01y201706cac041
Deep Learning for Computer Architects

Cited by 12 publications (11 citation statements). References 79 publications.
“…CNN Hardware Accelerators. There is currently huge research interest in the design of high-performance and energy-efficient neural network hardware accelerators, both in academia and industry (Barry et al, 2015;Arm;Nvidia;Reagen et al, 2017a). Some of the key topics that have been studied to date include dataflows (Chen et al, 2016b;Samajdar et al, 2018), optimized data precision (Reagen et al, 2016), systolic arrays (Jouppi et al, 2017), sparse data compression and compute (Han et al, 2016;Albericio et al, 2016;Parashar et al, 2017;Yu et al, 2017;Ding et al, 2017;Whatmough et al, 2018), bit-serial arithmetic (Judd et al, 2016), and analog/mixed-signal hardware (Chen et al, 2016a;LiKamWa et al, 2016;Shafiee et al, 2016;Chi et al, 2016;Kim et al, 2016;Song et al, 2017).…”
Section: Related Work
confidence: 99%
“…Neural Network Accelerator We develop a systolic arraybased CNN accelerator and integrate it into our evaluation infrastructure. The design is reminiscent of the Google Tensor Processing Unit (TPU) [78], but is much smaller, as befits the mobile budget [97].…”
Section: Hardware Setup
confidence: 99%
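The systolic-array dataflow mentioned in the excerpt above can be illustrated with a minimal sketch. This is an assumed output-stationary organization (each processing element accumulates one output), written as plain Python for clarity; the actual array size, tiling, and dataflow of the accelerator described in the paper are not specified here.

```python
def systolic_matmul(A, B):
    """Multiply A (m x k) by B (k x n) the way an output-stationary
    systolic array would: each PE at grid position (i, j) holds one
    output accumulator and consumes one A operand (streamed from the
    left) and one B operand (streamed from the top) per cycle."""
    m, k = len(A), len(A[0])
    n = len(B[0])
    # One accumulator per processing element in the m x n PE grid.
    acc = [[0] * n for _ in range(m)]
    for t in range(k):  # one "cycle" per step of the reduction dimension
        for i in range(m):
            for j in range(n):
                acc[i][j] += A[i][t] * B[t][j]
    return acc

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(systolic_matmul(A, B))  # [[19, 22], [43, 50]]
```

In real hardware the inner two loops run in parallel across the PE grid, so an m x n array finishes a (m x k) by (k x n) product in roughly k cycles plus pipeline fill/drain; the sequential loops here only model the arithmetic.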
“…TCUs come under the guise of different marketing terms, be it NVIDIA's Tensor Cores [18], Google's Tensor Processing Unit [10], Intel KNL's AVX extensions [76], Apple A11's Neural Engine [2], or ARM's Machine Learning Processor [3]. TCUs are designed to accelerate Multilayer Perceptrons (MLPs), Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), or Deep Neural Networks (DNNs) in general. TCUs vary in implementation [18,36,40,43,48,54,71,74,75,76,79,87], and are prevalent [1,4,8,9,10,11,24,70] in edge devices, mobile, and the cloud.…”
Section: Introduction
confidence: 99%